How do I write a PostgreSQL subquery in the SELECT clause with a join in the FROM clause, as in SQL Server?

I am trying to write the following query on PostgreSQL:

select name, author_id, count(1),
(select count(1)
from names as n2
where n2.id = n1.id
and n2.author_id = n1.author_id
)
from names as n1
group by name, author_id

This would certainly work on Microsoft SQL Server, but it does not work at all on PostgreSQL. I read through its documentation a bit, and it seems I could rewrite it as:

select name, author_id, count(1), total
from names as n1, (select count(1) as total
from names as n2
where n2.id = n1.id
and n2.author_id = n1.author_id
) as total
group by name, author_id

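But that returns the following error on PostgreSQL: "subquery in FROM cannot refer to other relations of same query level". So I'm stuck. Does anyone know how I can achieve this?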

Thanks


I'm not sure I fully understand your intent, but perhaps the following is close to what you want:

select n1.name, n1.author_id, count_1, total_count
from (select id, name, author_id, count(1) as count_1
from names
group by id, name, author_id) n1
inner join (select id, author_id, count(1) as total_count
from names
group by id, author_id) n2
on (n2.id = n1.id and n2.author_id = n1.author_id)

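Unfortunately, this adds the requirement of grouping the first subquery by id as well as by name and author_id, which I don't think was wanted. I'm not sure how to work around that, though, because id has to be available for the join to the second subquery. Perhaps someone else will come up with a better solution.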

I'm just answering here with a formatted version of the final SQL I needed, based on Bob Jarvis's answer, as posted in my comment above:

select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
from (select id, name, author_id, count(1) as count_1
from names
group by id, name, author_id) n1
inner join (select author_id, count(1) as total_count
from names
group by author_id) n2
on (n2.author_id = n1.author_id)
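
For comparison, the same per-author ratio can probably also be computed without a second derived table by using a window function over the aggregate. This is only a sketch of a different technique, not what the answer above uses. The table definition and sample rows are hypothetical (the thread never shows the real `names` table), and it groups by name and author_id as in the question rather than by id.

```sql
-- Everything runs inside a transaction that is rolled back,
-- so the hypothetical table does not outlive the experiment.
BEGIN;

-- Assumed schema; the real definition of names is not shown in the thread.
CREATE TEMP TABLE names (
    id        integer,
    name      text,
    author_id integer
);

INSERT INTO names (id, name, author_id) VALUES
    (1, 'alpha', 10),
    (2, 'alpha', 10),
    (3, 'beta',  10),
    (4, 'gamma', 20);

-- count(*) is the per-(name, author_id) count;
-- sum(count(*)) OVER (...) adds those counts up per author.
SELECT name,
       author_id,
       count(*) AS count_1,
       count(*)::numeric
           / sum(count(*)) OVER (PARTITION BY author_id) AS share
FROM names
GROUP BY name, author_id
ORDER BY author_id, name;

ROLLBACK;
```

The window function is evaluated after GROUP BY, which is why it can sum the grouped counts without joining back to the table.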

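I know this is old, but since PostgreSQL 9.3 there is an option to use the keyword LATERAL to write correlated subqueries inside JOINs, so the query from the question would look like: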

SELECT
name, author_id, count(*), t.total
FROM
names as n1
INNER JOIN LATERAL (
SELECT
count(*) as total
FROM
names as n2
WHERE
n2.id = n1.id
AND n2.author_id = n1.author_id
) as t ON 1=1
GROUP BY
n1.name, n1.author_id, t.total
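
To try LATERAL end to end, here is a minimal sketch on a hypothetical `names` table (the schema and rows are assumptions, not taken from the thread). It correlates on author_id only, which matches the per-author total the asker settled on, and it lists t.total in GROUP BY because PostgreSQL requires every non-aggregated output column to be grouped.

```sql
-- Rolled back at the end so the hypothetical table is discarded.
BEGIN;

-- Assumed schema and rows, for illustration only.
CREATE TEMP TABLE names (id integer, name text, author_id integer);
INSERT INTO names (id, name, author_id) VALUES
    (1, 'alpha', 10),
    (2, 'alpha', 10),
    (3, 'beta',  10),
    (4, 'gamma', 20);

SELECT n1.name,
       n1.author_id,
       count(*) AS count_1,
       t.total
FROM names AS n1
CROSS JOIN LATERAL (
    -- The subquery may reference n1 because of LATERAL:
    -- it counts all rows belonging to the current row's author.
    SELECT count(*) AS total
    FROM names AS n2
    WHERE n2.author_id = n1.author_id
) AS t
GROUP BY n1.name, n1.author_id, t.total
ORDER BY n1.author_id, n1.name;

ROLLBACK;
```

CROSS JOIN LATERAL is equivalent here to INNER JOIN LATERAL ... ON 1=1, since the subquery always returns exactly one row.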

Complementing the answers from @Bob Jarvis and @dmikam: Postgres does not produce a good plan when you don't use LATERAL. Below is a simulation. In both cases the query returns the same data, but the estimated costs are very different.

Table structure

CREATE TABLE ITEMS (
N INTEGER NOT NULL,
S TEXT NOT NULL
);


INSERT INTO ITEMS
SELECT
(random()*1000000)::integer AS n,
md5(random()::text) AS s
FROM
generate_series(1,1000000);


CREATE INDEX N_INDEX ON ITEMS(N);

Performing a JOIN against a subquery with GROUP BY, without LATERAL

EXPLAIN
SELECT
I.*
FROM ITEMS I
INNER JOIN (
SELECT
COUNT(1), n
FROM ITEMS
GROUP BY N
) I2 ON I2.N = I.N
WHERE I.N IN (243477, 997947);

The results

Merge Join  (cost=0.87..637500.40 rows=23 width=37)
Merge Cond: (i.n = items.n)
->  Index Scan using n_index on items i  (cost=0.43..101.28 rows=23 width=37)
Index Cond: (n = ANY ('{243477,997947}'::integer[]))
->  GroupAggregate  (cost=0.43..626631.11 rows=861418 width=12)
Group Key: items.n
->  Index Only Scan using n_index on items  (cost=0.43..593016.93 rows=10000000 width=4)

Using LATERAL

EXPLAIN
SELECT
I.*
FROM ITEMS I
INNER JOIN LATERAL (
SELECT
COUNT(1), n
FROM ITEMS
WHERE N = I.N
GROUP BY N
) I2 ON 1=1 --I2.N = I.N
WHERE I.N IN (243477, 997947);

The results

Nested Loop  (cost=9.49..1319.97 rows=276 width=37)
->  Bitmap Heap Scan on items i  (cost=9.06..100.20 rows=23 width=37)
Recheck Cond: (n = ANY ('{243477,997947}'::integer[]))
->  Bitmap Index Scan on n_index  (cost=0.00..9.05 rows=23 width=0)
Index Cond: (n = ANY ('{243477,997947}'::integer[]))
->  GroupAggregate  (cost=0.43..52.79 rows=12 width=12)
Group Key: items.n
->  Index Only Scan using n_index on items  (cost=0.43..52.64 rows=12 width=4)
Index Cond: (n = i.n)

My Postgres version is PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1)
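
The plans above show estimated costs only. To compare actual execution times, EXPLAIN (ANALYZE, BUFFERS) can be used. The sketch below does this on a smaller hypothetical copy of the table. The table name, index name, row count and filter values are assumptions, so the numbers will differ from the ones above.

```sql
-- Rolled back at the end so the test table is discarded.
BEGIN;

-- Smaller stand-in for the ITEMS table used above (assumed size: 100,000 rows).
CREATE TEMP TABLE items (
    n integer NOT NULL,
    s text    NOT NULL
);

INSERT INTO items
SELECT (random() * 100000)::integer AS n,
       md5(random()::text)          AS s
FROM generate_series(1, 100000);

CREATE INDEX items_n_idx ON items (n);

-- Refresh planner statistics so the row estimates are meaningful.
ANALYZE items;

-- ANALYZE actually executes the query and reports real timings;
-- BUFFERS adds page hit/read counts for each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT i.*
FROM items AS i
INNER JOIN LATERAL (
    SELECT count(*) AS cnt
    FROM items
    WHERE n = i.n
) AS i2 ON true
WHERE i.n IN (243, 9979);

ROLLBACK;
```

Running the same EXPLAIN on the non-LATERAL form of the query gives the second set of timings to compare against.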

select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
from (select id, name, author_id, count(1) as count_1
from names
group by id, name, author_id) n1
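-- count(1) needs a window here: a plain aggregate next to author_id without GROUP BY is rejected
inner join (select distinct author_id, count(1) over (partition by author_id) as total_count
from names) n2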
on (n2.author_id = n1.author_id)
Where true

Use distinct if there are more inner joins, because in my experience GROUP BY gets slower as more joins are added.