Postgres 为连接表的 array_agg 返回[ null ]而不是[]

我在 Postgres 选择了一些对象和它们的标签,这个模式相当简单,有三个表:

物件 id

标签 id | object_id | tag_id

标签 id | tag

我像这样连接这些表,使用 array_agg将标记聚合到一个字段中:

SELECT objects.*,
array_agg(tags.tag) AS tags,
FROM objects
LEFT JOIN taggings ON objects.id = taggings.object_id
LEFT JOIN tags ON tags.id = taggings.tag_id

但是,如果对象没有标记,Postgres 返回:

[ null ]

而不是一个空数组。 当没有标记时,我如何返回一个空数组?我已经双重检查,我没有一个空标记被返回。

聚合文件说“聚合函数可以在必要时用0或空数组替换 null”。我试过 COALESCE(ARRAY_AGG(tags.tag)) as tags,但它仍然返回一个带有 null 的数组。我已经尝试过使用第二个参数做很多事情(比如 COALESCE(ARRAY_AGG(tags.tag), ARRAY()),但是它们都会导致语法错误。

45216 次浏览

The documentation says that an array containing NULL is returned. If you want to convert that to an empty array, then you need to do some minor magic:

SELECT objects.id,
CASE WHEN length((array_agg(tags.tag))[1]) > 0
THEN array_agg(tags.tag)
ELSE ARRAY[]::text[] END AS tags
FROM objects
LEFT JOIN taggings ON objects.id = taggings.object_id
LEFT JOIN tags ON tags.id = taggings.tag_id
GROUP BY 1;

This assumes that the tags are of text type (or any of its variants); modify the cast as required.

The trick here is that the first (and only) element in a [NULL] array has a length of 0, so if any data is returned from tags you return the aggregate, otherwise construct an empty array of the right type.

Incidentally, the statement in the documentation about using coalesce() is a bit crummy: what is meant is that if you do not want NULL as a result, you can use coalesce() to turn that into a 0 or some other output of your choosing. But you need to apply that to the array elements instead of the array, which, in your case, would not provide a solution.

The docs say that when you are aggregating zero rows, then you get a null value, and the note about using COALESCE is addressing this specific case.

This does not apply to your query, because of the way a LEFT JOIN behaves - when it finds zero matching rows, it returns one row, filled with nulls (and the aggregate of one null row is an array with one null element).

You might be tempted to blindly replace [NULL] with [] in the output, but then you lose the ability to distiguish between objects with no tags and tagged objects where tags.tag is null. Your application logic and/or integrity constraints may not allow this second case, but that's all the more reason not to suppress a null tag if it does manage to sneak in.

You can identify an object with no tags (or in general, tell when a LEFT JOIN found no matches) by checking whether the field on the other side of the join condition is null. So in your case, just replace

array_agg(tags.tag)

with

CASE
WHEN taggings.object_id IS NULL
THEN ARRAY[]::text[]
ELSE array_agg(tags.tag)
END

Another option might be array_remove(..., NULL) (introduced in 9.3) if tags.tag is NOT NULL (otherwise you might want to keep NULL values in the array, but in that case, you can't distinguish between a single existing NULL tag and a NULL tag due to the LEFT JOIN):

SELECT objects.*,
array_remove(array_agg(tags.tag), NULL) AS tags,
FROM objects
LEFT JOIN taggings ON objects.id = taggings.object_id
LEFT JOIN tags ON tags.id = taggings.tag_id

If no tags are found, an empty array is returned.

Since 9.4 one can restrict an aggregate function call to proceed only rows that match a certain criterion: array_agg(tags.tag) filter (where tags.tag is not null)

Perhaps this answer is a little late, but I wanted to share with you that another querying strategy is possible as well: performing the aggregation in a a separate (common) table expression.

WITH cte_tags AS (
SELECT
taggings.object_id,
array_agg(tags.tag) AS tags
FROM
taggings
INNER JOIN tags ON tags.id = taggings.tag_id
GROUP BY
taggings.object_id
)
SELECT
objects.*,
cte_tags.tags
FROM
objects
LEFT JOIN cte_tags ON cte_tags.object_id = objects.id

Instead of an array with a single element of NULL, you will now get NULL instead of an array.

If you really want an empty array instead of NULL in your results, you can use the COALESCE function...:

WITH cte_tags AS (
SELECT
taggings.object_id,
array_agg(tags.tag) AS tags
FROM
taggings
INNER JOIN tags ON tags.id = taggings.tag_id
GROUP BY
taggings.object_id
)
SELECT
objects.*,
COALESCE(cte_tags.tags, '{}') AS tags
FROM
objects
LEFT JOIN cte_tags ON cte_tags.object_id = objects.id

...or use array-to-array concatenation:

WITH cte_tags AS (
SELECT
taggings.object_id,
array_agg(tags.tag) AS tags
FROM
taggings
INNER JOIN tags ON tags.id = taggings.tag_id
GROUP BY
taggings.object_id
)
SELECT
objects.*,
cte_tags.tags || '{}' AS tags
FROM
objects
LEFT JOIN cte_tags ON cte_tags.object_id = objects.id

What about this:

COALESCE(NULLIF(array_agg(tags.tag), '{NULL}'), '{}') AS tags,

Seems to work.

I replaced

array_to_json(array_agg(col_name))

with

array_to_json(coalesce(array_agg(col_name), ARRAY[]::record[]))

so that instead of returning a null JSON value I got an empty JSON array