返回不同的值

我试图在 Oracle 中使用 LISTAGG函数。我希望只获取该列的不同值。有没有一种方法可以不用创建函数或过程而只获得不同的值?

col1  col2 Created_by
1     2     Smith
1     2     John
1     3     Ajay
1     4     Ram
1     5     Jack

我需要选择 col1和 col2的 LISTAGG(不考虑第3列)。当我这样做时,我得到类似于 LISTAGG: [2,2,3,4,5]的结果

我需要在这里删除重复的‘2’; 我只需要 col2和 col1的不同值。

355192 次浏览

19c 及以后:

select listagg(distinct the_column, ',') within group (order by the_column)
from the_table

18c 及以上:

select listagg(the_column, ',') within group (order by the_column)
from (
select distinct the_column
from the_table
) t

如果你需要更多的专栏,这样的东西可能就是你正在寻找的:

select col1, listagg(col2, ',') within group (order by col2)
from (
select col1,
col2,
row_number() over (partition by col1, col2 order by col1) as rn
from foo
order by col1,col2
)
where rn = 1
group by col1;

可以使用未记录的 wm_concat函数。

select col1, wm_concat(distinct col2) col2_list
from tab1
group by col1;

这个函数返回 clob 列,如果需要,可以使用 dbms_lob.substr将 clob 转换为 varchar2。

如果目的是将此转换应用于多个列,那么我已经扩展了 _ horse _ with _ no _ name 的解决方案:

SELECT * FROM
(SELECT LISTAGG(GRADE_LEVEL, ',') within group(order by GRADE_LEVEL) "Grade Levels" FROM (select distinct GRADE_LEVEL FROM Students) t)                     t1,
(SELECT LISTAGG(ENROLL_STATUS, ',') within group(order by ENROLL_STATUS) "Enrollment Status" FROM (select distinct ENROLL_STATUS FROM Students) t)          t2,
(SELECT LISTAGG(GENDER, ',') within group(order by GENDER) "Legal Gender Code" FROM (select distinct GENDER FROM Students) t)                               t3,
(SELECT LISTAGG(CITY, ',') within group(order by CITY) "City" FROM (select distinct CITY FROM Students) t)                                                  t4,
(SELECT LISTAGG(ENTRYCODE, ',') within group(order by ENTRYCODE) "Entry Code" FROM (select distinct ENTRYCODE FROM Students) t)                             t5,
(SELECT LISTAGG(EXITCODE, ',') within group(order by EXITCODE) "Exit Code" FROM (select distinct EXITCODE FROM Students) t)                                 t6,
(SELECT LISTAGG(LUNCHSTATUS, ',') within group(order by LUNCHSTATUS) "Lunch Status" FROM (select distinct LUNCHSTATUS FROM Students) t)                     t7,
(SELECT LISTAGG(ETHNICITY, ',') within group(order by ETHNICITY) "Race Code" FROM (select distinct ETHNICITY FROM Students) t)                              t8,
(SELECT LISTAGG(CLASSOF, ',') within group(order by CLASSOF) "Expected Graduation Year" FROM (select distinct CLASSOF FROM Students) t)                     t9,
(SELECT LISTAGG(TRACK, ',') within group(order by TRACK) "Track Code" FROM (select distinct TRACK FROM Students) t)                                         t10,
(SELECT LISTAGG(GRADREQSETID, ',') within group(order by GRADREQSETID) "Graduation ID" FROM (select distinct GRADREQSETID FROM Students) t)                 t11,
(SELECT LISTAGG(ENROLLMENT_SCHOOLID, ',') within group(order by ENROLLMENT_SCHOOLID) "School Key" FROM (select distinct ENROLLMENT_SCHOOLID FROM Students) t)       t12,
(SELECT LISTAGG(FEDETHNICITY, ',') within group(order by FEDETHNICITY) "Federal Race Code" FROM (select distinct FEDETHNICITY FROM Students) t)                         t13,
(SELECT LISTAGG(SUMMERSCHOOLID, ',') within group(order by SUMMERSCHOOLID) "Summer School Key" FROM (select distinct SUMMERSCHOOLID FROM Students) t)                               t14,
(SELECT LISTAGG(FEDRACEDECLINE, ',') within group(order by FEDRACEDECLINE) "Student Decl to Prov Race Code" FROM (select distinct FEDRACEDECLINE FROM Students) t)          t15

这是 Oracle 数据库11g 企业版版本11.2.0.2.0-64位产品。
我无法使用 STRAGG,因为没有办法区分和命令。

性能扩展是线性的,这很好,因为我添加了所有感兴趣的列。上面的77K 行花了3秒钟。仅仅一卷,0.172秒。有一种方法可以一次性区分一个表中的多个列。

从 Oracle 19C 中可以看到 给你

从18C 和更早的组内尝试见 here

否则使用正则表达式

Here's how to solve your issue.

select
regexp_replace(
'2,2,2.1,3,3,3,3,4,4'
,'([^,]+)(,\1)*(,|$)', '\1\3')


from dual

报税表

二,二,一,三,四

答案如下:

select col1,


regexp_replace(
listagg(
col2 , ',') within group (order by col2)  -- sorted
,'([^,]+)(,\1)*(,|$)', '\1\3') )
from tableX
where rn = 1
group by col1;

注意: 以上将在大多数情况下工作-列表应排序,您可能必须削减所有的尾随和前导空间取决于您的数据。

如果你有一组大于20或大于20的字符串,你可能会遇到 Oracle 字符串大小限制“字符串连接的结果太长”。

在 oracle12cR2中,可以抑制此错误,请参见 给你。或者在每个组的成员上设置一个最大值。这只有在只列出第一个成员的情况下才会有效。如果你有非常长的变量字符串,这可能不工作。你必须做实验。

select col1,


case
when count(col2) < 100 then
regexp_replace(
listagg(col2, ',') within group (order by col2)
,'([^,]+)(,\1)*(,|$)', '\1\3')
 

else
'Too many entries to list...'
end
    

from sometable
where rn = 1
group by col1;

另一个解决方案(不那么简单)希望避免 Oracle 字符串大小限制-字符串大小限制为4000。感谢 user3465996的这篇文章

select col1  ,
dbms_xmlgen.convert(  -- HTML decode
dbms_lob.substr( -- limit size to 4000 chars
ltrim( -- remove leading commas
REGEXP_REPLACE(REPLACE(
REPLACE(
XMLAGG(
XMLELEMENT("A",col2 )
ORDER BY col2).getClobVal(),
'<A>',','),
'</A>',''),'([^,]+)(,\1)*(,|$)', '\1\3'),
','), -- remove leading XML commas ltrim
4000,1) -- limit to 4000 string size
, 1)  -- HTML.decode
as col2
from sometable
where rn = 1
group by col1;

V1-一些测试用例-仅供参考

regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)+', '\1')
-> 2.1,3,4 Fail
regexp_replace('2 ,2 ,2.1,3 ,3 ,4 ,4 ','([^,]+)(,\1)+', '\1')
-> 2 ,2.1,3,4 Success  - fixed length items

V2-项目中包含的项目,例如2、21

regexp_replace('2.1,1','([^,]+)(,\1)+', '\1')
-> 2.1 Fail
regexp_replace('2 ,2 ,2.1,1 ,3 ,4 ,4 ','(^|,)(.+)(,\2)+', '\1\2')
-> 2 ,2.1,1 ,3 ,4  -- success - NEW regex
regexp_replace('a,b,b,b,b,c','(^|,)(.+)(,\2)+', '\1\2')
-> a,b,b,c fail!

V3-regex 感谢 Igor! 适用于所有情况。

select
regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)*(,|$)', '\1\3') ,
---> 2,2.1,3,4 works
regexp_replace('2.1,1','([^,]+)(,\1)*(,|$)', '\1\3'),
--> 2.1,1 works
regexp_replace('a,b,b,b,b,c','([^,]+)(,\1)*(,|$)', '\1\3')
---> a,b,c works


from dual

有人想过使用 PARTITION BY 子句吗?在这个查询中,我可以获得应用程序服务和访问的列表。

SELECT DISTINCT T.APP_SVC_ID,
LISTAGG(RTRIM(T.ACCESS_MODE), ',') WITHIN GROUP(ORDER BY T.ACCESS_MODE) OVER(PARTITION BY T.APP_SVC_ID) AS ACCESS_MODE
FROM APP_SVC_ACCESS_CNTL T
GROUP BY T.ACCESS_MODE, T.APP_SVC_ID

我不得不删除保密协议中的“地点”条款但你懂的。

我克服了这个问题,首先对值进行分组,然后使用 listagg 进行另一次聚合:

select a,b,listagg(c,',') within group(order by c) c, avg(d)
from (select a,b,c,avg(d)
from   table
group by (a,b,c))
group by (a,b)

只有一个完整的表访问,相对容易扩展到更复杂的查询

If you want distinct values across MULTIPLE columns, want control over sort order, don't want to use an undocumented function that may disappear, and do not want more than one full table scan, you may find this construct useful:

with test_data as
(
select 'A' as col1, 'T_a1' as col2, '123' as col3 from dual
union select 'A', 'T_a1', '456' from dual
union select 'A', 'T_a1', '789' from dual
union select 'A', 'T_a2', '123' from dual
union select 'A', 'T_a2', '456' from dual
union select 'A', 'T_a2', '111' from dual
union select 'A', 'T_a3', '999' from dual
union select 'B', 'T_a1', '123' from dual
union select 'B', 'T_b1', '740' from dual
union select 'B', 'T_b1', '846' from dual
)
select col1
, (select listagg(column_value, ',') within group (order by column_value desc) from table(collect_col2)) as col2s
, (select listagg(column_value, ',') within group (order by column_value desc) from table(collect_col3)) as col3s
from
(
select col1
, collect(distinct col2) as collect_col2
, collect(distinct col3) as collect_col3
from test_data
group by col1
);

我认为这会有所帮助——如果列值是重复的,则将其大小写为 NULL ——那么它就不会被追加到 LISTAGG 字符串中:

with test_data as
(
select 1 as col1, 2 as col2, 'Smith' as created_by from dual
union select 1, 2, 'John' from dual
union select 1, 3, 'Ajay' from dual
union select 1, 4, 'Ram' from dual
union select 1, 5, 'Jack' from dual
union select 2, 5, 'Smith' from dual
union select 2, 6, 'John' from dual
union select 2, 6, 'Ajay' from dual
union select 2, 6, 'Ram' from dual
union select 2, 7, 'Jack' from dual
)
SELECT col1  ,
listagg(col2 , ',') within group (order by col2 ASC) AS orig_value,
listagg(CASE WHEN rwn=1 THEN col2 END , ',') within group (order by col2 ASC) AS distinct_value
from
(
select row_number() over (partition by col1,col2 order by 1) as rwn,
a.*
from test_data a
) a
GROUP BY col1

结果:

COL1  ORIG         DISTINCT
1   2,2,3,4,5   2,3,4,5
2   5,6,6,6,7   5,6,7

listagg() ignores NULL values, so in a first step you could use the lag() function to analyse whether the previous record had the same value, if yes then NULL, else 'new value'.

WITH tab AS
(
SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual
UNION ALL SELECT 1 as col1, 2 as col2, 'John'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 4 as col2, 'Ram'   as created_by FROM dual
UNION ALL SELECT 1 as col1, 5 as col2, 'Jack'  as created_by FROM dual
)
SELECT col1
, CASE
WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN
NULL
ELSE
col2
END as col2_with_nulls
, created_by
FROM tab;

结果

      COL1 COL2_WITH_NULLS CREAT
---------- --------------- -----
1               2 Smith
1                 John
1               3 Ajay
1               4 Ram
1               5 Jack

Note that the second 2 is replaced by NULL. Now you can wrap a SELECT with the listagg() around it.

WITH tab AS
(
SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual
UNION ALL SELECT 1 as col1, 2 as col2, 'John'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 4 as col2, 'Ram'   as created_by FROM dual
UNION ALL SELECT 1 as col1, 5 as col2, 'Jack'  as created_by FROM dual
)
SELECT listagg(col2_with_nulls, ',') WITHIN GROUP (ORDER BY col2_with_nulls) col2_list
FROM ( SELECT col1
, CASE WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN NULL ELSE col2 END as col2_with_nulls
, created_by
FROM tab );

结果

COL2_LIST
---------
2,3,4,5

您也可以在多个列上执行此操作。

WITH tab AS
(
SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual
UNION ALL SELECT 1 as col1, 2 as col2, 'John'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 4 as col2, 'Ram'   as created_by FROM dual
UNION ALL SELECT 1 as col1, 5 as col2, 'Jack'  as created_by FROM dual
)
SELECT listagg(col1_with_nulls, ',') WITHIN GROUP (ORDER BY col1_with_nulls) col1_list
, listagg(col2_with_nulls, ',') WITHIN GROUP (ORDER BY col2_with_nulls) col2_list
, listagg(created_by, ',')      WITHIN GROUP (ORDER BY created_by) created_by_list
FROM ( SELECT CASE WHEN lag(col1) OVER (ORDER BY col1) = col1 THEN NULL ELSE col1 END as col1_with_nulls
, CASE WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN NULL ELSE col2 END as col2_with_nulls
, created_by
FROM tab );

结果

COL1_LIST COL2_LIST CREATED_BY_LIST
--------- --------- -------------------------
1         2,3,4,5   Ajay,Jack,John,Ram,Smith

为了解决字符串长度问题,您可以使用与 listagg类似的 XMLAGG,但它返回一个 clob。

然后,您可以使用 regexp_replace解析并获得惟一值,然后使用 dbms_lob.substr()将其转换回字符串。如果您有大量不同的值,您仍然会用这种方式耗尽空间,但是在很多情况下,下面的代码应该可以工作。

还可以更改所使用的分隔符。在我的例子中,我希望使用-’而不是’,但是您应该能够替换我的代码中的破折号,并且如果需要的话可以使用逗号。

select col1,
dbms_lob.substr(ltrim(REGEXP_REPLACE(REPLACE(
REPLACE(
XMLAGG(
XMLELEMENT("A",col2)
ORDER BY col2).getClobVal(),
'<A>','-'),
'</A>',''),'([^-]*)(-\1)+($|-)',
'\1\3'),'-'), 4000,1) as platform_mix
from table

如何创建一个专用函数,使其成为“独特”部分:

create or replace function listagg_distinct (t in str_t, sep IN VARCHAR2 DEFAULT ',')
return VARCHAR2
as
l_rc VARCHAR2(4096) := '';
begin
SELECT listagg(val, sep) WITHIN GROUP (ORDER BY 1)
INTO l_rc
FROM (SELECT DISTINCT column_value val FROM table(t));
RETURN l_rc;
end;
/

然后用它来进行聚合:

SELECT col1, listagg_distinct(cast(collect(col_2) as str_t ), ', ')
FROM your_table
GROUP BY col_1;

你可以通过替换正则表达式来做到这一点,下面是一个例子:

-- Citations Per Year - Cited Publications main query. Includes list of unique associated core project numbers, ordered by core project number.
SELECT ptc.pmid AS pmid, ptc.pmc_id, ptc.pub_title AS pubtitle, ptc.author_list AS authorlist,
ptc.pub_date AS pubdate,
REGEXP_REPLACE( LISTAGG ( ppcc.admin_phs_org_code ||
TO_CHAR(ppcc.serial_num,'FM000000'), ',') WITHIN GROUP (ORDER BY ppcc.admin_phs_org_code ||
TO_CHAR(ppcc.serial_num,'FM000000')),
'(^|,)(.+)(,\2)+', '\1\2')
AS projectNum
FROM publication_total_citations ptc
JOIN proj_paper_citation_counts ppcc
ON ptc.pmid = ppcc.pmid
AND ppcc.citation_year = 2013
JOIN user_appls ua
ON ppcc.admin_phs_org_code = ua.admin_phs_org_code
AND ppcc.serial_num = ua.serial_num
AND ua.login_id = 'EVANSF'
GROUP BY ptc.pmid, ptc.pmc_id, ptc.pub_title, ptc.author_list, ptc.pub_date
ORDER BY pmid;

同样发布在这里: Oracle - unique Listagg values

使用如下创建的 listagg _ clob 函数:

create or replace package list_const_p
is
list_sep varchar2(10) := ',';
end list_const_p;
/
sho err


create type listagg_clob_t as object(
v_liststring varchar2(32767),
v_clob clob,
v_templob number,


static function ODCIAggregateInitialize(
sctx IN OUT listagg_clob_t
) return number,
member function ODCIAggregateIterate(
self IN OUT listagg_clob_t, value IN varchar2
) return number,
member function ODCIAggregateTerminate(
self IN OUT listagg_clob_t, returnValue OUT clob, flags IN number
) return number,
member function ODCIAggregateMerge(
self IN OUT listagg_clob_t, ctx2 IN OUT listagg_clob_t
) return number
);
/
sho err


create or replace type body listagg_clob_t is


static function ODCIAggregateInitialize(sctx IN OUT listagg_clob_t)
return number is
begin
sctx := listagg_clob_t('', '', 0);
return ODCIConst.Success;
end;


member function ODCIAggregateIterate(
self IN OUT listagg_clob_t,
value IN varchar2
) return number is
begin
if nvl(lengthb(v_liststring),0) + nvl(lengthb(value),0) <= 4000 then
self.v_liststring:=self.v_liststring || value || list_const_p.list_sep;
else
if self.v_templob = 0 then
dbms_lob.createtemporary(self.v_clob, true, dbms_lob.call);
self.v_templob := 1;
end if;
dbms_lob.writeappend(self.v_clob, length(self.v_liststring), v_liststring);
self.v_liststring := value || list_const_p.list_sep;
end if;
return ODCIConst.Success;
end;


member function ODCIAggregateTerminate(
self IN OUT listagg_clob_t,
returnValue OUT clob,
flags IN number
) return number is
begin
if self.v_templob != 0 then
dbms_lob.writeappend(self.v_clob, length(self.v_liststring), self.v_liststring);
dbms_lob.trim(self.v_clob, dbms_lob.getlength(self.v_clob) - 1);
else
self.v_clob := substr(self.v_liststring, 1, length(self.v_liststring) - 1);
end if;
returnValue := self.v_clob;
return ODCIConst.Success;
end;


member function ODCIAggregateMerge(self IN OUT listagg_clob_t, ctx2 IN OUT listagg_clob_t) return number is
begin
if ctx2.v_templob != 0 then
if self.v_templob != 0 then
dbms_lob.append(self.v_clob, ctx2.v_clob);
dbms_lob.freetemporary(ctx2.v_clob);
ctx2.v_templob := 0;
else
self.v_clob := ctx2.v_clob;
self.v_templob := 1;
ctx2.v_clob := '';
ctx2.v_templob := 0;
end if;
end if;
if nvl(lengthb(self.v_liststring),0) + nvl(lengthb(ctx2.v_liststring),0) <= 4000 then
self.v_liststring := self.v_liststring || ctx2.v_liststring;
ctx2.v_liststring := '';
else
if self.v_templob = 0 then
dbms_lob.createtemporary(self.v_clob, true, dbms_lob.call);
self.v_templob := 1;
end if;
dbms_lob.writeappend(self.v_clob, length(self.v_liststring), self.v_liststring);
dbms_lob.writeappend(self.v_clob, length(ctx2.v_liststring), ctx2.v_liststring);
self.v_liststring := '';
ctx2.v_liststring := '';
end if;
return ODCIConst.Success;
end;
end;
/
sho err


CREATE or replace FUNCTION listagg_clob (input varchar2) RETURN clob
PARALLEL_ENABLE AGGREGATE USING listagg_clob_t;
/
sho err

I wrote a function to handle this using regular expressions. The in parameters are: 1) listagg 调用本身 2) A repeat of the delimiter

create or replace function distinct_listagg
(listagg_in varchar2,
delimiter_in varchar2)


return varchar2
as
hold_result varchar2(4000);
begin


select rtrim( regexp_replace( (listagg_in)
, '([^'||delimiter_in||']*)('||
delimiter_in||'\1)+($|'||delimiter_in||')', '\1\3'), ',')
into hold_result
from dual;


return hold_result;


end;

现在你不必每次都重复正则表达式,只需说:

select distinct_listagg(
listagg(myfield,', ') within group (order by 1),
', '
)
from mytable;

The simplest way to handle multiple listagg's is to use 1 WITH (subquery factor) per column containing a listagg of that column from a select distinct:

    WITH tab AS
(
SELECT 1 as col1, 2 as col2, 3 as col3, 'Smith' as created_by FROM dual
UNION ALL SELECT 1 as col1, 2 as col2, 3 as col3,'John'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 3 as col2, 4 as col3,'Ajay'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 4 as col2, 4 as col3,'Ram'   as created_by FROM dual
UNION ALL SELECT 1 as col1, 5 as col2, 6 as col3,'Jack'  as created_by FROM dual
)
, getCol2 AS
(
SELECT  DISTINCT col1, listagg(col2,',') within group (order by col2)  over (partition by col1) AS col2List
FROM ( SELECT DISTINCT col1,col2 FROM tab)
)
, getCol3 AS
(
SELECT  DISTINCT col1, listagg(col3,',') within group (order by col3)  over (partition by col1) AS col3List
FROM ( SELECT DISTINCT col1,col3 FROM tab)
)
select col1,col2List,col3List
FROM getCol2
JOIN getCol3
using (col1)

结果是:

col1  col2List  col3List
1     2,3,4,5   3,4,6

如果不需要特定顺序的连接值,并且分隔符可以是逗号,那么可以这样做:

select col1, stragg(distinct col2)
from table
group by col1

I neded a DISTINCT version of this and got this one working out.

RTRIM(REGEXP_REPLACE(
(value, ', ') WITHIN GROUP( ORDER BY value)),
'([^ ]+)(, \1)+','\1'),', ')

使用 LISTAGG的一个恼人的方面是,如果连接的字符串的总长度超过4000个字符(SQL 中的 VARCHAR2的限制) ,将抛出下面的错误,这在 Oracle 版本高达12.1的情况下很难管理

ORA-01489: 字符串连接的结果太长

在12cR2中添加的一个新特性是 LISTAGGON OVERFLOW子句。 包含该子句的查询类似于:

SELECT pid, LISTAGG(Desc, ' ' on overflow truncate) WITHIN GROUP (ORDER BY seq) AS desc
FROM B GROUP BY pid;

以上操作将输出限制为4000个字符,但不会抛出 ORA-01489错误。

以下是 ON OVERFLOW条款的一些附加选择:

  • ON OVERFLOW TRUNCATE 'Contd..': 这将显示 'Contd..'在 字符串的结尾(默认值为 ...)
  • ON OVERFLOW TRUNCATE '': 这将显示4000个字符 without any terminating string.
  • ON OVERFLOW TRUNCATE WITH COUNT: 这将显示总计 结束字符后的字符数。 例句:-‘ ...(5512)
  • ON OVERFLOW ERROR: 如果您期望 LISTAGG失败,则 ORA-01489错误(无论如何都是默认的)。

我实现了这个存储函数:

CREATE TYPE LISTAGG_DISTINCT_PARAMS AS OBJECT (ELEMENTO VARCHAR2(2000), SEPARATORE VARCHAR2(10));


CREATE TYPE T_LISTA_ELEMENTI AS TABLE OF VARCHAR2(2000);


CREATE TYPE T_LISTAGG_DISTINCT AS OBJECT (


LISTA_ELEMENTI T_LISTA_ELEMENTI,
SEPARATORE VARCHAR2(10),


STATIC FUNCTION ODCIAGGREGATEINITIALIZE(SCTX  IN OUT            T_LISTAGG_DISTINCT)
RETURN NUMBER,


MEMBER FUNCTION ODCIAGGREGATEITERATE   (SELF  IN OUT            T_LISTAGG_DISTINCT,
VALUE IN                    LISTAGG_DISTINCT_PARAMS )
RETURN NUMBER,


MEMBER FUNCTION ODCIAGGREGATETERMINATE (SELF         IN     T_LISTAGG_DISTINCT,
RETURN_VALUE OUT    VARCHAR2,
FLAGS        IN     NUMBER      )
RETURN NUMBER,


MEMBER FUNCTION ODCIAGGREGATEMERGE       (SELF               IN OUT T_LISTAGG_DISTINCT,
CTX2                 IN         T_LISTAGG_DISTINCT    )
RETURN NUMBER
);


CREATE OR REPLACE TYPE BODY T_LISTAGG_DISTINCT IS


STATIC FUNCTION ODCIAGGREGATEINITIALIZE(SCTX IN OUT T_LISTAGG_DISTINCT) RETURN NUMBER IS
BEGIN
SCTX := T_LISTAGG_DISTINCT(T_LISTA_ELEMENTI() , ',');
RETURN ODCICONST.SUCCESS;
END;


MEMBER FUNCTION ODCIAGGREGATEITERATE(SELF IN OUT T_LISTAGG_DISTINCT, VALUE IN LISTAGG_DISTINCT_PARAMS) RETURN NUMBER IS
BEGIN


IF VALUE.ELEMENTO IS NOT NULL THEN
SELF.LISTA_ELEMENTI.EXTEND;
SELF.LISTA_ELEMENTI(SELF.LISTA_ELEMENTI.LAST) := TO_CHAR(VALUE.ELEMENTO);
SELF.LISTA_ELEMENTI:= SELF.LISTA_ELEMENTI MULTISET UNION DISTINCT SELF.LISTA_ELEMENTI;
SELF.SEPARATORE := VALUE.SEPARATORE;
END IF;
RETURN ODCICONST.SUCCESS;
END;


MEMBER FUNCTION ODCIAGGREGATETERMINATE(SELF IN T_LISTAGG_DISTINCT, RETURN_VALUE OUT VARCHAR2, FLAGS IN NUMBER) RETURN NUMBER IS
STRINGA_OUTPUT            CLOB:='';
LISTA_OUTPUT                T_LISTA_ELEMENTI;
TERMINATORE                 VARCHAR2(3):='...';
LUNGHEZZA_MAX           NUMBER:=4000;
BEGIN


IF SELF.LISTA_ELEMENTI.EXISTS(1) THEN -- se esiste almeno un elemento nella lista


-- inizializza una nuova lista di appoggio
LISTA_OUTPUT := T_LISTA_ELEMENTI();


-- riversamento dei soli elementi in DISTINCT
LISTA_OUTPUT := SELF.LISTA_ELEMENTI MULTISET UNION DISTINCT SELF.LISTA_ELEMENTI;


-- ordinamento degli elementi
SELECT CAST(MULTISET(SELECT * FROM TABLE(LISTA_OUTPUT) ORDER BY 1 ) AS T_LISTA_ELEMENTI ) INTO LISTA_OUTPUT FROM DUAL;


-- concatenazione in una stringa
FOR I IN LISTA_OUTPUT.FIRST .. LISTA_OUTPUT.LAST - 1
LOOP
STRINGA_OUTPUT := STRINGA_OUTPUT || LISTA_OUTPUT(I) || SELF.SEPARATORE;
END LOOP;
STRINGA_OUTPUT := STRINGA_OUTPUT || LISTA_OUTPUT(LISTA_OUTPUT.LAST);


-- se la stringa supera la dimensione massima impostata, tronca e termina con un terminatore
IF LENGTH(STRINGA_OUTPUT) > LUNGHEZZA_MAX THEN
RETURN_VALUE := SUBSTR(STRINGA_OUTPUT, 0, LUNGHEZZA_MAX - LENGTH(TERMINATORE)) || TERMINATORE;
ELSE
RETURN_VALUE:=STRINGA_OUTPUT;
END IF;


ELSE -- se non esiste nessun elemento, restituisci NULL


RETURN_VALUE := NULL;


END IF;


RETURN ODCICONST.SUCCESS;
END;


MEMBER FUNCTION ODCIAGGREGATEMERGE(SELF IN OUT T_LISTAGG_DISTINCT, CTX2 IN T_LISTAGG_DISTINCT) RETURN NUMBER IS
BEGIN
RETURN ODCICONST.SUCCESS;
END;


END; -- fine corpo


CREATE
FUNCTION LISTAGG_DISTINCT (INPUT LISTAGG_DISTINCT_PARAMS) RETURN VARCHAR2
PARALLEL_ENABLE AGGREGATE USING T_LISTAGG_DISTINCT;


// Example
SELECT LISTAGG_DISTINCT(LISTAGG_DISTINCT_PARAMS(OWNER, ', ')) AS LISTA_OWNER
FROM SYS.ALL_OBJECTS;

对不起,但是在某些情况下(对于一个非常大的集合) ,Oracle 可能返回这个错误:

Object or Collection value was too large. The size of the value
might have exceeded 30k in a SORT context, or the size might be
too big for available memory.

但我认为这是一个良好的开端;)

即将推出的 Oracle 19c 将支持 DISTINCTLISTAGG

LISTAGG with DISTINCT 选项 :

这个功能是19c:

SQL> select deptno, listagg (distinct sal,', ') within group (order by sal)
2  from scott.emp
3  group by deptno;

编辑:

Oracle 19C ListAGG 与众不同

LISTAGG 聚合函数现在支持使用新的 DISTINCT 关键字消除重复。LISTAGG 聚合函数根据 ORDER BY 表达式对查询中每个组的行进行排序,然后将值连接到单个字符串中。使用新的 DISTINCT 关键字,可以在连接到单个字符串之前从指定的表达式中删除重复值。使用 DISTINCT 选项,可以直接在 LISTAGG 函数中执行删除重复值的处理。结果是 SQL 更简单、更快、更有效。

正如@a _ horse _ with _ no _ name 所指出的,在调用 LISTAGG 之前将 SELECT DISTINCT ...作为子查询的一部分可能是简单查询的最佳方式

However, in more complex queries, it might not be possible, or easy, to accomplish this. I had this come up in a scenario that was using top-n approach using an analytic function.

我找到了 COLLECT聚合函数。有文件证明,有 UNIQUEDISTINCT修饰符可用。只有 10克,它悄悄地失败(它忽略修饰符没有错误)。然而,为了克服这个问题,从 另一个答案,我想到了这个解决方案:

SELECT
...
(
SELECT LISTAGG(v.column_value,',') WITHIN GROUP (ORDER BY v.column_value)
FROM TABLE(columns_tab) v
) AS columns,
...
FROM (
SELECT
...
SET(CAST(COLLECT(UNIQUE some_column ORDER BY some_column) AS tab_typ)) AS columns_tab,
...
)

基本上,通过使用 SET,我删除了集合中的重复项。

您仍然需要将 tab_typ定义为一个基本的集合类型,对于 VARCHAR,例如:

CREATE OR REPLACE type tab_typ as table of varchar2(100)
/

另外,作为对多列情况下@a _ horse _ with _ no _ name 的回答的更正,在这种情况下,您可能希望仍然聚合第三个(或更多)列:

select
col1,
listagg(CASE rn2 WHEN 1 THEN col2 END, ',') within group (order by col2) AS col2_list,
listagg(CASE rn3 WHEN 1 THEN col3 END, ',') within group (order by col3) AS col3_list,
SUM(col4) AS col4
from (
select
col1,
col2,
row_number() over (partition by col1, col2 order by null) as rn2,
row_number() over (partition by col1, col3 order by null) as rn3
from foo
)
group by col1;

如果将 rn = 1作为 where 条件留给查询,则会错误地聚合其他列。

select col1, listaggr(col2,',') within group(Order by col2) from table group by col1意味着将字符串(col2)聚合成保持 n 顺序的列表,然后以 col1作为组处理重复项,col1意味着将 col1重复项合并为1组。也许这看起来干净简单,因为它应该是 如果你也想要 col3,你只需要再添加一个 listagg ()即 select col1, listaggr(col2,',') within group(Order by col2),listaggr(col3,',') within group(order by col3) from table group by col1

使用 DECODE vs CASE (我在这里看到的)进一步细化@YoYo 对@a _ horse _ with _ no _ name 的基于 row _ number ()的方法的更正。我看到@Martin Vrbovsky 也有这样的案例分析答案。

select
col1,
listagg(col2, ',') within group (order by col2) AS col2_list,
listagg(col3, ',') within group (order by col3) AS col3_list,
SUM(col4) AS col4
from (
select
col1,
decode(row_number() over (partition by col1, col2 order by null),1,col2) as col2,
decode(row_number() over (partition by col1, col3 order by null),1,col3) as col3
from foo
)
group by col1;

非常简单-在您的查询中使用一个子查询,其中有一个截然不同的选择:

SELECT question_id,
LISTAGG(element_id, ',') WITHIN GROUP (ORDER BY element_id)
FROM
(SELECT distinct question_id, element_id
FROM YOUR_TABLE)
GROUP BY question_id;