在 Oracle 中从表中删除重复行

我在 Oracle 中测试了一些东西,并用一些示例数据填充了一个表,但是在这个过程中我意外地加载了重复的记录,所以现在我不能使用一些列来创建主键。

如何删除所有重复行并只留下其中一行?

609914 次浏览

使用 rowid假列。

DELETE FROM your_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM your_table
GROUP BY column1, column2, column3);

其中 column1column2column3组成每条记录的标识键。您可以列出所有列。

来自 DevX.com:

DELETE FROM our_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM our_table
GROUP BY column1, column2, column3...) ;

其中 column 1、 column 2等是要使用的键。

您应该使用游标 for 循环执行一个小的 pl/sql 块,并删除不想保留的行。例如:

declare
prev_var my_table.var1%TYPE;


begin


for t in (select var1 from my_table order by var 1) LOOP


-- if previous var equal current var, delete the row, else keep on going.
end loop;


end;
DELETE FROM tablename a
WHERE a.ROWID > ANY (SELECT b.ROWID
FROM tablename b
WHERE a.fieldname = b.fieldname
AND a.fieldname2 = b.fieldname2)

来自 问问汤姆

delete from t
where rowid IN ( select rid
from (select rowid rid,
row_number() over (partition by
companyid, agentid, class , status, terminationdate
order by rowid) rn
from t)
where rn <> 1);

(修正了漏掉的括号)

创建与 t1截然不同的表 t2;

create or replace procedure delete_duplicate_enq as
cursor c1 is
select *
from enquiry;
begin
for z in c1 loop
delete enquiry
where enquiry.enquiryno = z.enquiryno
and rowid > any
(select rowid
from enquiry
where enquiry.enquiryno = z.enquiryno);
end loop;
end delete_duplicate_enq;
DELETE FROM tableName  WHERE ROWID NOT IN (SELECT   MIN (ROWID) FROM table GROUP BY columnname);
delete from dept
where rowid in (
select rowid
from dept
minus
select max(rowid)
from dept
group by DEPTNO, DNAME, LOC
);

对于大桌子来说最快的方法

  1. 创建结构如下的异常表: 例外 _ 表

    ROW_ID ROWID
    OWNER VARCHAR2(30)
    TABLE_NAME VARCHAR2(30)
    CONSTRAINT VARCHAR2(30)
    
  2. Try create a unique constraint or primary key which will be violated by the duplicates. You will get an error message because you have duplicates. The exceptions table will contain the rowids for the duplicate rows.

    alter table add constraint
    unique --or primary key
    (dupfield1,dupfield2) exceptions into exceptions_table;
    
  3. Join your table with exceptions_table by rowid and delete dups

    delete original_dups where rowid in (select ROW_ID from exceptions_table);
    
  4. If the amount of rows to delete is big, then create a new table (with all grants and indexes) anti-joining with exceptions_table by rowid and rename the original table into original_dups table and rename new_table_with_no_dups into original table

    create table new_table_with_no_dups AS (
    select field1, field2 ........
    from original_dups t1
    where not exists ( select null from exceptions_table T2 where t1.rowid = t2.row_id )
    )
    

要选择副本,只能使用以下查询格式:

SELECT GroupFunction(column1), GroupFunction(column2),...,
COUNT(column1), column1, column2...
FROM our_table
GROUP BY column1, column2, column3...
HAVING COUNT(column1) > 1

因此,根据其他建议,正确的问题是:

DELETE FROM tablename a
WHERE a.ROWID > ANY (SELECT b.ROWID
FROM tablename b
WHERE a.fieldname = b.fieldname
AND a.fieldname2 = b.fieldname2
AND ....so on.. to identify the duplicate rows....)

这个查询将为 WHERE CLAUSE中选择的条件保留数据库中最早的记录。

甲骨文认证助理(2008)

为了获得最佳表现,我这样写道:
(见执行计划)

DELETE FROM your_table
WHERE rowid IN
(select t1.rowid from your_table  t1
LEFT OUTER JOIN (
SELECT MIN(rowid) as rowid, column1,column2, column3
FROM your_table
GROUP BY column1, column2, column3
)  co1 ON (t1.rowid = co1.rowid)
WHERE co1.rowid IS NULL
);

使用 rowid-

delete from emp
where rowid not in
(select max(rowid) from emp group by empno);

使用自连接-

delete from emp e1
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );

解决方案1)

delete from emp
where rowid not in
(select max(rowid) from emp group by empno);

解决方案2)

delete from emp where rowid in
(
select rid from
(
select rowid rid,
row_number() over(partition by empno order by empno) rn
from emp
)
where rn > 1
);

解决方案3)

delete from emp e1
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );

解决方案4)

 delete from emp where rowid in
(
select rid from
(
select rowid rid,
dense_rank() over(partition by empno order by rowid
) rn
from emp
)
where rn > 1
);

1. 解决方案

delete from emp
where rowid not in
(select max(rowid) from emp group by empno);

2. 荡妇

delete from emp where rowid in
(
select rid from
(
select rowid rid,
row_number() over(partition by empno order by empno) rn
from emp
)
where rn > 1
);

3. 解决方案

delete from emp e1
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );

4. 解决方案

 delete from emp where rowid in
(
select rid from
(
select rowid rid,
dense_rank() over(partition by empno order by rowid
) rn
from emp
)
where rn > 1
);

5. 解决方案

delete from emp where rowid in
(
select  rid from
(
select rowid rid,rank() over (partition by emp_id order by rowid)rn from emp
)
where rn > 1
);
DELETE from table_name where rowid not in (select min(rowid) FROM table_name group by column_name);

还可以用另一种方式删除重复记录

DELETE from table_name a where rowid > (select min(rowid) FROM table_name b where a.column=b.column);
create table abcd(id number(10),name varchar2(20))


insert into abcd values(1,'abc')


insert into abcd values(2,'pqr')




insert into abcd values(3,'xyz')


insert into abcd values(1,'abc')


insert into abcd values(2,'pqr')


insert into abcd values(3,'xyz')




select * from abcd
id  Name
1   abc
2   pqr
3   xyz
1   abc
2   pqr
3   xyz


Delete Duplicate record but keep Distinct Record in table


DELETE
FROM abcd a
WHERE ROWID > (SELECT MIN(ROWID) FROM abcd b
WHERE b.id=a.id
);


run the above query 3 rows delete


select * from abcd


id  Name
1   abc
2   pqr
3   xyz

检查下面的脚本-

1.

Create table test(id int,sal int);

2.

    insert into test values(1,100);
insert into test values(1,100);
insert into test values(2,200);
insert into test values(2,200);
insert into test values(3,300);
insert into test values(3,300);
commit;

3.

 select * from test;

你会在这里看到6张唱片。
运行如下查询-

delete from
test
where rowid in
(select rowid from
(select
rowid,
row_number()
over
(partition by id order by sal) dup
from test)
where dup > 1)
  1. select * from test;

您将看到重复的记录已被删除。
希望这能解决你的疑问。 谢谢:)

我没有看到任何使用常见表表达式和窗口函数的答案。 这是我觉得最容易处理的。

DELETE FROM
YourTable
WHERE
ROWID IN
(WITH Duplicates
AS (SELECT
ROWID RID,
ROW_NUMBER()
OVER(
PARTITION BY First_Name, Last_Name, Birth_Date)
AS RN
SUM(1)
OVER(
PARTITION BY First_Name, Last_Name, Birth_Date
ORDER BY ROWID ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING)
AS CNT
FROM
YourTable
WHERE
Load_Date IS NULL)
SELECT
RID
FROM
duplicates
WHERE
RN > 1);

需要注意的是:

1)我们只检查分割子句中的字段是否重复。

2)如果你有理由选择一个副本而不是其他副本,你可以使用一个 order by 子句使得这一行的 row _ number () = 1

3)你可以通过把最后一个 Where 子句改为“ Where RN > N”(N > = 1)来修改保留的数字副本(我认为 N = 0会删除所有有副本的行,但它只会删除所有行)。

4)添加了 Sum 分区字段的 CTE 查询,该查询将用组中的数字行标记每一行。因此,要选择具有重复项(包括第一项)的行,请使用“ WHERE cnt > 1”。

解决方案:

delete from emp where rowid in
(
select rid from
(
select rowid rid,
row_number() over(partition by empno order by empno) rn
from emp
)
where rn > 1
);

这篇博客文章 对一般情况很有帮助:

如果行被完全复制(所有列中的所有值都可以有副本) ,那么就没有列可以使用了!但是为了保持一个唯一标识符,你仍然需要为每组中的每一行设置一个标记。 幸运的是,Oracle 已经有了一些您可以使用的东西。 Oracle 中的所有行都有一个 rowid。这是个物理定位器。也就是说,它指出 Oracle 在磁盘上存储行的位置。这对每一行都是独一无二的。因此,可以使用此值来标识和删除副本。为此,在不相关的 delete 中将 min ()替换为 min (rowid) :

delete films
where  rowid not in (
select min(rowid)
from   films
group  by title, uk_release_date
)

这与上面的答案类似,但给了我一个更好的解释方案:

delete from your_table
where rowid in (
select max(rowid)
from your_table
group by column1, column2, column3
having count(*) > 1
);