Delete all duplicate rows except for one in MySQL?

How would I delete all duplicate data from a MySQL table?

For example, with the following data:

SELECT * FROM names;


+----+--------+
| id | name   |
+----+--------+
| 1  | google |
| 2  | yahoo  |
| 3  | msn    |
| 4  | google |
| 5  | google |
| 6  | yahoo  |
+----+--------+

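If it were a SELECT query, I would use SELECT DISTINCT name FROM names;

How would I do this with DELETE, so that only the duplicates are removed and exactly one record of each name is kept?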


If you want to keep the row with the lowest id value:

DELETE FROM names
WHERE id NOT IN (SELECT *
                 FROM (SELECT MIN(n.id)
                       FROM names n
                       GROUP BY n.name) x);

If you want to keep the row with the highest id value:

DELETE FROM names
WHERE id NOT IN (SELECT *
                 FROM (SELECT MAX(n.id)
                       FROM names n
                       GROUP BY n.name) x);

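The subquery within a subquery is necessary for MySQL; without it you will get error 1093 ("You can't specify target table 'names' for update in FROM clause").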
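To try the keep-lowest-id statement without touching a real database, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL, an assumption made only so the example is self-contained. SQLite does not raise error 1093, so the derived-table wrapper is harmless there but not required. The names `dedupe_keep_lowest` and `demo_keep_lowest` are hypothetical.

```python
import sqlite3

# The sample rows from the question.
SAMPLE_ROWS = [
    (1, "google"), (2, "yahoo"), (3, "msn"),
    (4, "google"), (5, "google"), (6, "yahoo"),
]


def dedupe_keep_lowest(conn):
    """Delete every row whose id is not the smallest id for its name."""
    conn.execute(
        """
        DELETE FROM names
        WHERE id NOT IN (SELECT keep_id
                         FROM (SELECT MIN(n.id) AS keep_id
                               FROM names n
                               GROUP BY n.name) x)
        """
    )
    conn.commit()


def demo_keep_lowest():
    # In-memory SQLite database; MySQL is NOT involved here (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO names (id, name) VALUES (?, ?)", SAMPLE_ROWS)
    dedupe_keep_lowest(conn)
    rows = conn.execute("SELECT id, name FROM names ORDER BY id").fetchall()
    conn.close()
    return rows
```

Calling `demo_keep_lowest()` leaves one row per name, each with the lowest id of its group. Swapping MIN for MAX keeps the highest id instead.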

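Editor warning: This solution is computationally inefficient and may bring down your connection on a large table.

NB: You need to do this first on a test copy of your table!

When I did it, I found that unless I also included AND n1.id <> n2.id, it deleted every row in the table.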

  1. If you want to keep the row with the lowest id value:

    DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name
    
  2. If you want to keep the row with the highest id value:

    DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name
    

I used this method in MySQL 5.1; I'm not sure about other versions.
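The effect of the id comparison in the two self-join deletes can be checked on the sample data with a small sketch. It is written in Python against the standard-library sqlite3 module, which is an assumption made for convenience and not MySQL. SQLite has no multi-table DELETE, so the self-join is swapped for an equivalent correlated EXISTS. The dictionary `ID_CONDITIONS` and the function `surviving_ids` are hypothetical names.

```python
import sqlite3

ROWS = [(1, "google"), (2, "yahoo"), (3, "msn"),
        (4, "google"), (5, "google"), (6, "yahoo")]

# Hard-coded id comparisons. The None entry reproduces the mistake of
# matching on name alone, where every row matches itself.
ID_CONDITIONS = {
    "lowest": " AND n2.id < names.id",   # delete rows that have a smaller-id twin
    "highest": " AND n2.id > names.id",  # delete rows that have a larger-id twin
    None: "",
}


def surviving_ids(keep):
    """Run the delete on a fresh in-memory table and return the ids left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO names VALUES (?, ?)", ROWS)
    # Correlated EXISTS stands in for MySQL's DELETE n1 FROM names n1, names n2.
    conn.execute(
        "DELETE FROM names WHERE EXISTS ("
        " SELECT 1 FROM names n2"
        " WHERE n2.name = names.name" + ID_CONDITIONS[keep] + ")"
    )
    ids = [row[0] for row in conn.execute("SELECT id FROM names ORDER BY id")]
    conn.close()
    return ids
```

With "lowest" the surviving ids are 1, 2 and 3. With "highest" they are 3, 5 and 6. With no id comparison at all the table ends up empty, which matches the warning about every row being deleted.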


Update: Since people Googling for how to remove duplicates end up here: although the OP's question is about DELETE, please be advised that using INSERT and DISTINCT is much faster. For a database with 8 million rows, the query below took 13 minutes, whereas the DELETE ran for more than 2 hours and still had not completed.

INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
SELECT DISTINCT cellId,attributeId,entityRowId,value
FROM tableName;
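Applied to the names table from the question, the same copy-the-distinct-rows idea might look like the sketch below. It runs on Python's standard-library sqlite3 module rather than MySQL (an assumption, to keep it self-contained). The target table `names_dedup` and the function `copy_distinct_names` are hypothetical.

```python
import sqlite3


def copy_distinct_names():
    """Copy each distinct name once into a second table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO names (id, name) VALUES (?, ?)",
        [(1, "google"), (2, "yahoo"), (3, "msn"),
         (4, "google"), (5, "google"), (6, "yahoo")],
    )
    # names_dedup is a hypothetical target table; ids are reassigned on insert.
    conn.execute("CREATE TABLE names_dedup (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO names_dedup (name) SELECT DISTINCT name FROM names")
    before = conn.execute("SELECT COUNT(*) FROM names").fetchone()[0]
    copied = sorted(r[0] for r in conn.execute("SELECT name FROM names_dedup"))
    conn.close()
    return before, copied
```

One trade-off is that the original id values are not carried over unless they are selected explicitly, for example with MIN(id) and GROUP BY name in place of DISTINCT.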