admin

SQL Server 2008: Delete duplicate rows

sql

My table contains duplicate rows. How can I delete them based on the value of a single column?

For example:

uniqueid, col2, col3 ...
1, john, simpson
2, sally, roberts
1, johnny, simpson

I want to delete any rows with a duplicate uniqueid,
to get:

1, john, simpson
2, sally, roberts

2021-05-10

1 answer

admin

You can DELETE from a CTE:

-- YourTable is a placeholder for your table name (TABLE itself is a reserved word).
WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
             FROM YourTable)
DELETE FROM cte
WHERE RowRank > 1;

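The ROW_NUMBER() function assigns a number to each row. PARTITION BY restarts the numbering for each group; in this case, the rows for each value of uniqueid are numbered starting at 1 and incrementing from there. ORDER BY determines the order in which the numbers are assigned. Because the numbering starts at 1 for each uniqueid, any row with a ROW_NUMBER() greater than 1 has a duplicate uniqueid.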
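To make the explanation above concrete, here is a minimal end-to-end sketch using the three sample rows from the question. The temp table name #People and the column types are assumptions made for illustration only; they are not from the original post.

```sql
-- Hypothetical temp table holding the sample rows from the question.
CREATE TABLE #People (uniqueid INT, col2 VARCHAR(50), col3 VARCHAR(50));

INSERT INTO #People (uniqueid, col2, col3)
VALUES (1, 'john',   'simpson'),
       (2, 'sally',  'roberts'),
       (1, 'johnny', 'simpson');

-- Number the rows within each uniqueid, then delete everything after the first.
WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
    FROM #People
)
DELETE FROM cte
WHERE RowRank > 1;

-- Remaining rows: (1, john, simpson) and (2, sally, roberts).
SELECT uniqueid, col2, col3 FROM #People ORDER BY uniqueid;

DROP TABLE #People;
```

Because the rows are ordered by col2 within each uniqueid, 'john' sorts before 'johnny' and is the row that survives.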

To see how the ROW_NUMBER() function works, just try it:

-- YourTable is a placeholder for your table name (TABLE itself is a reserved word).
SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
FROM YourTable
ORDER BY uniqueid;
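If you would rather not touch a real table, the same query can be tried against the question's sample rows supplied inline. This is only a sketch: the derived-table alias t is an assumption, and the VALUES table constructor requires SQL Server 2008 or later.

```sql
-- Preview the numbering on the question's sample data without creating a table.
SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
FROM (VALUES (1, 'john',   'simpson'),
             (2, 'sally',  'roberts'),
             (1, 'johnny', 'simpson')) AS t (uniqueid, col2, col3)
ORDER BY uniqueid, RowRank;

-- Result:
-- uniqueid  col2    col3     RowRank
-- 1         john    simpson  1
-- 1         johnny  simpson  2
-- 2         sally   roberts  1
```

The row with RowRank = 2 is the one the DELETE would remove.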

You can adjust the logic of the ROW_NUMBER() function to control which records are kept and which are deleted.

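For example, you might want to do this in multiple steps, first deleting records that have the same last name but a different first name. In that case, add the last name to the PARTITION BY: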

-- YourTable is a placeholder for your table name (TABLE itself is a reserved word).
WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid, col3 ORDER BY col2) AS RowRank
             FROM YourTable)
DELETE FROM cte
WHERE RowRank > 1;
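As a rough illustration of how the two-column partition behaves, the sketch below adds one made-up row (1, john, smith) to the question's sample data. The temp table #People, the column types, and the extra row are all assumptions for demonstration purposes.

```sql
-- Hypothetical temp table; the 'smith' row is invented to show the effect.
CREATE TABLE #People (uniqueid INT, col2 VARCHAR(50), col3 VARCHAR(50));

INSERT INTO #People (uniqueid, col2, col3)
VALUES (1, 'john',   'simpson'),
       (1, 'johnny', 'simpson'),
       (1, 'john',   'smith'),
       (2, 'sally',  'roberts');

-- Numbering now restarts for each (uniqueid, col3) pair,
-- so only rows sharing both values count as duplicates.
WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid, col3 ORDER BY col2) AS RowRank
    FROM #People
)
DELETE FROM cte
WHERE RowRank > 1;

-- Remaining rows: (1, john, simpson), (1, john, smith), (2, sally, roberts).
SELECT uniqueid, col2, col3 FROM #People ORDER BY uniqueid, col3;

DROP TABLE #People;
```

Only (1, johnny, simpson) is removed in this step. The smith row shares uniqueid 1 but falls into its own partition, so it is left for a later pass.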
2021-05-10