I have a customer table in SQL named Customer_SCD with three columns: Customer_Name, Customer_ID, and Customer_TimeStamp.
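The table contains duplicate entries that differ only in their timestamp.
For example:
ABC, 1, 2012-12-05 11:58:20.370
ABC, 1, 2012-12-03 12:11:09.840
I want to remove these duplicates from the database and keep only the row with the earliest date/time.
Thanks.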
This works, give it a try:

    DELETE Customer_SCD
    OUTPUT deleted.*
    FROM Customer_SCD b
    JOIN (
        SELECT MIN(a.Customer_TimeStamp) Customer_TimeStamp,
               Customer_ID,
               Customer_Name
        FROM Customer_SCD a
        GROUP BY a.Customer_ID, a.Customer_Name
    ) c
        ON  c.Customer_ID = b.Customer_ID
        AND c.Customer_Name = b.Customer_Name
        AND c.Customer_TimeStamp <> b.Customer_TimeStamp
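The subquery finds the earliest record for each Customer_Name, Customer_ID pair, and the DELETE then removes every other record for that pair as a duplicate. I also added the OUTPUT clause, which returns the rows affected by the statement.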
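If you want to see what the join matches before deleting anything, a preview along these lines may help. This is only a sketch: the column types in the CREATE TABLE are my assumption (the question does not state them), and the two rows are the sample rows from the question, written with a `T` separator so the date literal is not affected by the session's date format.

```sql
-- Hypothetical test setup: the column types are assumptions, not taken from the question.
CREATE TABLE Customer_SCD (
    Customer_Name      varchar(50),
    Customer_ID        int,
    Customer_TimeStamp datetime
);

INSERT INTO Customer_SCD (Customer_Name, Customer_ID, Customer_TimeStamp)
VALUES ('ABC', 1, '2012-12-05T11:58:20.370'),
       ('ABC', 1, '2012-12-03T12:11:09.840');

-- Preview only: same join as the DELETE, but as a SELECT.
-- Returns the rows that are NOT the earliest for their (Customer_ID, Customer_Name) pair.
SELECT b.Customer_Name, b.Customer_ID, b.Customer_TimeStamp
FROM Customer_SCD b
JOIN (
    SELECT MIN(a.Customer_TimeStamp) AS Customer_TimeStamp,
           a.Customer_ID,
           a.Customer_Name
    FROM Customer_SCD a
    GROUP BY a.Customer_ID, a.Customer_Name
) c
    ON  c.Customer_ID = b.Customer_ID
    AND c.Customer_Name = b.Customer_Name
    AND c.Customer_TimeStamp <> b.Customer_TimeStamp;
```

With the two sample rows, only the 2012-12-05 row should come back, since 2012-12-03 is the earliest timestamp for that pair and is the one that gets kept.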
You can also do this with the ranking function ROW_NUMBER:

    DELETE Customer_SCD
    OUTPUT deleted.*
    FROM Customer_SCD b
    JOIN (
        SELECT Customer_ID,
               Customer_Name,
               Customer_TimeStamp,
               ROW_NUMBER() OVER (PARTITION BY Customer_ID, Customer_Name
                                  ORDER BY Customer_TimeStamp) num
        FROM Customer_SCD
    ) c
        ON  c.Customer_ID = b.Customer_ID
        AND c.Customer_Name = b.Customer_Name
        AND c.Customer_TimeStamp = b.Customer_TimeStamp
        AND c.num <> 1

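Check which query costs less on your data and use that one. When I checked, the first approach was more efficient (it had a better execution plan).
Here is a SQL Fiddle.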
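One way to compare the two on your own data without losing any rows is to run each candidate inside a transaction that you roll back, with the I/O and timing statistics switched on. The sketch below uses the first query, with the alias `b` as the delete target (my choice, not how the statement is written above). Swap in the ROW_NUMBER version for the second run and compare the figures in the Messages output.

```sql
-- Report logical reads and CPU/elapsed time for each statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

BEGIN TRANSACTION;

-- Candidate under test: the MIN()-based delete, targeting alias b.
DELETE b
OUTPUT deleted.*
FROM Customer_SCD b
JOIN (
    SELECT MIN(a.Customer_TimeStamp) AS Customer_TimeStamp,
           a.Customer_ID,
           a.Customer_Name
    FROM Customer_SCD a
    GROUP BY a.Customer_ID, a.Customer_Name
) c
    ON  c.Customer_ID = b.Customer_ID
    AND c.Customer_Name = b.Customer_Name
    AND c.Customer_TimeStamp <> b.Customer_TimeStamp;

-- Undo the delete so the next candidate runs against the same data.
ROLLBACK TRANSACTION;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The result can depend on table size and on which indexes exist, so it is worth measuring against your real table and not just a two-row sample.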