我有一个临时数据库表,其中某些数据是重复的。
EmployeeId StartDate EndDate Column1 Column2 1000 2009/05/01 2010/04/30 X Y 1000 2010/05/01 2011/04/30 X Y 1000 2011/05/01 2012/04/30 X X 1000 2012/05/01 2013/04/30 X Y 1000 2013/05/01 NULL X X
如上所示,有些行是多余的,可以合并为单个行而不会违反数据有效性。我想尽可能合并这样的行,结果应该像这样
EmployeeId StartDate EndDate Column1 Column2 1000 2009/05/01 2011/04/30 X Y 1000 2011/05/01 2012/04/30 X X 1000 2012/05/01 2013/04/30 X Y 1000 2013/05/01 NULL X X
如何做到这一点?
如果可以确保所有开始日期和结束日期都是连续的,请尝试以下操作:
with t1 as --tag first row with 1 in a continuous time series ( select t1.*, case when t1.column1=t2.column1 and t1.column2=t2.column2 then 0 else 1 end as tag from your_table t1 left join your_table t2 on t1.EmployeeId= t2.EmployeeId and dateadd(day,-1,t1.StartDate)= t2.EndDate ) select t1.EmployeeId, t1.StartDate, case when min(T2.StartDate) is null then null else dateadd(day,-1,min(T2.StartDate)) end as EndDate, t1.Column1, t1.Column2 from (select t1.* from t1 where tag=1 ) as t1 -- to get StartDate left join (select t1.* from t1 where tag=1 ) as t2 -- to get a new EndDate on t1.EmployeeId= t2.EmployeeId and t1.StartDate < t2.StartDate group by t1.EmployeeId, t1.StartDate, t1.Column1, t1.Column2