有一个现存的问题,询问如何查找多个日期范围内的分钟数,而忽略重叠。
给出的示例数据是(userID并不特别相关)
--Available-- ID userID availStart availEnd 1 456 '2012-11-19 16:00' '2012-11-19 17:00' 2 456 '2012-11-19 16:00' '2012-11-19 16:50' 3 456 '2012-11-19 18:00' '2012-11-19 18:30' 4 456 '2012-11-19 17:30' '2012-11-19 18:10' 5 456 '2012-11-19 16:00' '2012-11-19 17:10' 6 456 '2012-11-19 16:00' '2012-11-19 16:50'
我可以使用游标解决问题,但我认为它应该适合CTE,但是我不知道该怎么做。
方法是按开始时间排列每个范围,然后我们建立一个按顺序合并范围的范围,直到找到不与合并范围重叠的范围。然后,我们计算合并范围内的分钟数,并记住这一点。我们继续进行下一个范围,再次合并任何重叠的范围。每次获得不重叠的起点时,我们都会累积分钟数。最后,我们将累积的分钟数添加到最后一个范围的长度上
很容易看到,由于顺序的缘故,一旦一个范围与之前去过的区域不同,则没有其他范围可以与之前去过的区域重叠,因为它们的开始日期都更大。
Declare @UserID int = 456, @CurStart datetime, -- our current coalesced range start @CurEnd datetime, -- our current coalesced range end @AvailStart datetime, -- start or range for our next row of data @AvailEnd datetime, -- end of range for our next row of data @AccumMinutes int = 0 -- how many minutes so far accumulated by distinct ranges Declare MinCursor Cursor Fast_Forward For Select AvailStart, AvailEnd From dbo.Available Where UserID = @UserID Order By AvailStart Open MinCursor Fetch Next From MinCursor Into @AvailStart, @AvailEnd Set @CurStart = @AvailStart Set @CurEnd = @AvailEnd While @@Fetch_Status = 0 Begin If @AvailStart <= @CurEnd -- Ranges Overlap, so coalesce and continue Begin If @AvailEnd > @CurEnd Set @CurEnd = @AvailEnd End Else -- Distinct range, coalesce minutes from previous range Begin Set @AccumMinutes = @AccumMinutes + DateDiff(Minute, @CurStart, @CurEnd) Set @CurStart = @AvailStart -- Start coalescing a new range Set @CurEnd = @AvailEnd End Fetch Next From MinCursor Into @AvailStart, @AvailEnd End Select @AccumMinutes + DateDiff(Minute, @CurStart, @CurEnd) As TotalMinutes Close MinCursor Deallocate MinCursor;
让CTE正常工作,这只是递归中的一个愚蠢错误。查询计划的爆炸非常令人印象深刻:
With OrderedRanges as ( Select Row_Number() Over (Partition By UserID Order By AvailStart) AS RN, AvailStart, AvailEnd From dbo.Available Where UserID = 456 ), AccumulateMinutes (RN, Accum, CurStart, CurEnd) as ( Select RN, 0, AvailStart, AvailEnd From OrderedRanges Where RN = 1 Union All Select o.RN, a.Accum + Case When o.AvailStart <= a.CurEnd Then 0 Else DateDiff(Minute, a.CurStart, a.CurEnd) End, Case When o.AvailStart <= a.CurEnd Then a.CurStart Else o.AvailStart End, Case When o.AvailStart <= a.CurEnd Then Case When a.CurEnd > o.AvailEnd Then a.CurEnd Else o.AvailEnd End Else o.AvailEnd End From AccumulateMinutes a Inner Join OrderedRanges o On a.RN = o.RN - 1 ) Select Max(Accum + datediff(Minute, CurStart, CurEnd)) From AccumulateMinutes
这是否适合CTE,是否有通用的模式以这种方式累积列表?
http://sqlfiddle.com/#!6/ac021/2
以下查询根据您的定义在数据中查找周期。它首先使用相关的子查询来确定记录是否是一个周期的开始(即与较早的时间段没有重叠)。然后,它将“ periodStart”分配为最近的开始,即不重叠周期的开始。
以下(未经测试的)查询采用了这种方法:
with TimeWithOverlap as ( select t.*, (case when exists (select * from dbo.Available tbefore where t.availStart > tbefore.availStart and tbefore.availEnd >= t.availStart) then 0 else 1 end) as IsPeriodStart from dbo.Available t ), TimeWithPeriodStart as ( select two.*, (select MAX(two1.AvailStart) from TimeWithOverlap two1 where IsPeriodStart = 1 and two1.AvailStart <= two.AvailStart ) as periodStart from TimeWithOverlap two ) select periodStart, MAX(AvailEnd) as periodEnd from TimeWithPeriodStart twps group by periodStart;
http://sqlfiddle.com/#!6/3483c/20(第二个查询)
如果两个周期都同时开始,则它仍然有效,因为AvailStart值相同。由于相关的子查询,即使在中等大小的数据集上,这也可能效果不佳。
还有其他方法可以解决此问题。例如,如果您有SQL Server 2012,则可以使用累积和函数,该函数提供了一种更简单的方法。