小编典典

查找忽略重叠的总分钟数(将基于光标的答案转换为CTE)

sql

有一个现存的问题,询问如何查找多个日期范围内的分钟数,而忽略重叠。

给出的示例数据是(userID并不特别相关)

--Available--
ID  userID  availStart          availEnd
1   456     '2012-11-19 16:00'  '2012-11-19 17:00'
2   456     '2012-11-19 16:00'  '2012-11-19 16:50'
3   456     '2012-11-19 18:00'  '2012-11-19 18:30'
4   456     '2012-11-19 17:30'  '2012-11-19 18:10'
5   456     '2012-11-19 16:00'  '2012-11-19 17:10'
6   456     '2012-11-19 16:00'  '2012-11-19 16:50'

我可以使用游标解决问题,但我认为它应该适合CTE,但是我不知道该怎么做。

方法是按开始时间排列每个范围,然后我们建立一个按顺序合并范围的范围,直到找到不与合并范围重叠的范围。然后,我们计算合并范围内的分钟数,并记住这一点。我们继续进行下一个范围,再次合并任何重叠的范围。每次获得不重叠的起点时,我们都会累积分钟数。最后,我们将累积的分钟数添加到最后一个范围的长度上

很容易看到,由于顺序的缘故,一旦一个范围与之前去过的区域不同,则没有其他范围可以与之前去过的区域重叠,因为它们的开始日期都更大。

Declare
  @UserID int = 456,
  @CurStart datetime, -- our current coalesced range start
  @CurEnd datetime, -- our current coalesced range end
  @AvailStart datetime, -- start or range for our next row of data
  @AvailEnd datetime, -- end of range for our next row of data
  @AccumMinutes int = 0 -- how many minutes so far accumulated by distinct ranges

Declare MinCursor Cursor Fast_Forward For
Select
  AvailStart, AvailEnd
From
  dbo.Available
Where
  UserID = @UserID
Order By
  AvailStart

Open MinCursor

Fetch Next From MinCursor Into @AvailStart, @AvailEnd
Set @CurStart = @AvailStart
Set @CurEnd = @AvailEnd

While @@Fetch_Status = 0
Begin
  If @AvailStart <= @CurEnd -- Ranges Overlap, so coalesce and continue
    Begin
    If @AvailEnd > @CurEnd 
      Set @CurEnd = @AvailEnd
    End
  Else -- Distinct range, coalesce minutes from previous range
  Begin
    Set @AccumMinutes = @AccumMinutes + DateDiff(Minute, @CurStart, @CurEnd)
    Set @CurStart = @AvailStart -- Start coalescing a new range
    Set @CurEnd = @AvailEnd
  End
  Fetch Next From MinCursor Into @AvailStart, @AvailEnd
End

Select @AccumMinutes + DateDiff(Minute, @CurStart, @CurEnd) As TotalMinutes

Close MinCursor
Deallocate MinCursor;

让CTE正常工作,这只是递归中的一个愚蠢错误。查询计划的爆炸非常令人印象深刻:

With OrderedRanges as (
  Select
    Row_Number() Over (Partition By UserID Order By AvailStart) AS RN,
    AvailStart,
    AvailEnd
  From
    dbo.Available
  Where
    UserID = 456
),
AccumulateMinutes (RN, Accum, CurStart, CurEnd) as (
  Select
    RN, 0, AvailStart, AvailEnd
  From
    OrderedRanges
  Where 
    RN = 1
  Union All
  Select
    o.RN, 
    a.Accum + Case When o.AvailStart <= a.CurEnd Then
        0
      Else 
        DateDiff(Minute, a.CurStart, a.CurEnd)
      End,
    Case When o.AvailStart <= a.CurEnd Then 
        a.CurStart
      Else
        o.AvailStart
      End,
    Case When o.AvailStart <= a.CurEnd Then
        Case When a.CurEnd > o.AvailEnd Then a.CurEnd Else o.AvailEnd End
      Else
        o.AvailEnd
      End
  From
    AccumulateMinutes a
        Inner Join 
    OrderedRanges o On 
        a.RN = o.RN - 1
)

Select Max(Accum + datediff(Minute, CurStart, CurEnd)) From AccumulateMinutes

这是否适合CTE,是否有通用的模式以这种方式累积列表?

http://sqlfiddle.com/#!6/ac021/2


阅读 163

收藏
2021-04-07

共1个答案

小编典典

以下查询根据您的定义在数据中查找周期。它首先使用相关的子查询来确定记录是否是一个周期的开始(即与较早的时间段没有重叠)。然后,它将“
periodStart”分配为最近的开始,即不重叠周期的开始。

以下(未经测试的)查询采用了这种方法:

with TimeWithOverlap as (
     select t.*,
            (case when exists (select * from dbo.Available tbefore where t.availStart > tbefore.availStart and tbefore.availEnd >= t.availStart)
                  then 0
                  else 1
             end) as IsPeriodStart
     from dbo.Available t 
    ),
    TimeWithPeriodStart as (
     select two.*,
            (select MAX(two1.AvailStart) from TimeWithOverlap two1 where IsPeriodStart = 1 and two1.AvailStart <= two.AvailStart
            ) as periodStart
     from TimeWithOverlap two
    )
select periodStart, MAX(AvailEnd) as periodEnd
from TimeWithPeriodStart twps
group by periodStart;

http://sqlfiddle.com/#!6/3483c/20(第二个查询)

如果两个周期都同时开始,则它仍然有效,因为AvailStart值相同。由于相关的子查询,即使在中等大小的数据集上,这也可能效果不佳。

还有其他方法可以解决此问题。例如,如果您有SQL Server 2012,则可以使用累积和函数,该函数提供了一种更简单的方法。

2021-04-07