小编典典

在Postgres中查询连续列的“运行”

sql

我有一张table:

create table table1 (event_id integer, event_time timestamp without time zone);
insert into table1 (event_id, event_time) values
(1, '2011-01-01 00:00:00'),
(2, '2011-01-01 00:00:15'),
(3, '2011-01-01 00:00:29'),
(4, '2011-01-01 00:00:58'),
(5, '2011-01-02 06:03:00'),
(6, '2011-01-02 06:03:09'),
(7, '2011-01-05 11:01:31'),
(8, '2011-01-05 11:02:15'),
(9, '2011-01-06 09:34:19'),
(10, '2011-01-06 09:34:41'),
(11, '2011-01-06 09:35:06');

我想构造一个语句,给定一个事件可以返回从该事件开始的事件“运行”的长度。运行由以下方式定义:

  1. 如果两个事件之间的时间间隔不超过30秒,则两个事件将同时运行。
  2. 如果A和B一起运行,并且B和C一起运行,则A与C一起运行。

但是,我的查询不需要在时间上倒退,因此,如果我选择事件2,则仅将事件2、3和4计为从2开始的事件运行的一部分,而应将3作为事件返回。运行时间。

有任何想法吗?我很困惑


阅读 233

收藏
2021-04-07

共1个答案

小编典典

这是递归CTE解决方案。(孤岛和空白问题自然会导致递归CTE)

WITH RECURSIVE runrun AS (
    SELECT event_id, event_time
    , event_time - ('30 sec'::interval) AS low_time
    , event_time + ('30 sec'::interval) AS high_time
    FROM table1
    UNION
    SELECT t1.event_id, t1.event_time
    , LEAST ( rr.low_time, t1.event_time - ('30 sec'::interval) ) AS low_time
    , GREATEST ( rr.high_time, t1.event_time + ('30 sec'::interval) ) AS high_time
    FROM table1 t1
    JOIN runrun rr ON t1.event_time >= rr.low_time
                  AND t1.event_time < rr.high_time
    )
SELECT DISTINCT ON (event_id) *
FROM runrun rr
WHERE rr.event_time >= '2011-01-01 00:00:15'
AND rr.low_time <= '2011-01-01 00:00:15'
AND rr.high_time > '2011-01-01 00:00:15'
    ;

结果:

 event_id |     event_time      |      low_time       |      high_time      
----------+---------------------+---------------------+---------------------
        2 | 2011-01-01 00:00:15 | 2010-12-31 23:59:45 | 2011-01-01 00:00:45
        3 | 2011-01-01 00:00:29 | 2010-12-31 23:59:45 | 2011-01-01 00:01:28
        4 | 2011-01-01 00:00:58 | 2010-12-31 23:59:30 | 2011-01-01 00:01:28
(3 rows)
2021-04-07