Suppose my table is

WITH `sample_project.sample_dataset.table` AS (
  SELECT 'user1' user, 2 sequence, 'T1' ts UNION ALL
  SELECT 'user1', 2, 'T2' UNION ALL
  SELECT 'user1', 1, 'T3' UNION ALL
  SELECT 'user1', 1, 'T4' UNION ALL
  SELECT 'user1', 3, 'T5' UNION ALL
  SELECT 'user1', 2, 'T6' UNION ALL
  SELECT 'user1', 3, 'T7' UNION ALL
  SELECT 'user1', 3, 'T8'
)
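Is it possible to check whether a given integer subsequence is available in the sequence column without using STRING_AGG and REGEX, or JOIN operations? The goal is to make the query more efficient.

A subsequence is derived from a string. For example, consider the string "banana": one subsequence is "anna", because the indices in "banana" of the characters of "anna" are strictly increasing. The characters of a subsequence do not have to be contiguous.

For the table above, ordering by timestamp (ascending) gives the sequence column as 22113233. In the string 22113233 the subsequence 1 2 3 is available, whereas the subsequence 3 2 1 is not available. Given a subsequence 213, how can I tell whether it is available (in 22113233, ordered by timestamp)?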
The example below is for BigQuery Standard SQL

#standardSQL
WITH `sequences` AS (
  SELECT '22113233' sequence_list
), `subsequenses` AS (
  SELECT '123' subsequence UNION ALL
  SELECT '321' UNION ALL
  SELECT '213'
)
SELECT
  sequence_list,
  subsequence,
  -- Replacing the empty pattern puts '.*' around every character of the subsequence
  REGEXP_CONTAINS(sequence_list, REGEXP_REPLACE(subsequence, '', '.*')) available
FROM `sequences` l
CROSS JOIN `subsequenses` s
with the result

sequence_list  subsequence  available
22113233       321          false
22113233       123          true
22113233       213          true
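
To see why the check works, it can help to look at the pattern that REGEXP_REPLACE builds before it is passed to REGEXP_CONTAINS. The query below is only an illustrative sketch: the inline UNNEST array and the `pattern` alias are not part of the original example. For '213' the generated pattern should be of the form .*2.*1.*3.*, that is, the digits in order with anything allowed in between.

```sql
#standardSQL
-- Show the regular expression generated for each candidate subsequence.
-- The UNNEST array and the `pattern` alias are illustrative assumptions.
SELECT
  subsequence,
  REGEXP_REPLACE(subsequence, '', '.*') AS pattern
FROM UNNEST(['123', '321', '213']) AS subsequence
```

Because the pattern only constrains the relative order of the characters, it matches exactly when the characters of the subsequence appear in the sequence in that order, contiguous or not.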
If you are looking for one specific subsequence, this can be simplified further to

#standardSQL
WITH `sequences` AS (
  SELECT '22113233' sequence_list UNION ALL
  SELECT '11223322'
)
SELECT
  sequence_list,
  REGEXP_CONTAINS(sequence_list, REGEXP_REPLACE('213', '', '.*')) available
FROM `sequences`
with the result

sequence_list  available
22113233       true
11223322       false
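
The examples above start from a ready-made sequence_list string. A minimal sketch of how that string might be derived from the table in the question is shown below. Note that it relies on STRING_AGG and a regular expression, which the question hoped to avoid, so it illustrates the approach rather than answering the efficiency concern. The CTE names `events` and `sequences` are assumptions for illustration, and the approach assumes every sequence value is a single digit (multi-digit values would make the concatenated string ambiguous).

```sql
#standardSQL
WITH events AS (  -- hypothetical stand-in for the table in the question
  SELECT 'user1' AS user, 2 AS sequence, 'T1' AS ts UNION ALL
  SELECT 'user1', 2, 'T2' UNION ALL
  SELECT 'user1', 1, 'T3' UNION ALL
  SELECT 'user1', 1, 'T4' UNION ALL
  SELECT 'user1', 3, 'T5' UNION ALL
  SELECT 'user1', 2, 'T6' UNION ALL
  SELECT 'user1', 3, 'T7' UNION ALL
  SELECT 'user1', 3, 'T8'
), sequences AS (
  -- Concatenate each user's sequence values in timestamp order
  SELECT
    user,
    STRING_AGG(CAST(sequence AS STRING), '' ORDER BY ts) AS sequence_list
  FROM events
  GROUP BY user
)
SELECT
  user,
  sequence_list,
  -- Same subsequence check as above, applied to the aggregated string
  REGEXP_CONTAINS(sequence_list, REGEXP_REPLACE('213', '', '.*')) AS available
FROM sequences
```

For the sample rows this aggregates to the string 22113233 used in the examples above, so the check for 213 behaves the same way as in the first result table.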