我正在尝试在BigQuery中加入三个表;表1具有一个事件的记录(即每一行是一个记录),表2具有第二个事件的记录,表3具有类别名称。
我想生成一个最终表,该表具有按类别和设备平台划分的表1和表2的计数。但是,每次运行此命令时,我都会收到一条错误消息,指出 join.t3.category不是join中任何一个表的字段 。
这是我当前的代码:
Select count(distinct joined.t1.Id) as t1_events, count(distinct t2.Id) as t2_events, joined.t1.Origin as platform, joined.t3.category as category from ( SELECT Id, Origin, CatId FROM [testing.table_1] as t1 JOIN (SELECT category, CategoryID FROM [testing.table_3]) as t3 on t1.CatId = t3.CategoryID ) AS joined JOIN (SELECT Id, CategoryId FROM [testing.table_2]) as t2 ON (joined.t1.CatId = t2.CategoryId) Group by platform,category;
作为参考,这是表1和2之间的一个更简单的联接,可以很好地工作:
Select count(distinct t1.Id) as t1_event, count(distinct t2.Id) as t2_events, t1.Origin as platform from testing.table_1 as t1 JOIN testing.table_2 as t2 on t1.CatId = t2.CategoryId Group by platform;
简单的解决方法是category在第一个内部添加字段SELECT-否则最外面不可见SELECT-因此出错!那就是问题!
category
SELECT
另外,在BigQuery旧版SQL中,您可以使用EXACT_COUNT_DISTINCT,否则您将获得统计近似值-在COUNT([DISTINCT])中查看更多
因此,对于旧版SQL,您的查询如下所示:
SELECT EXACT_COUNT_DISTINCT(joined.t1.Id) AS t1_events, EXACT_COUNT_DISTINCT(t2.Id) AS t2_events, joined.t1.Origin AS platform, joined.t3.category AS category FROM ( SELECT Id, Origin, CatId, category FROM [testing.table_1] AS t1 JOIN (SELECT category, CategoryID FROM [testing.table_3]) AS t3 ON t1.CatId = t3.CategoryID ) AS joined JOIN (SELECT Id, CategoryId FROM [testing.table_2]) AS t2 ON joined.t1.CatId = t2.CategoryId GROUP BY platform, category
而且,我觉得您可以进一步简化它(假设不会有模棱两可的字段)
SELECT EXACT_COUNT_DISTINCT(joined.t1.Id) AS t1_events, EXACT_COUNT_DISTINCT(t2.Id) AS t2_events, joined.t1.Origin AS platform, joined.t3.category AS category FROM ( SELECT Id, Origin, CatId, category FROM [testing.table_1] AS t1 JOIN [testing.table_3] AS t3 ON t1.CatId = t3.CategoryID ) AS joined JOIN [testing.table_2] AS t2 ON joined.t1.CatId = t2.CategoryId GROUP BY platform, category
当然,如果您要使用标准SQL版本,也需要做同样的修复(如Elliott所建议的那样:
SELECT COUNT(DISTINCT joined.t1.Id) AS t1_events, COUNT(DISTINCT t2.Id) AS t2_events, joined.t1.Origin AS platform, joined.t3.category AS category FROM ( SELECT Id, Origin, CatId, category FROM `testing.table_1` AS t1 JOIN `testing.table_3` AS t3 ON t1.CatId = t3.CategoryID ) AS joined JOIN `testing.table_2` AS t2 ON joined.t1.CatId = t2.CategoryId GROUP BY platform, category