小编典典

BigQuery联接三个表

sql

我正在尝试在BigQuery中加入三个表;表1具有一个事件的记录(即每一行是一个记录),表2具有第二个事件的记录,表3具有类别名称。

我想生成一个最终表,该表具有按类别和设备平台划分的表1和表2的计数。但是,每次运行此命令时,我都会收到一条错误消息,指出
join.t3.category不是join中任何一个表的字段

这是我当前的代码:

Select count(distinct joined.t1.Id) as t1_events, count(distinct t2.Id) as t2_events, joined.t1.Origin as platform, joined.t3.category as category

from

(

SELECT 
        Id,
        Origin,
        CatId

    FROM [testing.table_1] as t1

JOIN (SELECT category,
            CategoryID

FROM [testing.table_3]) as t3

on t1.CatId = t3.CategoryID

) AS joined

JOIN (SELECT Id,
            CategoryId

FROM [testing.table_2]) as t2

ON (joined.t1.CatId = t2.CategoryId)

Group by platform,category;

作为参考,这是表1和2之间的一个更简单的联接,可以很好地工作:

Select count(distinct t1.Id) as t1_event, count(distinct t2.Id) as t2_events, t1.Origin as platform

from testing.table_1 as t1

JOIN testing.table_2 as t2

on t1.CatId = t2.CategoryId

Group by platform;

阅读 190

收藏
2021-04-28

共1个答案

小编典典

简单的解决方法是category在第一个内部添加字段SELECT-否则最外面不可见SELECT-因此出错!那就是问题!

另外,在BigQuery旧版SQL中,您可以使用EXACT_COUNT_DISTINCT,否则您将获得统计近似值-在COUNT([DISTINCT])中查看更多

因此,对于旧版SQL,您的查询如下所示:

SELECT
  EXACT_COUNT_DISTINCT(joined.t1.Id) AS t1_events,
  EXACT_COUNT_DISTINCT(t2.Id) AS t2_events,
  joined.t1.Origin AS platform,
  joined.t3.category AS category
FROM (
  SELECT
    Id, Origin, CatId, category
  FROM [testing.table_1] AS t1
  JOIN (SELECT category, CategoryID FROM [testing.table_3]) AS t3
  ON t1.CatId = t3.CategoryID 
) AS joined
JOIN (SELECT Id, CategoryId FROM [testing.table_2]) AS t2
ON joined.t1.CatId = t2.CategoryId
GROUP BY platform, category

而且,我觉得您可以进一步简化它(假设不会有模棱两可的字段)

SELECT
  EXACT_COUNT_DISTINCT(joined.t1.Id) AS t1_events,
  EXACT_COUNT_DISTINCT(t2.Id) AS t2_events,
  joined.t1.Origin AS platform,
  joined.t3.category AS category
FROM (
  SELECT
    Id, Origin, CatId, category
  FROM [testing.table_1] AS t1
  JOIN [testing.table_3] AS t3
  ON t1.CatId = t3.CategoryID 
) AS joined
JOIN [testing.table_2] AS t2
ON joined.t1.CatId = t2.CategoryId
GROUP BY platform, category

当然,如果您要使用标准SQL版本,也需要做同样的修复(如Elliott所建议的那样:

SELECT
  COUNT(DISTINCT joined.t1.Id) AS t1_events,
  COUNT(DISTINCT t2.Id) AS t2_events,
  joined.t1.Origin AS platform,
  joined.t3.category AS category
FROM (
  SELECT 
    Id, Origin, CatId, category
  FROM `testing.table_1` AS t1
  JOIN `testing.table_3` AS t3
  ON t1.CatId = t3.CategoryID
) AS joined
JOIN `testing.table_2` AS t2
ON joined.t1.CatId = t2.CategoryId
GROUP BY platform, category
2021-04-28