I have a table like the following:
| user_id | product_purchased |
-------------------------------
| 111     | A                 |
| 111     | B                 |
| 222     | B                 |
| 222     | B                 |
| 333     | C                 |
| 444     | A                 |
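I want to pivot the table so that each user ID is a row and the count of each product that user purchased is a column. For the table above, the result would look like: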
| user_id | product A | product B | product C |
-----------------------------------------------
| 111     | 1         | 1         | 0         |
| 222     | 0         | 2         | 0         |
| 333     | 0         | 0         | 1         |
| 444     | 1         | 0         | 0         |
I know this can be done manually with COUNTIF statements:
#standardSQL
SELECT
  user_id,
  COUNTIF(product_purchased = 'A') AS A,
  COUNTIF(product_purchased = 'B') AS B
  -- ... and so on, one COUNTIF per product
FROM `project.dataset.your_table`
GROUP BY user_id
In practice, however, the table has too many possible products to write them all out by hand. Is there a more automated and elegant way to do this?
"In practice, the table has too many possible products to write them all out by hand."
The following is for BigQuery Standard SQL.
You can do this in two steps. First, prepare the dynamic pivot query by running the following:
#standardSQL
SELECT CONCAT(
  'SELECT user_id, ',
  STRING_AGG(
    CONCAT('COUNTIF(product_purchased = "', product_purchased, '") AS product_', product_purchased)
  ),
  ' FROM `project.dataset.your_table` GROUP BY user_id'
)
FROM (
  SELECT product_purchased
  FROM `project.dataset.your_table`
  GROUP BY product_purchased
)
The result is a string containing the query you need to run to get the desired output.
For example, applied to the dummy data from your question:
#standardSQL
WITH `project.dataset.your_table` AS (
  SELECT 111 user_id, 'A' product_purchased UNION ALL
  SELECT 111, 'B' UNION ALL
  SELECT 222, 'B' UNION ALL
  SELECT 222, 'B' UNION ALL
  SELECT 333, 'C' UNION ALL
  SELECT 444, 'A'
)
SELECT CONCAT(
  'SELECT user_id, ',
  STRING_AGG(
    CONCAT('COUNTIF(product_purchased = "', product_purchased, '") AS product_', product_purchased)
  ),
  ' FROM `project.dataset.your_table` GROUP BY user_id'
)
FROM (
  SELECT product_purchased
  FROM `project.dataset.your_table`
  GROUP BY product_purchased
)
You will get the following query (formatted here for readability):
SELECT
  user_id,
  COUNTIF(product_purchased = "A") AS product_A,
  COUNTIF(product_purchased = "B") AS product_B,
  COUNTIF(product_purchased = "C") AS product_C
FROM `project.dataset.your_table`
GROUP BY user_id
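Now you can simply run this query to get the desired result, without writing it by hand.
Again, if you run it against the dummy data from your question: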
#standardSQL
WITH `project.dataset.your_table` AS (
  SELECT 111 user_id, 'A' product_purchased UNION ALL
  SELECT 111, 'B' UNION ALL
  SELECT 222, 'B' UNION ALL
  SELECT 222, 'B' UNION ALL
  SELECT 333, 'C' UNION ALL
  SELECT 444, 'A'
)
SELECT
  user_id,
  COUNTIF(product_purchased = "A") AS product_A,
  COUNTIF(product_purchased = "B") AS product_B,
  COUNTIF(product_purchased = "C") AS product_C
FROM `project.dataset.your_table`
GROUP BY user_id
-- ORDER BY user_id
you get the expected result:
Row  user_id  product_A  product_B  product_C
1    111      1          1          0
2    222      0          2          0
3    333      0          0          1
4    444      1          0          0
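As a cross-check on what the pivot computes, here is a minimal Python sketch that reproduces the same counts locally from the dummy rows. The function name `pivot_counts` is made up for illustration; this is not part of the BigQuery solution, only a way to sanity-check the expected numbers.

```python
from collections import Counter

# The dummy rows from the question: (user_id, product_purchased).
rows = [(111, "A"), (111, "B"), (222, "B"), (222, "B"), (333, "C"), (444, "A")]


def pivot_counts(rows):
    """Count purchases per (user_id, product); missing pairs count as 0."""
    counts = Counter(rows)
    users = sorted({user for user, _ in rows})
    products = sorted({product for _, product in rows})
    return {u: {p: counts[(u, p)] for p in products} for u in users}


print(pivot_counts(rows)[222])  # {'A': 0, 'B': 2, 'C': 0}
```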
"Is there a more automated and elegant way to do this?"
You can easily automate the above using any client of your choice.
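As one possible way to automate the two steps from a client, here is a minimal Python sketch that builds the pivot query from a list of distinct product values. The helper names (`quote_literal`, `safe_alias`, `build_pivot_query`) are hypothetical and not part of any BigQuery API. The alias cleanup is an added assumption: the generated `product_...` aliases come straight from the data, so a value containing a space or hyphen would otherwise yield an invalid column name.

```python
import re


def quote_literal(value):
    """Render a value as a double-quoted SQL string literal (escapes \\ and ")."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def safe_alias(value):
    """Hypothetical helper: map a product value to a valid column alias."""
    # Replace anything that is not a letter, digit, or underscore.
    return "product_" + re.sub(r"[^A-Za-z0-9_]", "_", value)


def build_pivot_query(products, table="project.dataset.your_table"):
    """Build the COUNTIF pivot query for the given product values."""
    columns = [
        "COUNTIF(product_purchased = {}) AS {}".format(quote_literal(p), safe_alias(p))
        for p in sorted(set(products))  # de-duplicate and keep a stable order
    ]
    select_list = ",\n  ".join(["user_id"] + columns)
    return "SELECT\n  {}\nFROM `{}`\nGROUP BY user_id".format(select_list, table)


print(build_pivot_query(["A", "B", "B", "C"]))
```

In practice the product list would come from first running `SELECT DISTINCT product_purchased` against the table through your client, and the returned string would then be submitted as the second query (for instance via `Client.query` in the google-cloud-bigquery package). Note that this sketch does not handle two different product values that clean up to the same alias.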