我是Athena的新手,我试图了解如何将多列从长格式转换为宽格式。似乎presto需要什么,但我只能成功地将其应用于map_agg一个变量。我认为我的低于最终的结果可以实现,multimap_agg但不能完全起作用。
presto
map_agg
multimap_agg
下面我将逐步介绍我的步骤和数据。如果您有任何建议或疑问,请告诉我!
首先,数据开始如下:
id | letter | number | value ------------------------------------ 123 | a | 1 | 62 123 | a | 2 | 38 123 | a | 3 | 44 123 | b | 1 | 74 123 | b | 2 | 91 123 | b | 3 | 97 123 | c | 1 | 38 123 | c | 2 | 98 123 | c | 3 | 22 456 | a | 1 | 99 456 | a | 2 | 33 456 | a | 3 | 81 456 | b | 1 | 34 456 | b | 2 | 79 456 | b | 3 | 43 456 | c | 1 | 86 456 | c | 2 | 60 456 | c | 3 | 59
然后,我使用该where子句进行过滤,然后将数据转换为以下数据joining:
where
joining
id | letter | 1 | 2 | 3 ---------------------------- 123 | a | 62 | 38 | 44 123 | b | 74 | 91 | 97 123 | c | 38 | 98 | 22 456 | a | 99 | 33 | 81 456 | b | 34 | 79 | 43 456 | c | 86 | 60 | 59
为了获得最终结果,我想将其转换为以下内容:
id | a_1 | a_2 | a_3 | b_1 | b_2 | b_3 | c_1 | c_2 | c_3 -------------------------------------------------------------------------- 123 | 62 | 38 | 44 | 74 | 91 | 97 | 38 | 98 | 22 456 | 99 | 33 | 81 | 34 | 79 | 43 | 86 | 60 | 59
您可以使用窗口函数和条件聚合。这要求您事先知道可能的字母,以及每个id /字母元组的最大行数:
select id, max(case when letter = 'a' and rn = 1 then value end) a_1, max(case when letter = 'a' and rn = 2 then value end) a_2, max(case when letter = 'a' and rn = 3 then value end) a_3, max(case when letter = 'b' and rn = 1 then value end) b_1, max(case when letter = 'b' and rn = 2 then value end) b_2, max(case when letter = 'b' and rn = 3 then value end) b_3, max(case when letter = 'c' and rn = 1 then value end) c_1, max(case when letter = 'c' and rn = 2 then value end) c_2, max(case when letter = 'c' and rn = 3 then value end) c_3 from ( select t.*, row_number() over(partition by id, letter order by number) rn from mytable t ) t group by id
实际上,如果numbers为始终1,2,3,那么你甚至不需要窗口函数:
number
1
2
3
select id, max(case when letter = 'a' and number = 1 then value end) a_1, max(case when letter = 'a' and number = 2 then value end) a_2, max(case when letter = 'a' and number = 3 then value end) a_3, max(case when letter = 'b' and number = 1 then value end) b_1, max(case when letter = 'b' and number = 2 then value end) b_2, max(case when letter = 'b' and number = 3 then value end) b_3, max(case when letter = 'c' and number = 1 then value end) c_1, max(case when letter = 'c' and number = 2 then value end) c_2, max(case when letter = 'c' and number = 3 then value end) c_3 from mytable t group by id