我有一个具有以下结构的行表,name TEXT, favorite_colors TEXT[], group_name INTEGER其中每一行都有每个人喜欢的颜色和该人所属组的列表。我如何GROUP BY group_name返回每个组中最常见的颜色的列表?
name TEXT, favorite_colors TEXT[], group_name INTEGER
GROUP BY group_name
您可以组合int[] && int[]设置重叠,int[] & int[]获取交点然后进行其他计数和排名吗?
int[] && int[]
int[] & int[]
快速而肮脏:
SELECT group_name, color, count(*) AS ct FROM ( SELECT group_name, unnest(favorite_colors) AS color FROM tbl ) sub GROUP BY 1,2 ORDER BY 1,3 DESC;
JOIN`](http://www.postgresql.org/docs/current/interactive/sql-select.html)
在Postgres 9.3或更高版本中, 这是更简洁的形式:
SELECT group_name, color, count(*) AS ct FROM tbl t, unnest(t.favorite_colors) AS color GROUP BY 1,2 ORDER BY 1,3 DESC;
以上是的简写
... FROM tbl t JOIN LATERAL unnest(t.favorite_colors) AS color ON TRUE ...
像其他查询一样INNER JOIN,它会排除没有颜色(favorite_colors IS NULL)的行-与第一个查询一样。
INNER JOIN
favorite_colors IS NULL
要在结果中 包括 此类行,请改用:
SELECT group_name, color, count(*) AS ct FROM tbl t LEFT JOIN LATERAL unnest(t.favorite_colors) AS color ON TRUE GROUP BY 1,2 ORDER BY 1,3 DESC;
您可以在下一步中轻松汇总每个组的“最常用颜色”,但是您需要首先定义“最常用颜色” …
根据评论,选择> 3次出现的颜色。
SELECT t.group_name, color, count(*) AS ct FROM tbl t, unnest(t.favorite_colors) AS color GROUP BY 1,2 HAVING count(*) > 3 ORDER BY 1,3 DESC;
汇总数组中的顶部颜色(降序排列):
SELECT group_name, array_agg(color) AS top_colors FROM ( SELECT group_name, color FROM tbl t, unnest(t.favorite_colors) AS color GROUP BY 1,2 HAVING count(*) > 3 ORDER BY 1, count(*) DESC ) sub GROUP BY 1;
- > SQLfiddle演示所有内容。