Suppose I have the following table definition:
CREATE TABLE x (i serial primary key, value integer not null);
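I want to calculate the median of value (not the AVG). The median is a value that splits a set into two subsets containing the same number of elements. If the number of elements is even, the median is the average of the largest value in the lower subset and the smallest value in the upper subset. For example, the median of {1, 3, 5, 8} is (3 + 5) / 2 = 4. (See Wikipedia for more details.)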
Here is how I managed to calculate the median, but I suspect there must be a better way:
SELECT AVG(values_around_median) AS median
FROM (
  SELECT DISTINCT(CASE WHEN FIRST_VALUE(above) OVER w2 THEN MIN(value) OVER w3 ELSE MAX(value) OVER w2 END) AS values_around_median
  FROM (
    SELECT LAST_VALUE(value) OVER w AS value,
           SUM(COUNT(*)) OVER w > (SELECT count(*)/2 FROM x) AS above
    FROM x
    GROUP BY value
    WINDOW w AS (ORDER BY value)
    ORDER BY value
  ) AS find_if_values_are_above_or_below_median
  WINDOW w2 AS (PARTITION BY above ORDER BY value DESC),
         w3 AS (PARTITION BY above ORDER BY value ASC)
) AS find_values_around_median
Any ideas?
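Indeed, there is an easier way. In Postgres you can define your own aggregate functions. A while ago I posted functions for median, mode, and range to the PostgreSQL snippets library.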
http://wiki.postgresql.org/wiki/Aggregate_Median
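To give an idea of what a user-defined median aggregate can look like, here is a minimal sketch. It is an illustration only and not necessarily the code on the page linked above; the names median_final and median are made up for this example. It assumes numeric input (integer columns such as value are cast implicitly), ignores NULLs, and collects all values into an array before sorting, so it may be slow on very large groups.

```sql
-- Final function: receives every collected value and returns the median.
-- (Hypothetical helper name; not taken from the linked wiki page.)
CREATE OR REPLACE FUNCTION median_final(numeric[])
RETURNS numeric AS $$
    SELECT AVG(v)
    FROM (
        SELECT v,
               ROW_NUMBER() OVER (ORDER BY v) AS rn,  -- 1-based rank in sorted order
               COUNT(*)     OVER ()           AS n    -- number of non-NULL values
        FROM unnest($1) AS t(v)
        WHERE v IS NOT NULL
    ) AS ranked
    -- Odd n: both expressions name the single middle row.
    -- Even n: they name the two rows around the middle, which AVG combines.
    WHERE rn IN ((n + 1) / 2, (n + 2) / 2);
$$ LANGUAGE sql IMMUTABLE;

-- Aggregate: append each input value to an array, then apply the final function.
CREATE AGGREGATE median(numeric) (
    SFUNC     = array_append,
    STYPE     = numeric[],
    FINALFUNC = median_final,
    INITCOND  = '{}'
);

-- Usage against the table from the question:
SELECT median(value) FROM x;
```

If you are on PostgreSQL 9.4 or later, the built-in ordered-set aggregate should give the same interpolated median without any custom function: SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY value) FROM x;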