这是我的表格“ tab_test”:
year animal price 2000 kittens 79 2000 kittens 93 2000 kittens 100 2000 puppies 15 2000 puppies 32 2001 kittens 31 2001 kittens 17 2001 puppies 65 2001 puppies 48 2002 kittens 84 2002 kittens 86 2002 puppies 15 2002 puppies 95 2003 kittens 62 2003 kittens 24 2003 puppies 36 2003 puppies 41 2004 kittens 65 2004 kittens 85 2004 puppies 58 2004 puppies 95 2005 kittens 45 2005 kittens 25 2005 puppies 15 2005 puppies 35 2006 kittens 50 2006 kittens 80 2006 puppies 95 2006 puppies 49 2007 kittens 40 2007 kittens 19 2007 puppies 81 2007 puppies 38 2008 kittens 37 2008 kittens 51 2008 puppies 29 2008 puppies 72 2009 kittens 84 2009 kittens 26 2009 puppies 49 2009 puppies 34 2010 kittens 75 2010 kittens 96 2010 puppies 18 2010 puppies 26 2011 kittens 35 2011 kittens 21 2011 puppies 90 2011 puppies 18 2012 kittens 12 2012 kittens 23 2012 puppies 74 2012 puppies 79
这是一些将行和列转置的代码,因此我得到了“小猫”和“小狗”的平均值:
SELECT year, AVG(CASE WHEN animal = 'kittens' THEN price END) AS "kittens", AVG(CASE WHEN animal = 'puppies' THEN price END) AS "puppies" FROM tab_test GROUP BY year ORDER BY year;
上面代码的输出是:
year kittens puppies 2000 90.6666666666667 23.5 2001 24.0 56.5 2002 85.0 55.0 2003 43.0 38.5 2004 75.0 76.5 2005 35.0 25.0 2006 65.0 72.0 2007 29.5 59.5 2008 44.0 50.5 2009 55.0 41.5 2010 85.5 22.0 2011 28.0 54.0 2012 17.5 76.5
我想要的是一个类似于第二个表的表,但是它只包含COUNT()第一个表中a至少为3的项目。换句话说,目标是将 其 作为输出:
COUNT()
year kittens 2000 90.6666666666667
第一个表中至少有3个“小猫”实例。 在PostgreSQL中可能吗?
这是@bluefeet的建议的替代方法,该建议有些相似,但避免了联接(相反,将较高级别的分组应用于已分组的结果集):
SELECT year, MAX(CASE animal WHEN 'kittens' THEN avg_price END) AS "kittens", MAX(CASE animal WHEN 'puppies' THEN avg_price END) AS "puppies" FROM ( SELECT animal, year, COUNT(*) AS cnt, AVG(Price) AS avg_price FROM tab_test GROUP BY animal, year ) s WHERE cnt >= 3 GROUP BY year ;