聚合函数
描述
聚合函数在行之间操作值以执行数学计算,如求和、平均值、计数、最小/最大值、标准偏差和估计,以及一些非数学运算。
语法
aggregate_function(input1 [, input2, ...]) FILTER (WHERE boolean_expression)
参数
-
aggregate_function
有关Spark聚合函数的完整列表,请参阅 内置聚合函数 文档。
-
boolean_expression
指定任何计算结果为布尔类型的表达式。可以使用逻辑运算符(AND,OR)将两个或多个表达式组合在一起。
示例
请参阅 内置聚合函数 文档以获取所有Spark聚合函数的示例。
有序集合聚合函数
这些聚合函数使用与其他聚合函数不同的语法,以便指定一个表达式(通常是列名)来对值进行排序。
语法
{ PERCENTILE_CONT | PERCENTILE_DISC }(percentile) 在 组 (订单 通过 { order_by_expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ] }) 过滤器 (如果 boolean_expression)
参数
-
percentile
您想要查找的值的百分位数。百分位数必须是介于 0.0 和 1.0 之间的常数。
-
order_by_expression
在聚合它们之前,用于对值进行排序的表达式(通常是列名)。
-
boolean_expression
指定任何计算结果为布尔类型的表达式。两个或多个表达式可以使用逻辑运算符( AND, OR )组合在一起。
示例
CREATE OR REPLACE TEMPORARY VIEW basic_pays AS SELECT * FROM VALUES
('Diane Murphy','Accounting',8435),
('Mary Patterson','Accounting',9998),
('Jeff Firrelli','Accounting',8992),
('William Patterson','Accounting',8870),
('Gerard Bondur','Accounting',11472),
('Anthony Bow','Accounting',6627),
('Leslie Jennings','IT',8113),
(,'IT',5186),
('Julie Firrelli','Sales',9181),
('Steve Patterson','Sales',9441),
('Foon Yue Tseng','Sales',6660),
('George Vanauf','Sales',10563),
('Loui Bondur','SCM',10449),
('Gerard Hernandez','SCM',6949),
('Pamela Castillo','SCM',11303),
('Larry Bott','SCM',11798),
('Barry Jones','SCM',10586)
AS basic_pays(员工姓名, 部门, 薪资);
SELECT * FROM basic_pays;
+-----------------+----------+------+
| 员工姓名|部门|薪资|
+-----------------+----------+------+
| Anthony Bow|Accounting| 6627|
| Barry Jones| SCM| 10586|
| Diane Murphy|Accounting| 8435|
| Foon Yue Tseng| Sales| 6660|
| George Vanauf| Sales| 10563|
| Gerard Bondur|Accounting| 11472|
| Gerard Hernandez| SCM| 6949|
| Jeff Firrelli|Accounting| 8992|
| Julie Firrelli| Sales| 9181|
| Larry Bott| SCM| 11798|
| Leslie Jennings| IT| 8113|
| Leslie Thompson| IT| 5186|
| Loui Bondur| SCM| 10449|
| Mary Patterson|Accounting| 9998|
| Pamela Castillo| SCM| 11303|
| Steve Patterson| Sales| 9441|
|William Patterson|Accounting| 8870|
+-----------------+----------+------+
SELECT
部门,
percentile_cont(0.25) WITHIN GROUP (ORDER BY 薪资) AS pc1,
percentile_cont(0.25) WITHIN GROUP (ORDER BY 薪资) FILTER (WHERE 员工姓名 LIKE '%Bo%') AS pc2,
percentile_cont(0.25) WITHIN GROUP (ORDER BY 薪资 DESC) AS pc3,
percentile_cont(0.25) WITHIN GROUP (ORDER BY 薪资 DESC) FILTER (WHERE 员工姓名 LIKE '%Bo%') AS pc4,
percentile_disc(0.25) WITHIN GROUP (ORDER BY 薪资) AS pd1,
percentile_disc(0.25) WITHIN GROUP (ORDER BY 薪资) FILTER (WHERE 员工姓名 LIKE ) AS pd2,
percentile_disc(0.25) WITHIN GROUP (ORDER BY 薪资 DESC) AS pd3,
percentile_disc(0.25) WITHIN GROUP (ORDER BY 薪资 DESC) FILTER (WHERE 员工姓名 LIKE ) AS pd4
FROM basic_pays
GROUP BY 部门
ORDER BY 部门;
+----------+-------+--------+-------+--------+-----+-----+-----+-----+
|部门| pc1| pc2| pc3| pc4| pd1| pd2| pd3| pd4|
+----------+-------+--------+-------+--------+-----+-----+-----+-----+
|Accounting|8543.75| 7838.25| 9746.5|10260.75| 8435| 6627| 9998|11472|
| IT|5917.75| NULL|7381.25| NULL| 5186| NULL| 8113| NULL|
| Sales|8550.75| NULL| 9721.5| NULL| 6660| NULL|10563| NULL|
| SCM|10449.0|10786.25|11303.0|11460.75|10449|10449|11303|11798|
+----------+-------+--------+-------+--------+-----+-----+-----+-----+