功能概述

GROUP BY 子句是 SQL 查询中用于将结果集的行根据一组指定的分组表达式进行分组的关键组件。它允许对每个分组应用聚合函数,如 SUM()COUNT()MAX() 等,以计算分组的聚合值。Lakehouse SQL 支持高级聚合功能,包括 GROUPING SETSCUBEROLLUP 子句,这些功能使得对同一输入记录集执行多次聚合成为可能,满足复杂的数据分析需求。

语法结构

sql
SELECT ...
FROM ...  
 GROUP BY 
[ ALL |
  GROUPING SETS (group_expression) |
  CUBE(group_expression) |
  ROLLUP(group_expression)
];

参数说明

  • ALL:Lakehouse持的简写表示法,将所有不包含聚合函数的 SELECT 列表表达式作为 group_expression 添加。如果不存在此类表达式,则 GROUP BY ALL 等同于省略导致全局聚合的 GROUP BY 子句。
  • group_expression:指定行分组的条件。基于分组表达式的结果值执行行的分组。分组表达式可以是列名称、列位置或表达式。
  • grouping_set:分组集由括号中的零个或多个逗号分隔的表达式指定。如果分组集只有一个元素,则可以省略括号。
  • GROUPING SETS:对 GROUPING SETS 之后指定的每个分组集的行进行分组。此子句是 UNION ALL 的简写形式,其中 UNION ALL 运算符的每个段执行 GROUPING SETS 子句中指定的每个分组集的聚合。
  • ROLLUP:在一个语句中指定多个级别的聚合。此子句用于基于多个分组集计算聚合,是 GROUPING SETS 的速记。
  • CUBE:根据 GROUP BY 子句中指定的分组列的组合执行聚合。CUBE 是 GROUPING SETS 的速记。

使用示例

案例一 ALLGROUP BY 子句的一个选项,它将所有非聚合的 SELECT 列表表达式作为分组表达式添加。如果 SELECT 列表中没有非聚合表达式,则 GROUP BY ALL 等同于省略 GROUP BY 子句,导致全局聚合。

示例

CREATE  VIEW dealer (id, city, car_model, quantity) AS
VALUES (100, 'Fremont', 'Honda Civic', 10),
       (100, 'Fremont', 'Honda Accord', 15),
       (100, 'Fremont', 'Honda CRV', 7),
       (200, 'Dublin', 'Honda Civic', 20),
       (200, 'Dublin', 'Honda Accord', 10),
       (200, 'Dublin', 'Honda CRV', 3),
       (300, 'San Jose', 'Honda Civic', 5),
       (300, 'San Jose', 'Honda Accord', 8);
SELECT  sum(quantity) FROM dealer GROUP BY ALL;
+-----------------+
| `sum`(quantity) |
+-----------------+
| 78              |
+-----------------+
SELECT id, sum(quantity) FROM dealer GROUP BY ALL;
+-----+-----------------+
| id  | `sum`(quantity) |
+-----+-----------------+
| 200 | 33              |
| 100 | 32              |
| 300 | 13              |
+-----+-----------------+
SELECT id, city,sum(quantity) FROM dealer GROUP BY ALL;
+-----+----------+-----------------+
| id  |   city   | `sum`(quantity) |
+-----+----------+-----------------+
| 100 | Fremont  | 32              |
| 200 | Dublin   | 33              |
| 300 | San Jose | 13              |
+-----+----------+-----------------+

案例二 grouping_setGROUP BY 子句的一部分,用于指定分组集合。它可以包含单个表达式或多个表达式,用逗号分隔。

示例

-- 根据城市和车型进行分组,并计算每个分组的总量
SELECT city, car_model, sum(quantity) AS sum
    FROM dealer
    GROUP BY GROUPING SETS ((city, car_model), (city), (car_model), ())
    ORDER BY city;
+----------+--------------+-----+
|   city   |  car_model   | sum |
+----------+--------------+-----+
| null     | Honda Accord | 33  |
| null     | null         | 78  |
| null     | Honda CRV    | 10  |
| null     | Honda Civic  | 35  |
| Dublin   | Honda Accord | 10  |
| Dublin   | Honda CRV    | 3   |
| Dublin   | null         | 33  |
| Dublin   | Honda Civic  | 20  |
| Fremont  | Honda CRV    | 7   |
| Fremont  | Honda Civic  | 10  |
| Fremont  | Honda Accord | 15  |
| Fremont  | null         | 32  |
| San Jose | Honda Accord | 8   |
| San Jose | null         | 13  |
| San Jose | Honda Civic  | 5   |
+----------+--------------+-----+

案例三 ROLLUP 子句用于在单个查询中指定多个级别的聚合,它是 GROUPING SETS 的简写形式。例如,ROLLUP(city, car_model) 相当于 GROUPING SETS((city, car_model), (city), ())

示例

-- 计算每个城市每种车型的总量,以及每个城市的总量和总体总量
SELECT city, car_model, sum(quantity) AS sum FROM dealer
GROUP BY ROLLUP(city, car_model)  ;
+----------+--------------+-----+
|   city   |  car_model   | sum |
+----------+--------------+-----+
| Fremont  | Honda Accord | 15  |
| San Jose | Honda Civic  | 5   |
| Fremont  | null         | 32  |
| Dublin   | null         | 33  |
| San Jose | Honda Accord | 8   |
| San Jose | null         | 13  |
| Fremont  | Honda Civic  | 10  |
| null     | null         | 78  |
| Fremont  | Honda CRV    | 7   |
| Dublin   | Honda Civic  | 20  |
| Dublin   | Honda Accord | 10  |
| Dublin   | Honda CRV    | 3   |
+----------+--------------+-----+

案例四 CUBE 子句用于根据 GROUP BY 子句中指定的分组列的所有可能组合执行聚合,它是 GROUPING SETS 的速记形式。例如,CUBE(city, car_model) 相当于 GROUPING SETS((city, car_model), (city), (car_model), ())

示例

-- 计算每个城市每种车型的总量,每个城市的总量,每种车型的总量,以及总体总量
SELECT city, car_model, sum(quantity) AS sum FROM dealer
GROUP BY city, car_model WITH CUBE;
+----------+--------------+-----+
|   city   |  car_model   | sum |
+----------+--------------+-----+
| Fremont  | null         | 32  |
| null     | Honda CRV    | 10  |
| Dublin   | null         | 33  |
| San Jose | null         | 13  |
| Dublin   | Honda Civic  | 20  |
| San Jose | Honda Civic  | 5   |
| null     | Honda Civic  | 35  |
| Dublin   | Honda Accord | 10  |
| San Jose | Honda Accord | 8   |
| null     | Honda Accord | 33  |
| null     | null         | 78  |
| Fremont  | Honda Accord | 15  |
| Fremont  | Honda CRV    | 7   |
| Fremont  | Honda Civic  | 10  |
| Dublin   | Honda CRV    | 3   |
+----------+--------------+-----+

联系我们
预约咨询
微信咨询
电话咨询