Enhanced Aggregation, Cube, Grouping and Rollup

GROUPING SETS clause

The GROUPING SETS clause in GROUP BY allows us to specify more than one GROUP BY options in the same record set. All GROUPING clause query can be logically expressed in terms of several GROUP BY queries connected by UNION. Table-1 shows several such equivalent statements. This is helpful in forming the idea of the GROUPING SETS clause. A blank set ( ) in the GROUPING SETS clause calculates the overall aggregate.

Table 1 - GROUPING SET queries and the equivalent GROUP BY queries

Aggregate Query with GROUPING SETS	Equivalent Aggregate Query with GROUP BY
SELECT a, b, SUM(c) FROM tab1 GROUP BY GROUPING SETS ( (a,b) )	SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b
SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a,b), a)	SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b UNION SELECT a, null, SUM( c ) FROM tab1 GROUP BY a
SELECT a,b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS (a,b)	SELECT a, null, SUM( c ) FROM tab1 GROUP BY a UNION SELECT null, b, SUM( c ) FROM tab1 GROUP BY b
SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), a, b, ( ) )	SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b UNION SELECT a, null, SUM( c ) FROM tab1 GROUP BY a, null UNION SELECT null, b, SUM( c ) FROM tab1 GROUP BY null, b UNION SELECT null, null, SUM( c ) FROM tab1

Grouping__ID function

When aggregates are displayed for a column its value is null. This may conflict in case the column itself has some null values. There needs to be some way to identify NULL in column, which means aggregate and NULL in column, which means value. GROUPING__ID function is the solution to that.

This function returns a bitvector corresponding to whether each column is present or not. For each column, a value of "1" is produced for a row in
the result set if that column has been aggregated in that row, otherwise the value is "0". This can be used to differentiate when there are nulls
in the data.

Consider the following example:

Column1 (key)	Column2 (value)
1	NULL
1	1
2	2
3	3
3	NULL
4	5

The following query: SELECT key, value, GROUPING__ID, count from T1 GROUP BY key, value WITH ROLLUP
will have the following results.


NULL	NULL	0	6
1	NULL	1	2
1	NULL	3	1
1	1	3	1
1	NULL	1	1
2	2	3	1
3	NULL	1	2
3	NULL	3	1
3	3	3	1
4	NULL	1	1
4	5	3	1

Note that the third column is a bitvector of columns being selected.
For the first row, none of the columns are being selected.

Cubes and Rollups

The general syntax is CUBE ( ). It is used with the GROUP BY only. CUBE creates a subtotal of all possible combinations of the set of column in its argument. Once we compute a CUBE on a set of dimension, we can get answer to all possible aggregation questions on those dimensions.

It might be also worth mentioning here that
GROUP BY CUBE( a, b, c) is equivalent to
GROUP BY GROUPING SETS ( (a, b, c), (a, b), (b, c), (a, c), (a), (b), (c), ( )).

ROLLUP clause is used with GROUP BY to compute the aggregate at the hierarchy levels of a dimension.
ROLLUP(a, b, c) assumes that the hierarchy is "a" drilling down to "b" drilling down to "c".

ROLLUP (a, b, c) is equivalent to GROUPING SETS ( (a, b, c), (a, b), (a), ( )).
The general syntax of ROLLUP is ROLLUP( )

Space shortcuts

Child pages

GROUPING SETS clause

Grouping__ID function

Cubes and Rollups