Apache Kylin : Analytical Data Warehouse for Big Data

Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents



Background

Kylin will generate cuboid statistics of segments in a cube during the cube building; Besides, when optimizing and merging cubes, statistics will update too.

...

  • If Kyllin4 does not has a base cuboid for a cube, then the base cuboid "1111...111" row will be 0 and the size is 0.0 MB.
  • If a cuboid does not exist, then its children will show the shrink percentage to be "-0.0 %".
  • the command can only check a cube every time.


Example for estimate statistics

Code Block
# bin/kylin.sh org.apache.kylin.engine.mr.common.CubeStatsReader kylin_sales_cube
Retrieving hadoop conf dir...
Retrieving Spark dependency...
...
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
...
============================================================================
Statistics of kylin_sales_cube[20120101000000_20130201000000]

Total cuboids: 160
Total precise rows: 828158
Total precise size(MB): 30.2807559967041
Sampling percentage:  100
Mapper overlap ratio: -1.0
Mapper number: -1
Length of dimension DEFAULT.KYLIN_SALES.BUYER_ID is 4
Length of dimension DEFAULT.KYLIN_SALES.SELLER_ID is 4
Length of dimension DEFAULT.KYLIN_SALES.TRANS_ID is 4
Length of dimension DEFAULT.KYLIN_SALES.PART_DT is 3
...
|---- Cuboid 111111111111111111, est row: 5402, precise MB: 0.22
    |---- Cuboid 000111111111111111, est row: 5402, est MB: 0.22, shrink: 100%
        |---- Cuboid 000101111111111111, est row: 5402, est MB: 0.21, shrink: 100%
            |---- Cuboid 000101101111111111, est row: 5402, est MB: 0.2, shrink: 100%
                |---- Cuboid 000101001111111111, est row: 5402, est MB: 0.19, shrink: 100%


Example for precise statistics

Code Block
# bin/kylin.sh org.apache.kylin.engine.mr.common.CubeStatsReader kylin_sales_cube
Retrieving hadoop conf dir...
Retrieving Spark dependency...
...
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
...
============================================================================
Statistics of kylin_sales_cube[20120101000000_20130201000000]

Total cuboids: 160
Total precise rows: 828158
Total precise size(MB): 30.2807559967041
Sampling percentage:  100
Mapper overlap ratio: -1.0
Mapper number: -1
Length of dimension DEFAULT.KYLIN_SALES.BUYER_ID is 4
Length of dimension DEFAULT.KYLIN_SALES.SELLER_ID is 4
Length of dimension DEFAULT.KYLIN_SALES.TRANS_ID is 4
Length of dimension DEFAULT.KYLIN_SALES.PART_DT is 3
...
|---- Cuboid 111111111111111111, precise row: 0, precise MB: 0
    |---- Cuboid 000111111111111111, precise row: 5402, precise MB: 0.22, shrink: -0%
        |---- Cuboid 000101111111111111, precise row: 5402, precise MB: 0.21, shrink: 100%
            |---- Cuboid 000101101111111111, precise row: 5402, precise MB: 0.2, shrink: 100%
                |---- Cuboid 000101001111111111, precise row: 5402, precise MB: 0.19, shrink: 100%

...