Apache Kylin : Analytical Data Warehouse for Big Data
Due to some limitations, Kylin 4.0 does not support the first step of the second phase of the cube planner, which automatically recommends a cuboid list. However, so that users can adjust a cube flexibly according to their business scenarios, Kylin 4.0 does support the second step of the second phase, which lets users adjust the cuboid list manually. To add or delete cuboids for a specific cube, update the cuboid list by calling the REST API (http://host:port/kylin/api/cubes/). After the cuboid list is updated, Kylin 4.0 generates a corresponding optimize job for each segment in the cube to apply the new cuboid list. Finally, an Optimize Checkpoint Job updates the cube metadata in one pass and cleans up the garbage.
Step 1: Calculate the cuboid id you want to add/delete according to the dimensions
The cuboid id is determined by the order of the dimensions in the rowkey: each rowkey column corresponds to one bit of the id. For example, for the following rowkey columns:
In the base cuboid id, the bit at the position of every rowkey column is 1. The decimal base cuboid id in this example is 262143, which is 2^18 - 1, so the rowkey has 18 columns and the binary value is 111111111111111111.
To build a cuboid of PART_DT and BUYER_ID, set the bits at the positions of PART_DT and BUYER_ID to 1 and all other positions to 0. The binary cuboid id is then 100100000000000000, and the decimal cuboid id is 147456.
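The calculation above can be sketched in Python. This is only an illustration: the helper name cuboid_id is hypothetical, the column names other than PART_DT and BUYER_ID are placeholders, and the 18-column rowkey with PART_DT first and BUYER_ID fourth is an assumption chosen so that the results match the decimal ids quoted above. Substitute the real rowkey order of your cube.

```python
def cuboid_id(rowkey_columns, dimensions):
    """Return the decimal cuboid id for the given dimensions.

    The first rowkey column maps to the most significant bit,
    the last rowkey column to the least significant bit.
    """
    n = len(rowkey_columns)
    cid = 0
    for dim in dimensions:
        # list.index raises ValueError if dim is not a rowkey column
        position = rowkey_columns.index(dim)
        cid |= 1 << (n - 1 - position)
    return cid


# Assumed rowkey order (placeholders except PART_DT and BUYER_ID).
ROWKEY_COLUMNS = ["PART_DT", "COL_02", "COL_03", "BUYER_ID"] + [
    f"COL_{i:02d}" for i in range(5, 19)
]

base_id = cuboid_id(ROWKEY_COLUMNS, ROWKEY_COLUMNS)
part_dt_buyer_id = cuboid_id(ROWKEY_COLUMNS, ["PART_DT", "BUYER_ID"])

# Print each id as an 18-bit binary string followed by its decimal value.
print(format(base_id, "018b"), base_id)
print(format(part_dt_buyer_id, "018b"), part_dt_buyer_id)
```

Under these assumptions the base cuboid id evaluates to 262143 and the PART_DT/BUYER_ID cuboid id to 147456.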
Step 2: Call the REST API and pass in the cuboid ids you want to add/delete
REST API: PUT http://host:port/kylin/api/cubes/{CubeName}/optimize2
Request Body:
{
  "cuboidsAdd": ["cuboidId1", "cuboidId2"],
  "cuboidsDelete": ["cuboidId3", "cuboidId4"]
}
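As a rough sketch, the request above could be built with the Python standard library as follows. The helper name build_optimize_request, the host localhost:7070, the cube name my_cube, and the ADMIN/KYLIN credentials are placeholders, and HTTP Basic authentication is an assumption about how the Kylin instance is secured. The cuboid ids are sent as strings, as in the request body shown above.

```python
import base64
import json
import urllib.request


def build_optimize_request(base_url, cube_name, cuboids_add, cuboids_delete,
                           user, password):
    """Build (but do not send) the PUT request for the optimize2 endpoint."""
    url = f"{base_url}/kylin/api/cubes/{cube_name}/optimize2"
    body = json.dumps({
        "cuboidsAdd": [str(c) for c in cuboids_add],
        "cuboidsDelete": [str(c) for c in cuboids_delete],
    }).encode("utf-8")
    # Assumption: the server accepts HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url, data=body, method="PUT")
    req.add_header("Content-Type", "application/json")
    req.add_header("Authorization", f"Basic {token}")
    return req


# Placeholder values; replace them with those of your own deployment.
request = build_optimize_request(
    "http://localhost:7070", "my_cube",
    cuboids_add=[147456], cuboids_delete=[],
    user="ADMIN", password="KYLIN",
)

# To actually submit the request against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything is submitted to the server.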
After the cuboid ids to be added/deleted are passed in through the REST API, Kylin 4.0 generates the corresponding recommended cuboid list, the optimize segment jobs, and an optimize checkpoint job, as shown in the following figure:
Because the cube in the example has two built segments, two optimize cube jobs and one optimize checkpoint job are generated.
OPTIMIZE CUBE JOB:
OPTIMIZE CHECKPOINT JOB:
After the optimize checkpoint job is completed, the cuboid list update is completed.