Apache Kylin : Analytical Data Warehouse for Big Data

History

This improvement is available after v3.1.0.

Background

This feature addresses the following use cases:

Case 1 Force a query to hit the right cube

Many OLAP teams choose to build several smaller cubes instead of a single larger cube, to reduce build duration and cube storage. For example, we build three small cubes: the first with three dimensions "ABC", the second with three dimensions "ADE", and the third with five dimensions "ADHGF", rather than one bigger cube with eight dimensions "ABCDEHGF". Because the cuboid "ABCDEHGF" is removed, this design should in theory reduce total storage considerably (although the actual saving depends on the specific use scenario).
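To make the storage argument concrete, here is a rough sketch that counts the maximum number of cuboids for each design. It assumes the usual upper bound of one cuboid per subset of dimensions (2^N for N dimensions) and ignores any pruning from aggregation groups or similar settings. It counts cuboids only and says nothing about their actual size on disk. The function name max_cuboids is made up for this illustration.

```python
def max_cuboids(dimensions: str) -> int:
    """Upper bound on cuboids for one cube: one per subset of its dimensions."""
    return 2 ** len(set(dimensions))


# Dimension sets taken from the example in the text.
small_cubes = ["ABC", "ADE", "ADHGF"]
big_cube = "ABCDEHGF"

small_total = sum(max_cuboids(c) for c in small_cubes)  # 8 + 8 + 32
big_total = max_cuboids(big_cube)                       # 2 ** 8

print(small_total, big_total)  # 48 256
```

Under these assumptions the three small cubes need at most 48 cuboids, compared with 256 for the single eight-dimension cube.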

However, this design raises a new problem. UserA creates and builds Cube1 (with three dimensions A, B, C) from 2020-01-07 to 2020-01-10, and UserB creates and builds Cube2 (with five dimensions A, D, H, G, F) from 2020-01-05 to 2020-01-10.

When UserB runs the query "select A, count(*) from db.table group by A;", it automatically hits Cube1 because Cube1 has fewer dimensions/measures, although UserB wants it to hit Cube2. As a result, the results from 2020-01-05 to 2020-01-07 are missing. To fix this problem, we have to force Cube2 to answer the query.
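The effect can be reproduced with a toy model. This is not Kylin's actual routing code: it simply assumes that, among the cubes that contain all queried dimensions, the one with the fewest dimensions wins, and it treats the build end dates as inclusive. The names cubes, pick_cube and covered_days are hypothetical.

```python
from datetime import date, timedelta

# Toy model only: dimension sets and built date ranges from the example.
# End dates are treated as inclusive here, which is an assumption.
cubes = {
    "Cube1": {"dims": set("ABC"), "start": date(2020, 1, 7), "end": date(2020, 1, 10)},
    "Cube2": {"dims": set("ADHGF"), "start": date(2020, 1, 5), "end": date(2020, 1, 10)},
}


def pick_cube(query_dims, forced=None):
    """Return the forced cube if given, else the matching cube with the fewest dimensions."""
    if forced is not None:
        return forced
    candidates = [name for name, c in cubes.items() if query_dims <= c["dims"]]
    return min(candidates, key=lambda name: len(cubes[name]["dims"]))


def covered_days(name):
    """Return the set of days for which the named cube has built data."""
    c = cubes[name]
    n_days = (c["end"] - c["start"]).days + 1
    return {c["start"] + timedelta(days=i) for i in range(n_days)}


auto = pick_cube({"A"})                    # no cube specified
forced = pick_cube({"A"}, forced="Cube2")  # cube specified by the user
missing = sorted(covered_days(forced) - covered_days(auto))

print(auto, [d.isoformat() for d in missing])
# Cube1 ['2020-01-05', '2020-01-06']
```

In this model the automatic choice answers from Cube1 and silently drops the days that only Cube2 has built.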

Case 2 Testing and debugging

We often clone a new cube from an existing one, make some changes (for example, add new configuration), and then build new segments to test a newly added feature. However, once both cubes become READY they conflict with each other, which can lead to wrong online results (and may mislead the QA team).

How to

Web UI

On the Insight page, select the correct cube name in the combo box before clicking the "Submit" button.

image

REST API

Specify the correct cube name in the request body. The following example uses SimpleCube_01. Please refer to http://kylin.apache.org/docs/howto/howto_use_restapi.html#query for the complete documentation.

Request
{  
   "sql":"select part_dt, count(*) from KYLIN_SALES group by part_dt order by part_dt",
   "offset":0,
   "limit":50000,
   "acceptPartial":false,
   "project":"learn_kylin",
   "backdoorToggles":{"DEBUG_TOGGLE_HIT_CUBE":"SimpleCube_01"}
}

The following screenshot shows an example.

image
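As a rough sketch, the same request can be built from a script using only the Python standard library. The host and port (localhost:7070), the /kylin/api/query path, HTTP Basic authentication and the ADMIN/KYLIN credentials are assumptions based on a default Kylin installation; adjust them for your environment. The helper names build_query_payload and build_request are made up for this example.

```python
import base64
import json
import urllib.request


def build_query_payload(sql, project, cube_name, limit=50000):
    """Build the query body, forcing the given cube via backdoorToggles."""
    return {
        "sql": sql,
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,
        "project": project,
        "backdoorToggles": {"DEBUG_TOGGLE_HIT_CUBE": cube_name},
    }


def build_request(base_url, user, password, payload):
    """Wrap the payload in a POST request (endpoint path and Basic auth are assumed)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_query_payload(
    "select part_dt, count(*) from KYLIN_SALES group by part_dt order by part_dt",
    project="learn_kylin",
    cube_name="SimpleCube_01",
)
request = build_request("http://localhost:7070", "ADMIN", "KYLIN", payload)

# To send it against a running Kylin instance:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

The request is only constructed here, not sent, so the sketch can be inspected without a running server.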

JDBC Driver

Starting from Kylin 4.0.2, the cube can be specified by a SQL hint.

Put the name of the cube you want to use in the SQL as follows:

select /*+ MODEL_PRIORITY(cube_name) */ from ....

For example:
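A minimal sketch of adding the hint programmatically is shown here. It assumes the statement starts with the SELECT keyword and reuses the KYLIN_SALES query and the SimpleCube_01 cube name from the REST API section purely for illustration. The helper with_model_priority is hypothetical and not part of Kylin or its JDBC driver; the resulting string is what would be passed to the driver.

```python
import re


def with_model_priority(sql, cube_name):
    """Insert a MODEL_PRIORITY hint right after the leading SELECT keyword."""
    hinted, count = re.subn(
        r"^\s*select\b",
        lambda m: f"{m.group(0)} /*+ MODEL_PRIORITY({cube_name}) */",
        sql,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("expected a statement starting with SELECT")
    return hinted


hinted_sql = with_model_priority(
    "select part_dt, count(*) from KYLIN_SALES group by part_dt",
    "SimpleCube_01",
)
print(hinted_sql)
# select /*+ MODEL_PRIORITY(SimpleCube_01) */ part_dt, count(*) from KYLIN_SALES group by part_dt
```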

When a SQL hint and another method are used to specify the cube at the same time, the cube specified by the SQL hint takes precedence.
