Apache Kylin : Analytical Data Warehouse for Big Data
1. Why is a job scheduler needed?
While building a segment, Kylin produces many tasks that have to be executed.
To coordinate the execution of these tasks and use resources efficiently, a job scheduling mechanism is needed.
2. Which schedulers are there in Kylin?
In the current Kylin version (v3.1.0) there are three job schedulers, implemented by the classes DefaultScheduler, DistributedScheduler and CuratorScheduler.
For how to configure them, see http://kylin.apache.org/cn/docs/install/kylin_cluster.html.
3. What are the differences between the job schedulers?
DefaultScheduler is the job scheduler that Kylin used originally, and it is still the default job scheduler.
With this scheduler, once a job server holds the lock, no other job server can obtain the lock until that job server's process has finished.
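The exclusive-lock rule described above can be pictured with a small Java sketch. The class and method names (ExclusiveJobLockSketch, tryAcquire, release) and the server ids are hypothetical and are not Kylin's API. The sketch models only the rule that a second job server cannot take the lock until the first one gives it up; it makes no claim about how Kylin stores or releases the lock.

```java
// Hypothetical illustration only: this class is not part of Kylin.
final class ExclusiveJobLockSketch {
    // Id of the job server that currently owns the lock; null means the lock is free.
    private String owner;

    // Succeeds if the lock is free or already held by the same server.
    synchronized boolean tryAcquire(String serverId) {
        if (owner == null) {
            owner = serverId;
            return true;
        }
        return owner.equals(serverId);
    }

    // Models the owning job server process finishing; only the owner can release.
    synchronized boolean release(String serverId) {
        if (serverId.equals(owner)) {
            owner = null;
            return true;
        }
        return false;
    }

    synchronized String currentOwner() {
        return owner;
    }
}
```

In this sketch a second server calling tryAcquire keeps getting false until the first server calls release, which mirrors why only one job server is active at a time under this scheduler.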
DistributedScheduler is a distributed scheduler contributed by Meituan and supported since Kylin version 1.6.1.
Users can also set kylin.cube.schedule.assigned.servers to specify the job execution node of a cube.
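As a rough illustration of pinning a cube's jobs to particular nodes, the Java sketch below checks whether a server may run a cube's job. Only the property name comes from the text above. The class AssignedServersSketch, the comma-separated value format, the host:port style ids, and the rule that an empty list allows any server are all assumptions made for the example, not Kylin's documented behaviour.

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

// Hypothetical illustration only: not Kylin's implementation.
final class AssignedServersSketch {
    // Property name taken from the text above.
    static final String KEY = "kylin.cube.schedule.assigned.servers";

    // Parses the property as a comma-separated server list (assumed value format).
    static Set<String> assignedServers(Properties cubeConfig) {
        Set<String> servers = new HashSet<>();
        String raw = cubeConfig.getProperty(KEY, "");
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                servers.add(trimmed);
            }
        }
        return servers;
    }

    // Assumption: an empty or missing list means any job server may run the cube's jobs.
    static boolean mayRun(String serverId, Properties cubeConfig) {
        Set<String> servers = assignedServers(cubeConfig);
        return servers.isEmpty() || servers.contains(serverId);
    }
}
```

The actual value format and matching rules should be checked against the configuration page linked in section 2.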
CuratorScheduler is a Curator-based scheduler implemented by Kyligence and supported since Kylin v3.0.0-alpha.