Components and Architecture

  1. State Machine

    State machine tells what state should the job or instance go to on a certain action. // TODO - Add details of all actions and state transitions.
  2. Notification Services

    For launching an instance scheduler needs to check if all the prerequisites for launching a new instance have been met. For example:
    1.  Is it time to launch a new instance, according to the schedule of the job?
    2. Is data available in the input locations?
    For the first version, scheduler will have only time based criteria to schedule the jobs. This is supported through AlarmService.
  3. State Store

    This component is responsible to maintain the latest state of all the jobs and their instances. For the purpose of unit testing we use an in-memory database but for production deployments we will support persistent relational databases like mysql etc. For the purpose of state management, two tables - scheduler_jobs and scheduler_job_instances are maintained. //TODO list the schema of both the tables.
  4. Core Scheduler Engine 

    This is the core of the scheduler engine. It orchestrates the other components and services. It listens to the notification services and launches new instances on the basis of that. It listens to various commands like submit/pause/delete etc., finds transitions using the State Machine, subscribes for various notification services and persists state in the state store.


There are two ways to interact with the scheduler.


    There is a REST API to interact with the scheduler. //TODO list various REST API and their responses.
  2. CLI

    Various commands to interact with the Scheduler will be exposed on the CLI as well. CLI will interact with the Scheduler through REST API. //TODO list all CLI commands

Scheduler Job Schema

Scheduler Job Instance Schema

REST API Signatures

URLDescriptionHTTP MethodAccepted Media TypesResponse TypeSample RequestSample Success ResponseSample Error ResponseComments



State Store Operations

State Store persists all the jobs and instances along with their current state in a persistent store. At the heart of the State Store is SchedulerDAO which contains all the queries to interact with the database. The two tables can be named as - scheduler_jobs and scheduler_job_instances.



StateStore should create tables if they don't already exist as part of the initialisation process.

  • createJobTableIfNotExists()
  • createJobInstanceTableIfNotExists();


Following operations need to be supported by the SchedulerDAO

  • /**
    * DAO method to insert a new job into scheduler_jobs table.
    * @param job to be inserted
    * @throws SQLException the exception
     public UUID createJob(String user, XJob job) {
    // insert a row and return the id
     UUID handleID = UUID.randomUUID();
    return handleID;

    public XJob getJob(UUID externalID) {
    return null;

    public List<SchedulerJobStats> getAllJobStats(String userName, String state, long startTime, long endTime) {
    return null;

    public SchedulerJobStats getJobStats(SchedulerJobHandle handle, String state, long startTime, long endTime) {
    return null;

    public List<SchedulerJobInstanceInfo> getJobInstances(SchedulerJobHandle jobHandle, Long numResults) {
    return null;

    public void updateJob(UUID externalID, XJob newJobDefinition) {


    public void killInstance(UUID externalID) {

    // mark the instance status as killed.


    public void deleteJob(UUID externalID) {


    public void expireJob(SchedulerJobHandle jobHandle) {


    public boolean rerunInstance(SchedulerJobInstanceHandle instanceHandle) {
    return false;




  • No labels


  1. I think following make sense for the SchedulerDAO.
    First line is the rest API. Second line is the schedulerDAO method. 

    Comments ?

    - POST : Create new job and return a jobHandle
    -> int storeJob(scheduleJobInfo) -> return number of jobs stored. 
    - GET : List all jobs, Queryparams will take filters
    -> List<ScheduleJobHandle> getAllJobs(query params)?
    - DELETE : Delete jobs.
    -> int deleteJob(query params); -> Return number of job marked to be deleted. Query param filter could be user, startdate, enddate etc.

    - GET : Get the definition
    -> ScheduledJobInfo getJob(jobHandle) -> Return the jobinfo oject. 
    - DELETE : Delete the scheduled job
    -> int deleteJob(jobHandle) -> return number of jobs marked as delete. 
    - PUT : Update schedule definition.
    -> int updateJob(jobHandle, scheduledJobInfo); -> updated scheduledJobinfo. Returns the number of jobs updated. 

    - GET : Get the details of job.
    -> getJobDetails(jobHandle) -> What is a job detail ? (Is it stats? Can we define stats?)

    - GET : Get schedule state (When created will in state NEW) : Possible values NEW, SCHEDULED, SUSPENDED, EXPIRED
    -> jobstate getJobState(handle);
    - PUT : query param as Schedule/Suspend/Resume : Calling schedule/suspend/resume again and again will be a no-op.
    -> int updateJobState(handle, newState);
    - DELETE : Expire the schedule
    -> int updateJobState(handle, EXPIRE);

    - GET : List all instances
    -> List<ScheduledJobInstaceHandle> getJobInstance(jobHandle);

    - GET Get details of an instance
    -> ScheduleInstanceInfo getJobInstanceDetails(instanceHandle);
    - POST Rerun an instance
    -> int updateJobInstanceState(instanceHandle, newstate);
    - DELETE cancel an instance
    -> int updateJobInstanceState(instanceHandle, CANCEL);


  2. Shouldn't the status be an enum rather than String In SchedulerJobInfo?

    I am making something like this in server.scheduler.api
    And changing the name to state from status.  

    public enum SchedulerJobState {
    public enum SchedulerJobInstanceState {


    Also in the SchedulerJobInstanceInfo should we put modifiedOn field?