You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 8 Next »

  • Motivation:
    • current way of managing job config via files makes it hard to manage job configurations, persisting it in MySQL would make it easier to manage and also provide various CRUD API interfaces to mange the job configuration.
  • Proposed Change
    • create MySQL table `gobblin_job_config`
    • create REST endpoints for CRUD operations
    • add functionality to detect change in any job config and schedule or trigger job accordingly ( similar to current mechanism based on file change )
    • In future, Gobblin should use JDBI (v3) as ORM instead of plan JDBC based persistence which will become cumbersome as it uses DB heavily for all kinds of metadata storage.
  • New or Changed Public Interfaces:
    • POST /job/{jobName}
    • PUT /job//{jobName}
    • GET /job/{jobName}
    • DELETE /job/{jobName}
  • Migration Plan and Compatibility
    • since this is new functionality, in addition to the current file based job config, user can select either one to manage the job.
  • Rejected Alternatives:
    • use of other REST framework ( preferring to use REST.li instead of dropwizard or any other as its built in framework for other APIs

Design:


  • MySQL DDL: ( V1_0_4__job_store.sql )

gobblin_job_config
CREATE TABLE gobblin_job
(
    id            int(11)      NOT NULL AUTO_INCREMENT,
    name          varchar(255) NOT NULL,
    description   varchar(255)          DEFAULT NULL,
    schedule      varchar(64)           DEFAULT NULL,
    is_disabled   tinyint(1)            DEFAULT '0',
    priority      smallint              DEFAULT 100,
    configs       text,
    owner_email   varchar(255) NOT NULL,
    source_system varchar(255) NOT NULL,
    target_system varchar(255) NOT NULL,
    job_template  VARCHAR(255) NULL,
    created_date  timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_date  timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (name),
    INDEX (name),
    INDEX (id)
) AUTO_INCREMENT = 1
  ENGINE = InnoDB
  DEFAULT CHARSET = utf8;


CREATE TABLE gobblin_job_queue
(
    queue_id     varchar(255) NOT NULL,
    job_name     varchar(255) NOT NULL,
    configs      text,
    status       varchar(255) NOT NULL,
    job_id       varchar(255),
    created_date timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_date timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (queue_id)
) ENGINE = InnoDB
  DEFAULT CHARSET = utf8;


CREATE TABLE `gobblin_job_template`
(
    `name`          varchar(255) NOT NULL DEFAULT '',
    `source_system` varchar(255)          DEFAULT NULL,
    `target_system` varchar(255)          DEFAULT NULL,
    `configs` text NOT NULL,
    PRIMARY KEY (`name`)
) ENGINE = InnoDB
  DEFAULT CHARSET = utf8;

CREATE TABLE gobblin_job_owner
(
    name         varchar(255) NOT NULL,
    email        varchar(255) NOT NULL,
    team_name    int(11)               DEFAULT NULL,
    team_email   int(11)               DEFAULT NULL,
    org_name     int(11)               DEFAULT NULL,
    created_date timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_date timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (email)
) ENGINE = InnoDB
  DEFAULT CHARSET = utf8;



  • JOB MGMT. CRUD REST API


  1. Create New job example

POST body
METHOD: POST 
URL:	http://localhost:9090/job/PullFromWikipedia

BODY:
{
	"jobName": "PullFromWikipedia",
	"jobDescription": "A getting started example for Gobblin",
	"jobDisabled": false,
	"jobProperties": {
		"job.group": "Wikipedia",
		"source.class": "org.apache.gobblin.example.wikipedia.WikipediaSource",
		"test_remove": "test_remove_value",
		"something": "asdsad"
	}
}
RESPONSE: 201 - created


2. update existing job

POST body
METHOD: PUT 
URL:    http://localhost:9090/job/PullFromWikipedia

BODY:
{
    "name": "PullFromWikipedia",
    "description": "A getting started example for Gobblin - pull some pages from wikipedia",
    "disabled": true,
    "schedule": "",
    "priority": 98,
    "configs": {
        "job.group": "Wikipedia",
        "source.class": "org.apache.gobblin.example.wikipedia.WikipediaSource",
        "test_add": "test_remove_value",
        "something": "asdsad"
    },
    "ownerEmail": "jsenjaliya@paypal.com",
    "sourceSystem": "simba",
    "targetSystem": "RADD-LVS",
    "configsToRemove": [
        "test_remove"
    ]
}

RESPONSE: 
200 - OK



3. GET existing job

GET Result
METHOD: GET 
URL: 	http://localhost:9090/job/PullFromWikipedia

BODY: None

RESPONSE: 
{
    "schedule": "run_once",
    "configs": {
        "job.group": "Wikipedia",
        "source.class": "org.apache.gobblin.example.wikipedia.WikipediaSource",
        "something": "asdsad",
        "test_add": "test_remove_value"
    },
    "createdDate": 1587770060000,
    "sourceSystem": "simba",
    "name": "PullFromWikipedia",
    "description": "A getting started example for Gobblin - pull some pages from wikipedia",
    "disabled": true,
    "id": 1,
    "updatedDate": 1587774853000,
    "priority": 98,
    "targetSystem": "RADD-LVS",
    "ownerEmail": "jsenjaliya@paypal.com"
}




  • No labels