Sqoop 2 Actors

Actor | Actions Performed
Sqoop Admins | Creating LINK objects and LINK Config Inputs
Sqoop Users | Creating and executing Sqoop JOBs
Sqoop Connector developers | Developing Connectors and Connector-related config semantics


Sqoop 2 Entities

Since the 1.99.4 release, several Sqoop entities have been renamed and one new entity, "CONFIGURABLE", has been added. It acts as one of the core entities, representing a Sqoop object that exposes configs. Here is the list of entities:

 

Sqoop Entity | Name before 1.99.4 | Sqoop Java Classes | Relationships to other entities | Description | CRUD operations supported via command shell or REST
CONFIGURABLE | N/A | Configurable.java (abstract class)

Top Level Entity

 

Represents a core entity that exposes config objects and is used in the Sqoop job lifecycle.

A Configurable has an associated version that acts as an identifier for connector config upgrades.

MConfigurableType
/**
 * Represents the sqoop entities that can own configs
 */
public enum MConfigurableType {
  /** Connector as a owner of config keys */
  CONNECTOR,
  /** Driver as a owner of config keys */
  DRIVER;
}

READ ONLY.

 

  • /v1/configurable/driver
  • /v1/configurable/connector/[cid]
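
As a minimal sketch, the two read-only endpoints above could be assembled as follows. The class name `ConfigurableEndpoints`, the base URL, and the assumption that `cid` is a numeric connector id are illustrative and not taken from this page; only the two path patterns come from the list above.

```java
// Sketch only: builds the two read-only configurable paths listed above.
// The class name and the base URL are hypothetical placeholders.
final class ConfigurableEndpoints {
    private final String baseUrl;

    ConfigurableEndpoints(String baseUrl) {
        // Drop a single trailing slash so the paths join cleanly.
        this.baseUrl = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
    }

    // Path for the single DRIVER configurable.
    String driver() {
        return baseUrl + "/v1/configurable/driver";
    }

    // Path for one CONNECTOR configurable; cid is assumed to be numeric.
    String connector(long cid) {
        return baseUrl + "/v1/configurable/connector/" + cid;
    }
}
```

Both paths would be requested with GET only, since the CONFIGURABLE entity is read-only.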
CONNECTOR | Same | MConnector.java
  • HAS 1-n CONFIG objects
  • HAS 1-n LINK objects

Is a type of CONFIGURABLE.

Many connectors can be registered with the Sqoop server.

READ ONLY

Connectors and the config objects they expose are registered with the Sqoop server at run time, when the server starts. They are actual code artifacts packaged as jars, but they are also stored in the Sqoop persistent store (the repository) so that each connector and its config objects can be uniquely identified.

Connector upgrades are also supported across sqoop releases.


DRIVER | FRAMEWORK | MDriver.java
  • HAS 1-n CONFIG objects

Is a type of CONFIGURABLE.

There is only one Driver object representing Sqoop in the system.

READ ONLY.

The Driver is also registered with the Sqoop server at server start time, along with its associated config objects.

It also has an upgrade path similar to that of connectors.

CONFIG | FORM | MConfig.java and @Config annotation

Top Level Entity

The supported config types are defined by MConfigType:

MConfigType
public enum MConfigType {
  /** Unknown config type */
  OTHER,
  @Deprecated
  // NOTE: only exists to support the connector data upgrade path
  CONNECTION,
  /** link config type */ 
  LINK,
  /** Job config type */
  JOB;
}

READ ONLY. Configs are created once during server start-up; update and delete are not allowed via the shell or REST.

Note: Creating, deleting, or editing configs at runtime via the shell or REST is not yet allowed, and probably never will be, since config objects should be declared in code via the @Config annotation. Config and Input objects can, however, be deleted as part of the configurable upgrade code path. Connector developers can therefore delete or update them, but Sqoop users cannot.
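
To illustrate what "declared in code via the @Config annotation" can look like, here is a minimal, self-contained sketch. The `Config` and `Input` annotations below are local stand-ins defined only so the example compiles on its own; they are not the real Sqoop annotation classes, and `ExampleLinkConfig`, `ExampleLinkConfiguration`, and `inputKeys` are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ConfigAnnotationSketch {
    // Stand-in annotations (hypothetical; NOT the real Sqoop classes).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Config {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Input {}

    // Hypothetical config object with two declared inputs.
    static class ExampleLinkConfig {
        @Input public String connectionString;
        @Input public Integer port;
        public String notAnInput; // not annotated, so it is not exposed
    }

    // Hypothetical configuration class that exposes one config.
    static class ExampleLinkConfiguration {
        @Config public ExampleLinkConfig linkConfig = new ExampleLinkConfig();
    }

    // Collects "<config>.<input>" keys by reflecting over the annotations,
    // roughly the kind of discovery a server could do once at start-up.
    static List<String> inputKeys(Class<?> configurationClass) {
        List<String> keys = new ArrayList<>();
        for (Field config : configurationClass.getDeclaredFields()) {
            if (!config.isAnnotationPresent(Config.class)) {
                continue;
            }
            for (Field input : config.getType().getDeclaredFields()) {
                if (input.isAnnotationPresent(Input.class)) {
                    keys.add(config.getName() + "." + input.getName());
                }
            }
        }
        // Reflection does not guarantee field order, so sort for stability.
        Collections.sort(keys);
        return keys;
    }
}
```

Because the set of configs and inputs is fixed by the annotated classes, it can only change when the code changes, which matches the read-only behaviour described above.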

 

INPUT (Keys and Values) | Same

MInput.java (an abstract class) and the @Input annotation

Concrete classes exist for each supported type, such as:

MIntegerInput.java

MStringInput.java

  • Associated with a CONFIG object

Represents the key-value pairs for a given config.

The supported input types are defined by MInputType:

MInputType
public enum MInputType {
  /** Unknown input type */
  OTHER,
  /** String input type */
  STRING,
  /** Map input type */
  MAP,
  /** Integer input type */
  INTEGER,
  /** Boolean input type */
  BOOLEAN,
  /** String based input that can contain only predefined values **/
  ENUM,
  ;
}

READ ONLY for Input keys

Input keys are created when configs are registered. Deletes and updates are not allowed via the shell or REST.

READ/UPDATE (RU) for Input values

Input values can be edited per config object.

See SQOOP-1516 for the REST APIs related to reading and updating config inputs per job/configId.
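
The split between read-only keys and editable values can be pictured with a small sketch. `ConfigInputsSketch` and its methods are hypothetical names for illustration only; they do not mirror the MInput classes or the REST API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: the key set is fixed at construction (as if at registration
// time), while the value stored under each key can be read and updated.
final class ConfigInputsSketch {
    private final Map<String, Object> values = new LinkedHashMap<>();

    ConfigInputsSketch(String... keys) {
        for (String key : keys) {
            values.put(key, null); // every key starts with no value
        }
    }

    // READ: returns the current value for a registered key.
    Object read(String key) {
        requireKnown(key);
        return values.get(key);
    }

    // UPDATE: replaces the value of a registered key; never adds a key.
    void update(String key, Object value) {
        requireKnown(key);
        values.put(key, value);
    }

    private void requireKnown(String key) {
        if (!values.containsKey(key)) {
            throw new IllegalArgumentException("Unknown input key: " + key);
        }
    }
}
```

There is deliberately no method for adding or removing keys, which is the "READ ONLY for Input keys" rule expressed in code.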

 

 

 





LINK | CONNECTION

MLink.java

MLinkConfig.java
  • Associated with a CONNECTOR
  • HAS a CONFIG-INPUT object

 

Represents the config inputs required to physically connect to the data source that a connector represents. Hence it is associated with a connector.

It has one main config object, represented by MLinkConfig.

CRUD


JOB | Same

MJob.java

MFromConfig.java

MToConfig.java

MDriverConfig.java

  • HAS 3 CONFIG-INPUT objects
  • HAS 1-n SUBMISSIONS

Represents the Sqoop job. It encapsulates all the configs required to run the job.

A Sqoop job has three main components: FROM, TO, and DRIVER.

FROM and its related MFromConfig represent the config input values required to extract data from the source.

TO and its related MToConfig represent the config input values required to load data to the destination.

DRIVER and its related MDriverConfig represent the config input values required by the execution engine to run the Sqoop job optimally.

 

CRUD
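
The three-part shape of a job described above can be sketched as a small value class. `JobSketch` is a hypothetical illustration that uses plain string maps in place of MFromConfig, MToConfig, and MDriverConfig; it is not the MJob API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a job groups three sets of config input values.
final class JobSketch {
    private final Map<String, String> from;   // values needed to extract data
    private final Map<String, String> to;     // values needed to load data
    private final Map<String, String> driver; // values for the execution engine

    JobSketch(Map<String, String> from,
              Map<String, String> to,
              Map<String, String> driver) {
        // Defensive copies keep the job independent of the caller's maps.
        this.from = Collections.unmodifiableMap(new LinkedHashMap<>(from));
        this.to = Collections.unmodifiableMap(new LinkedHashMap<>(to));
        this.driver = Collections.unmodifiableMap(new LinkedHashMap<>(driver));
    }

    Map<String, String> from() { return from; }

    Map<String, String> to() { return to; }

    Map<String, String> driver() { return driver; }
}
```

Keeping the three parts separate reflects that FROM and TO belong to connectors, while DRIVER belongs to the single Driver configurable.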


SUBMISSION | Same | MSubmission.java

Represents the details of a job run, including the job status, job counters, and metrics from the job execution engine.

 

READ ONLY


Related tickets 

Entity renames: SQOOP-1497 and SQOOP-1498

REST API changes: SQOOP-1516 (scheduled for 1.99.5, though)

Related Docs 

https://issues.apache.org/jira/secure/attachment/12667274/SimplifySqoopEntityNomenclature.pdf

https://issues.apache.org/jira/secure/attachment/12667576/Sqoop2.pdf

https://issues.apache.org/jira/secure/attachment/12668107/SimplifySQOOPRESTAPIs.pdf

 
