This document is a guide to the public-facing Sqoop Repository API as of the 1.99.5 release.

The API may evolve in future releases, so this document reflects the state of the API in 1.99.5 only.

Background

Sqoop2 supports a persistent store for Sqoop entities such as the Configurables (Connector and Driver), the Configs exposed by the Connectors, Jobs, Job runs, and so on. This persistent store is commonly referred to as the repository. Sqoop also exposes REST APIs and shell commands to perform CRUD operations on Sqoop entities: connectors and drivers, the connector configs that hold link and job information, and Sqoop jobs and their configs. The persistent store is therefore handy for keeping a history of the Sqoop entity objects created and updated over time. To make the persistent store easy to access, Sqoop also exposes a simple Java-based repository API that different data stores can implement to store the Sqoop entity objects.

At this point, only relational data stores are supported, since the entities are related to each other and these relations are easier to express in a relational data store. In the future, a non-relational store could be added as another implementation of the repository API. The repository structure (schema and its fields) has also changed across Sqoop releases, and the APIs have changed with it to fit the new structures.

The rest of the document focuses on the main public-facing entities and repository APIs.

 

Sqoop Entities

Refer to this wiki for details on the Sqoop Entities. Understanding the Sqoop entities is a prerequisite for the rest of this document.

Sqoop Repository API 

Entity Related APIs


Entity | API | Notes

CONNECTOR | |
DRIVER | |
JOB | |
CONFIG | |
SUBMISSION | |

Repository Upgrade related APIs

  /**
   * Create or update the repository schema structures.
   *
   * This method will be called from the Sqoop server if enabled via a config
   * {@link RepoConfigurationConstants#SYSCFG_REPO_SCHEMA_IMMUTABLE} to enforce
   * changing the repository schema structure or explicitly via the
   * {@link UpgradeTool}. Repository should not change its schema structure
   * outside of this method. This method must be a no-op in case the schema
   * structure does not need any upgrade.
   */
  public abstract void createOrUpgradeRepository();

  /**
   * Return true if internal repository structures exist and are suitable for use.
   * This method should return false in case the structures do exist but
   * are not suitable for use, e.g. corrupted as part of the upgrade.
   *
   * @return true if the internal structures are suitable for use
   */
  public abstract boolean isRepositorySuitableForUse();
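
The contract of these two methods can be sketched as a server bootstrap sequence. This is a minimal illustration, not the Sqoop server code: SketchRepository, InMemorySketchRepository, RepositoryBootstrapSketch, the schemaImmutable flag and the version numbers are all hypothetical names and values, and the sketch assumes that the server skips the upgrade call when the schema is configured as immutable and then refuses to start if the repository is not suitable for use.

```java
// Hypothetical bootstrap logic; an assumption about how a server could use the two methods.
class RepositoryBootstrapSketch {
    static void initialize(SketchRepository repo, boolean schemaImmutable) {
        if (!schemaImmutable) {
            // Only place where the schema structures are allowed to change.
            repo.createOrUpgradeRepository();
        }
        if (!repo.isRepositorySuitableForUse()) {
            throw new IllegalStateException("Repository structures are missing or unusable");
        }
    }

    public static void main(String[] args) {
        InMemorySketchRepository fresh = new InMemorySketchRepository(0);
        initialize(fresh, false);
        System.out.println("suitable after bootstrap: " + fresh.isRepositorySuitableForUse());
        initialize(fresh, false); // second call finds nothing to upgrade
        System.out.println("upgrades performed: " + fresh.upgradesPerformed);
    }
}

// Hypothetical stand-in for the two upgrade-related methods of the Repository API.
abstract class SketchRepository {
    public abstract void createOrUpgradeRepository();
    public abstract boolean isRepositorySuitableForUse();
}

// Toy implementation that tracks a schema version in memory instead of a database.
class InMemorySketchRepository extends SketchRepository {
    static final int LATEST_VERSION = 3; // assumed value, for illustration only
    int schemaVersion;                   // 0 means "no structures exist yet"
    int upgradesPerformed = 0;

    InMemorySketchRepository(int schemaVersion) {
        this.schemaVersion = schemaVersion;
    }

    @Override
    public void createOrUpgradeRepository() {
        if (schemaVersion == LATEST_VERSION) {
            return; // no-op when the structures are already up to date
        }
        schemaVersion = LATEST_VERSION;
        upgradesPerformed++;
    }

    @Override
    public boolean isRepositorySuitableForUse() {
        // Missing (0) or outdated structures both count as unsuitable.
        return schemaVersion == LATEST_VERSION;
    }
}
```

In this sketch an outdated repository combined with an immutable schema setting fails fast at startup, which is one plausible reading of why the two methods are kept separate.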

 

Configurable Upgrade related APIs

(NOTE: The following APIs could have been an independent API of their own, but they live in the repository because the config/input objects of the configurables reside in the repository.)

Connector Upgrade API

upgradeConnector has a default implementation provided in Repository.java.
  /**
   * Upgrade the connector with the same {@linkplain MConnector#uniqueName}
   * in the repository with values from <code>newConnector</code>.
   * <p/>
   * All links and jobs associated with this connector will be upgraded
   * automatically.
   *
   * @param oldConnector The old connector that should be upgraded.
   * @param newConnector New properties for the Connector that should be
   *                     upgraded.
   */
  public final void upgradeConnector(MConnector oldConnector, MConnector newConnector) {
  ..}

  /**
   * Update the connector with the new data supplied in the
   * <tt>newConnector</tt>. Also update all configs associated with this
   * connector in the repository with the configs specified in
   * <tt>newConnector</tt>. <tt>newConnector</tt> must
   * minimally have the configurableID and all required configs (including ones
   * which may not have changed). After this operation the repository is
   * guaranteed to only have the new configs specified in this object.
   *
   * @param newConnector The new data to be inserted into the repository for
   *                     this connector.
   * @param tx The repository transaction to use to push the data to the
   *           repository. If this is null, a new transaction will be created.
   *           If a transaction is supplied, this method will not call begin,
   *           commit, rollback or close on it.
   */
  public abstract void upgradeConnectorAndConfigs(MConnector newConnector, RepositoryTransaction tx);
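
The tx parameter implies an ownership rule: a transaction passed in by the caller is left for the caller to manage, while a null argument makes the method create and manage its own. The sketch below illustrates that rule only. SketchTransaction and SketchConfigStore are hypothetical stand-ins (not RepositoryTransaction or any Sqoop class), and the configs are plain strings for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TransactionOwnershipDemo {
    public static void main(String[] args) {
        SketchConfigStore store = new SketchConfigStore();

        // Caller-supplied transaction: the store must not touch its lifecycle.
        SketchTransaction callerTx = new SketchTransaction();
        store.upgradeConfigs(Arrays.asList("linkConfig", "fromJobConfig"), callerTx);
        System.out.println("caller tx calls: " + callerTx.calls);

        // Null transaction: the store creates one and manages it itself.
        store.upgradeConfigs(Arrays.asList("linkConfig"), null);
        System.out.println("own tx calls: " + store.lastOwnTransaction.calls);
    }
}

// Hypothetical transaction that only records which lifecycle methods were called.
class SketchTransaction {
    final List<String> calls = new ArrayList<>();
    void begin()    { calls.add("begin"); }
    void commit()   { calls.add("commit"); }
    void rollback() { calls.add("rollback"); }
    void close()    { calls.add("close"); }
}

// Hypothetical store holding the configs of a single configurable.
class SketchConfigStore {
    final List<String> configs = new ArrayList<>();
    SketchTransaction lastOwnTransaction; // exposed only so the demo can inspect it

    void upgradeConfigs(List<String> newConfigs, SketchTransaction tx) {
        boolean ownsTransaction = (tx == null);
        SketchTransaction active = ownsTransaction ? new SketchTransaction() : tx;
        if (ownsTransaction) {
            lastOwnTransaction = active;
            active.begin();
        }
        try {
            // Replace everything: afterwards only the new configs remain.
            List<String> replacement = new ArrayList<>(newConfigs);
            configs.clear();
            configs.addAll(replacement);
            if (ownsTransaction) {
                active.commit();
            }
        } catch (RuntimeException e) {
            if (ownsTransaction) {
                active.rollback();
            }
            throw e;
        } finally {
            if (ownsTransaction) {
                active.close();
            }
        }
    }
}
```

Leaving a supplied transaction untouched lets a caller group several repository operations, for example a connector upgrade followed by updates to its links and jobs, into one unit that it commits or rolls back itself.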
  

 

Driver Upgrade API

upgradeDriver has a default implementation provided in Repository.java.
 
public final void upgradeDriver(MDriver driver) {
..}
 
  /**
   * Upgrade the driver with the new data supplied in the
   * <tt>newDriver</tt>. Also update all configs associated with the driver
   * in the repository with the configs specified in
   * <tt>newDriver</tt>. <tt>newDriver</tt> must
   * minimally have the configurableID and all required configs (including ones
   * which may not have changed). After this operation the repository is
   * guaranteed to only have the new configs specified in this object.
   *
   * @param newDriver The new data to be inserted into the repository for
   *                  the driverConfig.
   * @param tx The repository transaction to use to push the data to the
   *           repository. If this is null, a new transaction will be created.
   *           If a transaction is supplied, this method will not call begin,
   *           commit, rollback or close on it.
   */
  public abstract void upgradeDriverAndConfigs(MDriver newDriver, RepositoryTransaction tx);

JdbcRepositoryHandler.java mirrors the API of Repository.java, except that its methods additionally take a JDBC Connection as a parameter.
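
The relationship between the two classes can be pictured as delegation: a repository-level method has no Connection in its signature and forwards to a handler method of the same name that does. The sketch below is an assumption about the shape of that relationship, not the Sqoop implementation. SketchHandler, SketchJdbcRepository and RecordingHandler are hypothetical, and the Connection is a do-nothing stub built with java.lang.reflect.Proxy so that no database is needed.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;

class HandlerDelegationDemo {
    // Builds a do-nothing Connection stub so the sketch runs without a database.
    static Connection stubConnection() {
        return (Connection) Proxy.newProxyInstance(
                HandlerDelegationDemo.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> null);
    }

    public static void main(String[] args) {
        Connection conn = stubConnection();
        RecordingHandler handler = new RecordingHandler();
        SketchJdbcRepository repository = new SketchJdbcRepository(handler, conn);

        boolean suitable = repository.isRepositorySuitableForUse();
        System.out.println("suitable: " + suitable);
        System.out.println("handler received the repository's connection: "
                + (handler.received == conn));
    }
}

// Hypothetical handler: same operation as the repository, plus a Connection parameter.
interface SketchHandler {
    boolean isRepositorySuitableForUse(Connection conn);
}

// Hypothetical repository: owns the connection and hands it to the handler.
class SketchJdbcRepository {
    private final SketchHandler handler;
    private final Connection conn;

    SketchJdbcRepository(SketchHandler handler, Connection conn) {
        this.handler = handler;
        this.conn = conn;
    }

    // Repository-level method: no Connection in the signature.
    public boolean isRepositorySuitableForUse() {
        return handler.isRepositorySuitableForUse(conn);
    }
}

// Test double that remembers which connection it was given.
class RecordingHandler implements SketchHandler {
    Connection received;

    @Override
    public boolean isRepositorySuitableForUse(Connection conn) {
        received = conn;
        return true;
    }
}
```

A split along these lines keeps connection handling in one place while each database-specific handler concentrates on the statements it runs against the connection it is given.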

 

Sqoop Repository Concrete Implementations

As of 1.99.5, the Repository has Derby and PostgreSQL implementations.

Please refer to DerbyRepositoryHandler and PostgresqlRepositoryHandler for details.

 


 

