


This document is a guide to the public-facing Sqoop Repository API as of the 1.99.5 release.

The API may evolve in future releases, so everything here describes its state in 1.99.5.


Sqoop2 supports a persistent store for Sqoop entities such as the Configurables (Connector and Driver), the Configs exposed by the Connectors, Jobs, and JobRuns/Submissions. This persistent store is commonly referred to as the repository. Sqoop also exposes REST APIs and shell commands to perform CRUD operations on these entities: connectors and drivers, the connector configs that hold link and job information, and Sqoop jobs and their configs. The persistent store therefore keeps a history of the Sqoop entity objects created and updated over time. To make the persistent store easy to access, Sqoop also exposes a simple Java-based repository API that different data stores can implement to store the Sqoop entity objects.

At this point, only relational data stores are supported, because the entities are related to each other and a relational store expresses these relations most easily. A non-relational store could implement the repository API in the future. The repository structure (the schema and its fields) has changed across Sqoop releases, and the APIs have changed with it to fit the new structures.

The rest of this document focuses on the main public-facing entities and the repository APIs.


Sqoop Entities

Refer to this wiki for details on the Sqoop entities. Understanding them is a prerequisite for the rest of this document.

Sqoop Repository API 

Details and Javadocs are available in ( The trunk of sqoop2). Here are the high-level details of the APIs.

Entity Related APIs



public abstract MConnector registerConnector(MConnector mConnector, boolean autoUpgrade);

public abstract MConnector findConnector(String shortName);
public abstract List<MConnector> findConnectors();




public abstract MDriver registerDriver(MDriver mDriverConfig, boolean autoUpgrade);
public abstract MDriver findDriver(String shortName);


public abstract void createLink(MLink link);
public abstract void updateLink(MLink link);
public abstract void updateLink(final MLink link, RepositoryTransaction tx);
public abstract void enableLink(long id, boolean enabled);
public abstract void deleteLink(long id);
public abstract MLink findLink(long id);

public abstract MLink findLink(String name);
public abstract List<MLink> findLinksForConnector(long connectorId);
public abstract List<MLink> findLinks();
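
To illustrate the shape of the link methods above, here is a minimal in-memory sketch. It is not the Sqoop implementation: SketchLink and InMemoryLinkStore are hypothetical stand-ins for MLink and a repository, and the behavior for missing or already persisted links is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for MLink; the real class also carries config values.
class SketchLink {
    static final long UNPERSISTED = -1L;
    long id = UNPERSISTED;
    final String name;
    final long connectorId;
    boolean enabled = true;

    SketchLink(String name, long connectorId) {
        this.name = name;
        this.connectorId = connectorId;
    }
}

// Hypothetical in-memory store whose method names mirror the link API.
class InMemoryLinkStore {
    private final Map<Long, SketchLink> links = new LinkedHashMap<>();
    private long nextId = 1;

    // Assumption: creating assigns an id and rejects already persisted links.
    void createLink(SketchLink link) {
        if (link.id != SketchLink.UNPERSISTED) {
            throw new IllegalStateException("Link already persisted: " + link.id);
        }
        link.id = nextId++;
        links.put(link.id, link);
    }

    void enableLink(long id, boolean enabled) {
        requireLink(id).enabled = enabled;
    }

    void deleteLink(long id) {
        requireLink(id);
        links.remove(id);
    }

    // Assumption: lookups return null when nothing matches.
    SketchLink findLink(long id) {
        return links.get(id);
    }

    SketchLink findLink(String name) {
        for (SketchLink link : links.values()) {
            if (link.name.equals(name)) {
                return link;
            }
        }
        return null;
    }

    List<SketchLink> findLinksForConnector(long connectorId) {
        List<SketchLink> result = new ArrayList<>();
        for (SketchLink link : links.values()) {
            if (link.connectorId == connectorId) {
                result.add(link);
            }
        }
        return result;
    }

    List<SketchLink> findLinks() {
        return new ArrayList<>(links.values());
    }

    private SketchLink requireLink(long id) {
        SketchLink link = links.get(id);
        if (link == null) {
            throw new IllegalArgumentException("No link with id " + id);
        }
        return link;
    }
}
```

The job methods follow the same pattern, with a job id and name in place of the link id and name.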


public abstract void createJob(MJob job);
public abstract void updateJob(MJob job);
public abstract void updateJob(MJob job, RepositoryTransaction tx);
public abstract void enableJob(long id, boolean enabled);
public abstract void deleteJob(long id);
public abstract MJob findJob(long id);
public abstract MJob findJob(String name);
public abstract List<MJob> findJobs();
public abstract List<MJob> findJobsForConnector(long connectorId);




public abstract void deleteJobInputs(long jobId, RepositoryTransaction tx);
public abstract void deleteLinkInputs(long linkId, RepositoryTransaction tx);

No Public API for users yet

See SQOOP-1516 for more details - 1.99.5 changes got Input RU


Input deletion can happen as part of the connector/driver upgrade path

public abstract void createSubmission(MSubmission submission);
public abstract void updateSubmission(MSubmission submission);
public abstract void purgeSubmissions(Date threshold);
public abstract List<MSubmission> findUnfinishedSubmissions();
public abstract List<MSubmission> findSubmissions();
public abstract List<MSubmission> findSubmissionsForJob(long jobId);
public abstract MSubmission findLastSubmissionForJob(long jobId);

Create, update, and delete (CUD) operations are for internal Sqoop use only

Read-only APIs are available to users
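
The submission methods can be pictured with a small in-memory sketch. SketchSubmission and InMemorySubmissionStore are hypothetical names, not Sqoop classes. The sketch assumes that purgeSubmissions removes every submission last updated strictly before the threshold and that the "last" submission for a job is the most recently created one; the real handlers may define both differently.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical stand-in for MSubmission with only the fields this sketch needs.
class SketchSubmission {
    final long jobId;
    final Date lastUpdate;
    final boolean running;

    SketchSubmission(long jobId, Date lastUpdate, boolean running) {
        this.jobId = jobId;
        this.lastUpdate = lastUpdate;
        this.running = running;
    }
}

// Hypothetical in-memory store whose method names mirror the submission API.
class InMemorySubmissionStore {
    private final List<SketchSubmission> submissions = new ArrayList<>();

    void createSubmission(SketchSubmission submission) {
        submissions.add(submission);
    }

    // Assumption: purge drops submissions last updated strictly before the threshold.
    void purgeSubmissions(Date threshold) {
        submissions.removeIf(s -> s.lastUpdate.before(threshold));
    }

    List<SketchSubmission> findUnfinishedSubmissions() {
        List<SketchSubmission> result = new ArrayList<>();
        for (SketchSubmission s : submissions) {
            if (s.running) {
                result.add(s);
            }
        }
        return result;
    }

    List<SketchSubmission> findSubmissions() {
        return new ArrayList<>(submissions);
    }

    List<SketchSubmission> findSubmissionsForJob(long jobId) {
        List<SketchSubmission> result = new ArrayList<>();
        for (SketchSubmission s : submissions) {
            if (s.jobId == jobId) {
                result.add(s);
            }
        }
        return result;
    }

    // Assumption: "last" means the most recently created submission of the job.
    SketchSubmission findLastSubmissionForJob(long jobId) {
        SketchSubmission last = null;
        for (SketchSubmission s : submissions) {
            if (s.jobId == jobId) {
                last = s;
            }
        }
        return last;
    }
}
```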

Repository Upgrade related APIs


Configurable Upgrade related APIs

(NOTE: The following APIs could have been an independent API of their own, but they live in the repository because the config/input objects of the configurables reside there.)

Connector Upgrade API

upgradeConnector has a default implementation provided in the 


Driver Upgrade API

upgradeDriver has a default implementation provided in the 



Sqoop Repository Concrete Implementations

JdbcRepository extends the Repository API. The JdbcRepositoryHandler API is a replica of the Repository class, with a "java.sql.Connection" added as a parameter to the API methods.
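
The idea of passing a "java.sql.Connection" into each method can be sketched as follows. This is a minimal illustration, not the Sqoop source: SketchJdbcHandler, SketchJdbcRepository, ConnectionSource, and Work are hypothetical names, and the commit/rollback handling is an assumption about how such a facade could manage the connection on behalf of a handler.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical, cut-down handler: each method mirrors a repository method
// and additionally receives the JDBC connection to work on.
abstract class SketchJdbcHandler {
    abstract String findLinkName(long id, Connection conn) throws SQLException;
    abstract void deleteLink(long id, Connection conn) throws SQLException;
}

// Hypothetical facade: callers see connection-free methods, while the facade
// opens the connection, delegates to the handler, and commits or rolls back.
class SketchJdbcRepository {
    interface ConnectionSource {
        Connection open() throws SQLException;
    }

    interface Work<T> {
        T run(Connection conn) throws SQLException;
    }

    private final SketchJdbcHandler handler;
    private final ConnectionSource source;

    SketchJdbcRepository(SketchJdbcHandler handler, ConnectionSource source) {
        this.handler = handler;
        this.source = source;
    }

    String findLinkName(long id) {
        return doWithConnection(conn -> handler.findLinkName(id, conn));
    }

    void deleteLink(long id) {
        this.<Void>doWithConnection(conn -> {
            handler.deleteLink(id, conn);
            return null;
        });
    }

    // Runs the work in one transaction: commit on success, rollback on failure.
    private <T> T doWithConnection(Work<T> work) {
        Connection conn = null;
        try {
            conn = source.open();
            conn.setAutoCommit(false);
            T result = work.run(conn);
            conn.commit();
            return result;
        } catch (SQLException | RuntimeException e) {
            rollbackQuietly(conn);
            throw new IllegalStateException("Repository operation failed", e);
        } finally {
            closeQuietly(conn);
        }
    }

    private static void rollbackQuietly(Connection conn) {
        if (conn == null) {
            return;
        }
        try {
            conn.rollback();
        } catch (SQLException ignored) {
            // Nothing more can be done; the original failure is reported instead.
        }
    }

    private static void closeQuietly(Connection conn) {
        if (conn == null) {
            return;
        }
        try {
            conn.close();
        } catch (SQLException ignored) {
            // A failed close does not change the outcome of the operation.
        }
    }
}
```

With this split, a database-specific handler contains only the SQL, and connection and transaction handling stays in one place.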

As of 1.99.5, the Repository has Derby and PostgreSQL implementations.

Please refer to the DerbyRepositoryHandler and PostgresqlRepositoryHandler for details. They are concrete implementations of the JdbcRepositoryHandler.



