Comment: Added OMRS Notifications

...

The Open Metadata Repository Services (OMRS) provide the means for different metadata repositories to exchange metadata.  They break the mould of traditional metadata management technology, which tends to centralize metadata, by instead assuming that metadata is going to be distributed amongst a number of metadata repositories.  These repositories may all be instances of Apache Atlas, or they may include a mixture of repositories from different vendors.

The open metadata philosophy is that metadata should be managed as close to its source as possible but it should be accessible through standard open APIs and notifications.  This means new metadata entities are created in the repository that is connected to the tools a person uses, the engine processing data or a specific group of data sources.  So for example, an organization may have:

...

Whichever repository is used to create a metadata entity, it has the master copy of that metadata entity and all updates to this metadata should be done through this repository.  So part of the responsibility of the OMRS is to ensure updates happen in the right metadata repository.

...

The integration between metadata repositories needs to be flexible to support different non-functional requirements.  For example, where metadata is changing rapidly (such as in a data lake), this metadata should be dynamically queried from its master metadata repository because the rate of updates means that keeping a copy of this metadata up to date would generate a lot of network traffic.  On the other hand, governance classifications (such as confidentiality) rarely change.  They are often administered centrally by the governance team and then linked to all metadata that describes the organization's data resources.  The data resources metadata is typically distributed across many metadata repositories.  The engines that maintain the data resources use the classifications in real-time as they are processing the data.  Thus, it makes sense to create copies of the classification metadata entities in all metadata repositories.  These copies are called reference copies of the metadata entities and they are read-only.  The OMRS automatically synchronizes any updates made to the master copy with all of its reference copies.
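The distinction between a master copy and a read-only reference copy can be illustrated with a minimal Java sketch.  All of the names here (MetadataStoreSketch, Entity, saveReferenceCopy and so on) are hypothetical and are not part of the OMRS API; the sketch only assumes that each entity records the repository where it was created.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: these class and method names are illustrative, not the OMRS API.
class MetadataStoreSketch {
    /** An entity plus the name of the repository that holds its master copy. */
    static final class Entity {
        final String guid;
        final String homeRepository;
        final String classification;

        Entity(String guid, String homeRepository, String classification) {
            this.guid = guid;
            this.homeRepository = homeRepository;
            this.classification = classification;
        }
    }

    private final String name;
    private final Map<String, Entity> entities = new HashMap<>();

    MetadataStoreSketch(String name) {
        this.name = name;
    }

    /** Creating an entity here makes this repository the home of the master copy. */
    Entity create(String guid, String classification) {
        Entity entity = new Entity(guid, name, classification);
        entities.put(guid, entity);
        return entity;
    }

    /** Stores (or refreshes) a read-only reference copy of an entity mastered elsewhere. */
    void saveReferenceCopy(Entity master) {
        entities.put(master.guid, master);
    }

    /** Updates are accepted only by the repository that holds the master copy. */
    Entity update(String guid, String newClassification) {
        Entity current = entities.get(guid);
        if (current == null) {
            throw new IllegalArgumentException("Unknown entity " + guid);
        }
        if (!current.homeRepository.equals(name)) {
            throw new IllegalStateException(
                "Entity " + guid + " is mastered in " + current.homeRepository);
        }
        Entity updated = new Entity(guid, name, newClassification);
        entities.put(guid, updated);
        return updated;
    }

    Entity get(String guid) {
        return entities.get(guid);
    }
}
```

In this sketch an update attempted against a reference copy is rejected, and the caller is expected to redirect it to the home repository and then refresh the reference copies.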

As such, the OMRS enables the integration of metadata that is distributed amongst a number of metadata repositories in any of the following ways:

  • Through a call interface which is provided by an OMRS connector
  • Using notifications that broadcast changes to metadata in a repository that other repositories can subscribe to in order to maintain reference copies of specific metadata entities
  • Via linked data URLs that enable a metadata entity to have a relationship with a metadata entity in a different repository.

...

The OMRS can be configured at the metadata repository level to control:

  • which metadata is replicated to each metadata repository and
  • whenever a query is made, to which metadata repositories the query is directed.
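As a rough illustration of these two controls, the following Java sketch models a per-repository configuration.  The class RepositoryConfigSketch and its methods are assumptions made for this example; the actual OMRS configuration format is not described here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: the class, its fields and its methods are illustrative assumptions.
class RepositoryConfigSketch {
    // Metadata types for which this repository keeps reference copies.
    private final Set<String> replicatedTypes = new HashSet<>();
    // Repositories that receive queries issued from this repository, in call order.
    private final List<String> queryTargets = new ArrayList<>();

    RepositoryConfigSketch replicate(String typeName) {
        replicatedTypes.add(typeName);
        return this;
    }

    RepositoryConfigSketch queryRepository(String repositoryName) {
        queryTargets.add(repositoryName);
        return this;
    }

    /** True if incoming changes to this type should be stored as reference copies. */
    boolean acceptsReferenceCopyOf(String typeName) {
        return replicatedTypes.contains(typeName);
    }

    /** The repositories to which a query should be directed. */
    List<String> queryTargets() {
        return Collections.unmodifiableList(queryTargets);
    }
}
```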

 

...

OMRS Connectors

The OMRS uses connectors that support the Open Connector Framework (OCF) to provide a call interface to the metadata repositories.

The OMRS Connector API is a standard interface for a connector to a metadata repository.  This enables the services that call OMRS (such as the Open Metadata Access Services (OMAS)) to interact with one or many metadata repositories through the same interface.  The connection configuration that the calling service passes to the OCF determines which type of OMRS connector is returned by the OCF.  

Initially we plan four implementations of the OMRS Connector API:

  • Local Atlas OMRS Connector – this is the connector to a local Apache Atlas metadata repository, implemented in ATLAS-1773.
  • OMRS REST Connector – this is a connector to a remote Apache Atlas repository (or any other metadata repository that supports the OMRS REST APIs), also implemented in ATLAS-1773.
  • IGC OMRS Connector – this is the connector for IBM’s Information Governance Catalog, implemented in ATLAS-1774.
  • Enterprise OMRS Connector – this connector can federate multiple metadata repositories by aggregating the results of calls to their OMRS connectors, implemented in ATLAS-1775.

OMAS APIs will use the OMRS Connector API to access metadata.  The name of the metadata repository connection they use (set in the OMAS Scope) will determine which implementation of the connector is used and hence which metadata repository or repositories are called.  If no metadata repository is specified, the OMAS uses the local connector.
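The selection of a connector by connection name, with a fall-back to the local connector, might look something like the Java sketch below.  ConnectorBrokerSketch, MetadataConnector and the "local" connection name are hypothetical stand-ins; they are not the OCF connector broker or the OMRS Connector API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: a stand-in for the connector broker, not the OCF API.
class ConnectorBrokerSketch {
    /** Minimal stand-in for the OMRS Connector interface. */
    interface MetadataConnector {
        String describe();
    }

    // Assumed name of the connection used when the caller specifies none.
    static final String LOCAL = "local";

    private final Map<String, Supplier<MetadataConnector>> factories = new HashMap<>();

    /** Associates a connection name with a factory for its connector implementation. */
    void register(String connectionName, Supplier<MetadataConnector> factory) {
        factories.put(connectionName, factory);
    }

    /** Returns the connector for the named connection, defaulting to the local one. */
    MetadataConnector getConnector(String connectionName) {
        String key = (connectionName == null || connectionName.isEmpty())
            ? LOCAL
            : connectionName;
        Supplier<MetadataConnector> factory = factories.get(key);
        if (factory == null) {
            throw new IllegalArgumentException("No connector registered for connection " + key);
        }
        return factory.get();
    }
}
```

Because callers depend only on the connector interface, switching from a single local repository to a federated set of repositories is a configuration change rather than a code change.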

Figure 2 illustrates these OMRS connectors in action.  Since OMRS connectors are written in Java, those that are committed into the Atlas build can be hosted directly in the Atlas server.  Additional connectors can be built by anyone and these can be accessed through the OMRS REST interface.

Figure 2: Using the OMRS Connectors to create a metadata cluster

 

The notes below correspond to the numbers on the diagram in figure 2.

  1. There are many OMAS APIs, each designed for a different type of consumer.  Each OMAS API has a simple, bean-like Java interface for applications to call (or the application may use the OMAS REST API directly).
  2. The OMAS REST APIs provide JSON versions of the objects supported in the OMAS APIs.  The OMAS Java Server classes implement the REST APIs.  Each Java Server Class uses the connector broker to acquire an OMRS Connector.
  3. The Local Atlas OMRS Connector provides access to the local metadata repository.
  4. This local repository includes a graphDB accessed through TinkerPop.
  5. The Enterprise OMRS Connector is able to aggregate metadata from multiple metadata repositories in response to a single request for metadata.  It does this by replicating the request across multiple downstream OMRS Connectors and then aggregating and correlating the results.
  6. The Enterprise OMRS Connector is typically configured to call the local Atlas Repository for creates.  Updates go to the repository where the metadata entity was created.
  7. In addition, the Enterprise OMRS Connector may be configured to call other metadata repositories using OMRS Connector implementations that have been built into the Atlas runtime.  In this example, the diagram shows an OMRS connector that can call IBM’s Information Governance Catalog (IGC).
  8. The IGC OMRS Connector will translate OMRS calls into IGC REST calls.
  9. For calls to remote Atlas Servers, and other servers that have OMRS connectors that are not included in the Atlas runtime, the Enterprise OMRS Connector will use the OMRS REST Connector.  This connector translates OMRS connector requests into remote OMRS REST calls. 
  10. The OMRS REST Connector may also be called from the OMAS Java Server classes to pass-through metadata requests to a remote metadata server.
  11. Every Atlas runtime supports the OMRS REST APIs.
  12. When an Atlas runtime receives an OMRS REST call, it is passed to the local Atlas OMRS connector and is executed against the local metadata repository.
  13. The OMRS REST Connector can be used to connect to an adapter for other types of metadata repositories.  
  14. The adapter would host an OMRS Connector to the metadata repository.  This provides a solution for OMRS Connector implementations that are not (yet) integrated into the Atlas build.  The downside of this approach is the additional network hop that the adapter introduces.
  15. An OMRS REST Connector can be incorporated into other metadata repositories to enable them to query Atlas metadata.  Typically the repository will use the Enterprise OMRS Connector to ensure it is reaching as much metadata as possible.
  16. This includes IBM’s Information Governance Catalog (IGC).
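The behaviour described in notes 5 and 6 (replicate a query to every downstream connector and merge the results, send creates to the local repository, and send updates to the repository where the entity was created) can be sketched in Java as follows.  Every name in this sketch is hypothetical, the downstream connectors are replaced by in-memory maps, and the merge is simplified to keeping the first result seen for each GUID.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the OMRS Connector API.
class EnterpriseConnectorSketch {
    static final class Entity {
        final String guid;
        final String homeRepository;
        final String name;

        Entity(String guid, String homeRepository, String name) {
            this.guid = guid;
            this.homeRepository = homeRepository;
            this.name = name;
        }
    }

    /** Stand-in for a downstream connector, backed by an in-memory map. */
    static final class InMemoryRepository {
        final String repositoryName;
        final Map<String, Entity> store = new LinkedHashMap<>();

        InMemoryRepository(String repositoryName) {
            this.repositoryName = repositoryName;
        }

        Entity create(String guid, String name) {
            Entity entity = new Entity(guid, repositoryName, name);
            store.put(guid, entity);
            return entity;
        }

        Entity update(String guid, String newName) {
            Entity entity = new Entity(guid, repositoryName, newName);
            store.put(guid, entity);
            return entity;
        }

        List<Entity> findByName(String name) {
            List<Entity> hits = new ArrayList<>();
            for (Entity entity : store.values()) {
                if (entity.name.equals(name)) {
                    hits.add(entity);
                }
            }
            return hits;
        }
    }

    private final InMemoryRepository local;
    private final Map<String, InMemoryRepository> repositories = new LinkedHashMap<>();

    EnterpriseConnectorSketch(InMemoryRepository local, List<InMemoryRepository> others) {
        this.local = local;
        repositories.put(local.repositoryName, local);
        for (InMemoryRepository repository : others) {
            repositories.put(repository.repositoryName, repository);
        }
    }

    /** Replicates the request to every repository and merges the results by GUID. */
    List<Entity> findByName(String name) {
        Map<String, Entity> merged = new LinkedHashMap<>();
        for (InMemoryRepository repository : repositories.values()) {
            for (Entity entity : repository.findByName(name)) {
                merged.putIfAbsent(entity.guid, entity);
            }
        }
        return new ArrayList<>(merged.values());
    }

    /** Creates always go to the local repository. */
    Entity create(String guid, String name) {
        return local.create(guid, name);
    }

    /** Updates go to the repository where the entity was created. */
    Entity update(Entity entity, String newName) {
        InMemoryRepository home = repositories.get(entity.homeRepository);
        if (home == null) {
            throw new IllegalStateException("No connector for " + entity.homeRepository);
        }
        return home.update(entity.guid, newName);
    }
}
```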

 


OMRS Notifications

OMRS supports metadata synchronization between metadata repositories through a messaging infrastructure.  Apache Kafka is used in the current implementation, but over time the messaging infrastructure will be pluggable.  Figure 3 shows the flows of messages in an open metadata ecosystem.


Figure 3: OMRS notifications for synchronizing metadata between metadata repositories
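The publish and subscribe flow can be approximated with a small Java sketch.  It deliberately uses an in-memory Topic class in place of Kafka, and the names ChangeEvent, Topic and Repository are assumptions made for illustration rather than OMRS types.  Each repository publishes a change event when it saves a master copy, and every other repository uses the event to maintain its reference copy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: an in-memory topic stands in for the messaging infrastructure.
class NotificationSketch {
    /** A change to one entity, tagged with the repository that made it. */
    static final class ChangeEvent {
        final String originRepository;
        final String guid;
        final String value;

        ChangeEvent(String originRepository, String guid, String value) {
            this.originRepository = originRepository;
            this.guid = guid;
            this.value = value;
        }
    }

    /** Delivers every published event synchronously to all subscribers. */
    static final class Topic {
        private final List<Consumer<ChangeEvent>> subscribers = new ArrayList<>();

        void subscribe(Consumer<ChangeEvent> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(ChangeEvent event) {
            for (Consumer<ChangeEvent> subscriber : subscribers) {
                subscriber.accept(event);
            }
        }
    }

    static final class Repository {
        final String name;
        final Map<String, String> masters = new HashMap<>();
        final Map<String, String> referenceCopies = new HashMap<>();
        private final Topic topic;

        Repository(String name, Topic topic) {
            this.name = name;
            this.topic = topic;
            topic.subscribe(this::onEvent);
        }

        /** Saves a master copy locally and broadcasts the change. */
        void saveMaster(String guid, String value) {
            masters.put(guid, value);
            topic.publish(new ChangeEvent(name, guid, value));
        }

        /** Applies changes from other repositories; ignores this repository's own events. */
        private void onEvent(ChangeEvent event) {
            if (!event.originRepository.equals(name)) {
                referenceCopies.put(event.guid, event.value);
            }
        }
    }
}
```

A real deployment would also need to handle delivery order, retries and repositories that join late, none of which this sketch attempts.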

 

 

 

 

 

...

OMRS Linking

TBD

 

 

...