Comment: Added ecosystem roles

...

Open metadata and governance is a moon-shot project to create an open set of APIs, types and interchange protocols that allow all metadata repositories to share and exchange metadata.  From this common base, it adds governance, discovery and access frameworks to automate the collection, management and use of metadata across an enterprise.  The result is an enterprise catalog of data resources that are transparently assessed, governed and used in order to deliver maximum value to the enterprise.

...

The proposal is to use Apache Atlas as the open source reference implementation for open metadata and governance.  Apache Atlas would support an open metadata and governance compliant repository and provide the adapters and interchange formats that allow other metadata repositories to connect into the ecosystem.

 

...

 


Figure 1: Open Metadata and Governance Components implemented in Apache Atlas

The open metadata and governance project is divided into the following pieces:

...

  • these types are built from the Apache Atlas type system and define the types stored in the graph database as well as the payloads for notifications and APIs
  • Open Metadata Repository Services (OMRS) - Open metadata repository APIs and notifications to enable metadata repositories to

...

  • exchange metadata in a peer-to-peer metadata repository cluster.
  • Open Metadata Access Services (OMAS) - Consumer-centric APIs and notifications for specific classes of tools and applications.  The OMAS services call the OMRS to access metadata from any open metadata repository.
  • New frameworks to complement the Atlas Hooks and Bridges
    • Open Connector Framework (OCF) - provides factories for connectors with access APIs for data resources and metadata together.  The OMRS is also built as a set of metadata repository connectors and the OMAS services use the OCF to connect to the appropriate OMRS connector.
    • Open Discovery Framework (ODF) - provides management for automated processes and analytics to analyze the content of data resources and update the metadata about this
    • Governance Action Framework (GAF) - provides governance enforcement services for implementing enforcement points in data engines, in security managers such as Apache Ranger, and in APIs.
  • A set of stores to complement the graph database at the heart of the open metadata and governance service.  These stores hold detailed logs and related information that the graph links to, using more cost-efficient formats and access mechanisms.
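
The relationship between the OCF, OMRS and OMAS described in the list above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not the Apache Atlas design: every class and method name here (Connection, ConnectorBroker, RepositoryConnector, AssetCatalogService, get_entity, describe_asset) is hypothetical and does not come from the proposal.  It shows a connector factory creating a repository connector from a connection description, and a consumer-centric service reaching the repository only through that connector.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Connection:
    """Hypothetical description of which connector to create and how."""
    connector_type: str
    properties: dict = field(default_factory=dict)


class RepositoryConnector(ABC):
    """Stand-in for an OMRS metadata repository connector."""

    def __init__(self, connection: Connection):
        self.connection = connection

    @abstractmethod
    def get_entity(self, guid: str) -> Optional[dict]:
        """Return the metadata entity with this identifier, or None."""


class InMemoryRepositoryConnector(RepositoryConnector):
    """Toy repository that keeps its entities in a dictionary."""

    def __init__(self, connection: Connection):
        super().__init__(connection)
        self._entities = dict(connection.properties.get("entities", {}))

    def get_entity(self, guid: str) -> Optional[dict]:
        return self._entities.get(guid)


class ConnectorBroker:
    """Stand-in for the OCF factory: maps a connector type to a class."""

    def __init__(self):
        self._providers = {}

    def register(self, connector_type: str, provider) -> None:
        self._providers[connector_type] = provider

    def get_connector(self, connection: Connection):
        if connection.connector_type not in self._providers:
            raise ValueError(
                f"no provider registered for {connection.connector_type!r}"
            )
        return self._providers[connection.connector_type](connection)


class AssetCatalogService:
    """Stand-in for an OMAS: a consumer-centric API over a connector."""

    def __init__(self, broker: ConnectorBroker, repository: Connection):
        # The service never touches the repository directly; it asks
        # the broker for whichever connector the connection names.
        self._repository = broker.get_connector(repository)

    def describe_asset(self, guid: str) -> str:
        entity = self._repository.get_entity(guid)
        if entity is None:
            return f"{guid}: unknown asset"
        return f"{entity['name']} ({entity['type']})"


broker = ConnectorBroker()
broker.register("in-memory", InMemoryRepositoryConnector)
connection = Connection(
    "in-memory",
    {"entities": {"g1": {"name": "customer_orders", "type": "Table"}}},
)
service = AssetCatalogService(broker, connection)
print(service.describe_asset("g1"))
```

Because the service depends only on the connector interface, a different repository could be substituted by registering another provider with the broker and changing the connection, without touching the service code.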

Figure 1 shows how this could look if it were implemented in Apache Atlas.

 

With these frameworks and APIs in place, Apache Atlas would be the reference implementation for the open metadata and governance APIs as well as offering the integration capability for the metadata cluster.   Its function would be divided into different packages to allow technology partners to connect into the open metadata and governance ecosystem.

Figure 2 shows the types of usage patterns for the open metadata and governance components.


Figure 2: Open metadata and governance technology usage patterns

  • Data Platform Service - in this pattern, the open metadata components are embedded into a data platform, and the engines that run the data platform are integrated with the open metadata APIs and notification services to ensure that changes to data are recorded in the metadata repository and governance requirements are met.  This is the way Apache Atlas serves the Apache Hadoop platform today.  As an open metadata and governance capability, it would also need to run on other platforms, such as cloud platforms.
  • Tool Repository - here the open metadata components are being used as the metadata repository for a data tool, such as a reporting package, discovery or data science tool.  The tool is calling the OMAS interfaces to save and query the metadata.
  • Adapter - here an open metadata repository service connector has been implemented to call an existing proprietary repository so its metadata is available to the broader ecosystem.  The existing tool interfaces and services are still available as before.
  • Enterprise View - in this pattern, a tool does not have its own metadata repository.  Instead it is using the Enterprise OMRS connector to save and query metadata for its use.  It will then use the metadata repositories that are connected into the metadata repository cluster.
  • Metadata Highway - where multiple tools use the enterprise view and adapter patterns, the metadata repository cluster needs the open metadata components that manage the distribution of metadata across the cluster, plus an open metadata repository implementation.  This would include the OMAS interfaces, OMRS notification services and the metadata repository.
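
The Adapter and Enterprise View patterns above can be illustrated with a small Python sketch.  It rests on assumptions that are not part of the proposal: the class names (ProprietaryCatalog, CatalogAdapter, NativeRepository, EnterpriseConnector), the find_entities method and the record layout are all hypothetical.  The adapter translates calls into an existing tool's native API, and the enterprise connector fans a query out to every repository in the cluster and merges the answers.

```python
class ProprietaryCatalog:
    """Hypothetical existing tool with its own native lookup API."""

    def __init__(self, records):
        self._records = list(records)

    def lookup(self, keyword):
        keyword = keyword.lower()
        return [r for r in self._records if keyword in r["title"].lower()]


class NativeRepository:
    """Hypothetical open metadata repository in the cluster."""

    def __init__(self, name, entities):
        self.name = name
        self._entities = list(entities)

    def find_entities(self, text):
        text = text.lower()
        return [
            {"guid": e["guid"], "name": e["name"], "source": self.name}
            for e in self._entities
            if text in e["name"].lower()
        ]


class CatalogAdapter:
    """Adapter pattern: exposes the proprietary catalog through the same
    find_entities call as a native repository.  The catalog's own API
    stays available to its existing users."""

    def __init__(self, name, catalog):
        self.name = name
        self._catalog = catalog

    def find_entities(self, text):
        return [
            {"guid": r["id"], "name": r["title"], "source": self.name}
            for r in self._catalog.lookup(text)
        ]


class EnterpriseConnector:
    """Enterprise View pattern: no store of its own, only the cluster."""

    def __init__(self, members):
        self._members = list(members)

    def find_entities(self, text):
        merged = {}
        for member in self._members:
            for entity in member.find_entities(text):
                # Assumption: the first repository to report a guid wins.
                merged.setdefault(entity["guid"], entity)
        return sorted(merged.values(), key=lambda e: e["guid"])


atlas = NativeRepository("atlas", [{"guid": "a1", "name": "Sales orders"}])
legacy = CatalogAdapter(
    "legacy",
    ProprietaryCatalog([{"id": "b7", "title": "Sales forecast"},
                        {"id": "b8", "title": "HR roster"}]),
)
enterprise = EnterpriseConnector([atlas, legacy])
results = enterprise.find_entities("sales")
for entity in results:
    print(entity["guid"], entity["name"], entity["source"])
```

A real cluster would also need to reconcile conflicting copies of the same entity; the "first repository wins" rule here is only a placeholder for that decision.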

 

 

 

...