1. Purpose

This page captures design ideas for an Apache NiFi registry for extensions and templates. The registry would allow them to be pulled easily into a NiFi instance, reducing the number of bundles distributed with an Apache NiFi release.

Today the NiFi build can become quite large because it houses a series of extension sets that may or may not be of use to a given flow. The management and sharing of templates is also quite manual. Both create deployment challenges and hinder the sharing and maintenance of best practices.

With a central registry for extensions, a given dataflow can be configured to pull a specific extension bundle version. The listing of processors available to a user could be based on what is in the local classpath, as is the case today, and also be populated from what is found in the extension registry. When the user selects something from the registry, the extension set can be pulled in automatically and, after a NiFi restart, be available for use. Further, the user can be notified automatically when an update is available for those extensions. It also means that a given NiFi instance only ever has to hold the extension sets necessary for its function.
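As a rough illustration of the combined processor listing described above, the sketch below merges the processor types found locally with those advertised by a registry. The class `ProcessorListing`, the `Source` enum, and the rule that a locally installed type takes precedence are all assumptions made for this example; none of them exist in NiFi.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper, not part of NiFi: builds the listing shown to the user. */
final class ProcessorListing {

    /** Where a processor type would be loaded from. */
    enum Source { LOCAL, REGISTRY }

    private ProcessorListing() {}

    /**
     * Merges both listings into one sorted map. A type present in both places
     * is reported as LOCAL (assumption: an installed bundle needs no download).
     */
    static Map<String, Source> merge(Set<String> localTypes, Set<String> registryTypes) {
        Map<String, Source> merged = new TreeMap<>();
        for (String type : registryTypes) {
            merged.put(type, Source.REGISTRY);
        }
        for (String type : localTypes) {
            merged.put(type, Source.LOCAL); // overwrites any REGISTRY entry
        }
        return merged;
    }
}
```

Entries marked `REGISTRY` are the ones that would trigger a download and require a restart before use.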

With a central registry for templates, the ability to share and comment on best-practice dataflows would greatly increase. NiFi itself can be set up to integrate with this service by pushing templates to it with comments and versioning, and by searching for existing templates and placing them easily onto the graph.

There is also value in allowing templates to be more readily managed with typical configuration management (CM) and source control tools. If templates are serialized as the text-based XML that they are today, and with deterministic behavior, this should be no problem. We must not store running/stopped state in the template, we must ensure reliable ordering, and we must establish deterministic identifiers within the template that are not tied to the identifier scheme of the system on which the template was made. Today templates violate each of these rules, which makes traditional CM on them impossible.
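One possible way to obtain deterministic identifiers and reliable ordering is sketched below. It derives a name-based UUID from the template name plus a component's path inside the template, and sorts component paths before serialization. The class `TemplateIds`, the notion of a "component path", and the seed format are assumptions for illustration only; this is not how NiFi assigns identifiers today.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;

/** Hypothetical helper, not part of the NiFi API. */
final class TemplateIds {

    private TemplateIds() {}

    /**
     * Derives a name-based (version 3) UUID from the template name and the
     * component's path within the template. The result depends only on these
     * inputs, never on the system where the template was created.
     */
    static UUID deterministicId(String templateName, String componentPath) {
        String seed = templateName + "/" + componentPath; // assumed seed format
        return UUID.nameUUIDFromBytes(seed.getBytes(StandardCharsets.UTF_8));
    }

    /** Returns a sorted copy of the paths so serialization order is stable. */
    static List<String> stableOrder(List<String> componentPaths) {
        List<String> sorted = new ArrayList<>(componentPaths);
        Collections.sort(sorted);
        return sorted;
    }
}
```

With identifiers and ordering fixed this way, exporting the same template twice would produce the same text, which is what line-based diff tools need.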

The central registry will need to take into account the varying levels of authorization that might be needed for access to a given extension set or template. An organization could, for example, let others know that extensions or templates exist but not let them use them without sharing approval, licensing, and so on. It could also restrict who is able to know they exist at all, and more.
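The distinction between knowing that an item exists and being allowed to use it could be modeled roughly as below. This is a minimal sketch under assumed names: the `RegistryAccess` class, the three `Level` values, and the per-item default are all hypothetical and only illustrate the tiers described above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical access model; names are illustrative and not from NiFi. */
final class RegistryAccess {

    /** Ordered from least to most privileged. */
    enum Level { NONE, DISCOVER, USE }

    // item -> (user -> explicit level); users without an entry get the item's default
    private final Map<String, Map<String, Level>> grants = new HashMap<>();
    private final Map<String, Level> defaults = new HashMap<>();

    void setDefault(String item, Level level) {
        defaults.put(item, level);
    }

    void grant(String item, String user, Level level) {
        grants.computeIfAbsent(item, k -> new HashMap<>()).put(user, level);
    }

    Level levelFor(String item, String user) {
        Map<String, Level> perItem = grants.get(item);
        if (perItem != null && perItem.containsKey(user)) {
            return perItem.get(user);
        }
        return defaults.getOrDefault(item, Level.NONE); // unknown items are hidden
    }

    /** True when the user may see that the item exists. */
    boolean canDiscover(String item, String user) {
        return levelFor(item, user).compareTo(Level.DISCOVER) >= 0;
    }

    /** True when the user may pull the item into a NiFi instance. */
    boolean canUse(String item, String user) {
        return levelFor(item, user) == Level.USE;
    }
}
```

For example, an item with default `DISCOVER` appears in search results for everyone, while only users explicitly granted `USE` can download it.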

2. Requirements

  • Ability to upload a NAR or template
  • Ability to search for a NAR or template
  • Ability to add a NAR or template from the registry to a NiFi instance
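The three requirements above might map onto a small API such as the following in-memory sketch. Everything here is hypothetical: the `InMemoryRegistry` class, its `Entry` and `Kind` types, the duplicate-rejection rule, and the name-only substring search are assumptions chosen to keep the example short, not a proposed design.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical in-memory registry; all names are illustrative. */
final class InMemoryRegistry {

    enum Kind { NAR, TEMPLATE }

    /** One uploaded artifact, identified by kind, name, and version. */
    static final class Entry {
        final Kind kind;
        final String name;
        final String version;
        final byte[] content;

        Entry(Kind kind, String name, String version, byte[] content) {
            this.kind = kind;
            this.name = name;
            this.version = version;
            this.content = content.clone();
        }
    }

    private final Map<String, Entry> entries = new LinkedHashMap<>();

    private static String key(Kind kind, String name, String version) {
        return kind + ":" + name + ":" + version;
    }

    /** Upload: stores the entry, rejecting a duplicate kind/name/version. */
    void upload(Entry entry) {
        String k = key(entry.kind, entry.name, entry.version);
        if (entries.putIfAbsent(k, entry) != null) {
            throw new IllegalArgumentException("Already registered: " + k);
        }
    }

    /** Search: case-insensitive substring match on the entry name. */
    List<Entry> search(Kind kind, String text) {
        String needle = text.toLowerCase(Locale.ROOT);
        List<Entry> hits = new ArrayList<>();
        for (Entry e : entries.values()) {
            if (e.kind == kind && e.name.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(e);
            }
        }
        return hits;
    }

    /** Add to an instance: fetches the exact entry a NiFi instance asked for. */
    Optional<Entry> fetch(Kind kind, String name, String version) {
        return Optional.ofNullable(entries.get(key(kind, name, version)));
    }
}
```

A real registry would sit behind a network service with persistent storage; the sketch only shows the shape of the three operations.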

3. Registry Examples

This section will describe how some other projects are tackling the same type of problem.

Spark-Packages

  • Requires authentication with a GitHub account.
  • Registering a package requires pointing to a public GitHub repo owned by the given user, with a LICENSE and README. 
  • A name and description are also provided during registration.
  • There is a command line tool to help start new packages and publish them.
  • The web site lets a user search for packages and view details of each package.
  • The details of a package show how to use the given package with various tools such as the spark-shell, sbt, and/or Maven (see example).
  • For Maven, it shows a dependency and repository snippet that could be added to a pom, which suggests that Spark-Packages hosts its own repository.
  • Packages can also be voted on and tagged.

4. NiFi Registry Design

 

5. Implementation Considerations

  • A NAR will need to be able to specify the minimum and maximum versions of the nifi-api that the NAR can work against
  • If a NAR has a "Parent NAR", it will need to be able to specify the minimum and maximum versions of the Parent NAR that it can work against
  • NARs and Templates will be uploaded from non-Apache sources, so they should provide some mechanism of indicating the vendor name/contact info, as well as the URL for an Issue Tracking System
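The minimum/maximum version constraints in the first two items could be checked along these lines. This is a sketch under stated assumptions: the `VersionRange` class is hypothetical, both bounds are treated as inclusive, and only dotted numeric versions such as `0.4.1` are handled (qualifiers like `-SNAPSHOT` are not).

```java
/** Hypothetical version-range check; not part of nifi-api. */
final class VersionRange {

    private final int[] min;
    private final int[] max;

    VersionRange(String min, String max) {
        this.min = parse(min);
        this.max = parse(max);
    }

    /** True when min <= version <= max (inclusive bounds are an assumption). */
    boolean includes(String version) {
        int[] v = parse(version);
        return compare(min, v) <= 0 && compare(v, max) <= 0;
    }

    /** Parses a dotted numeric version; throws NumberFormatException otherwise. */
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] out = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** Compares component by component, treating missing components as 0. */
    static int compare(int[] a, int[] b) {
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }
}
```

The same check would apply to a Parent NAR: the child declares a range, and the registry or the NiFi instance refuses to load it when the installed parent version falls outside that range.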