This page captures design ideas for creating an Apache NiFi registry for extensions and templates, allowing them to be easily pulled into a NiFi instance and thus reducing the number of bundles distributed with an Apache NiFi release.
Today NiFi provides a mechanism to introduce new components (e.g., NARs) and to share flow templates (pre-configured flows). For components (Processors, ControllerServices, etc.), the current mechanism assumes that the packages (NARs) containing such components exist in some predefined location on the local file system, as they are typically distributed with the NiFi distribution. For templates, there is UI-based import/export functionality, giving the user a little more control over what is available to NiFi.
While fairly simple and well understood, the current mechanism does pose certain challenges:
By providing a configurable and centralized extension registry, similar to those used by other products that expose a plug-and-play extension model and thus rely heavily on community participation (e.g., Grails plug-ins), most (if not all) of the challenges described above could be addressed, facilitating even greater NiFi adoption.
Below is the list of features that we may want to consider:
This section will describe how some other projects are tackling the same type of problem.
Quoting the Bintray main page: "Bintray gives developers full control over how they store, publish, download, promote and distribute software with advanced features that fully automate the software distribution process."
Some of the key features:
NOTE: Hereafter, we use the term "extension" to mean all types of extension including both NAR packages and Templates, as well as future types not yet defined. (We need some generic term, and it didn't seem necessary to limit "extension" to just NAR packages.)
We are guided by the following use cases:
1. To keep NiFi small and lightweight, both in distribution and deployment, we want to separate the 40 current NARs from the core distro and support dynamically loading them, both from a local filesystem repository and a remote web-based repository. Appropriate security support must be part of the implementation.
2. A User, using the NiFi GUI to construct a new Workflow, needs the ability to DISCOVER new Processors and other extensions they can use. This suggests interactions that allow them to point NiFi at known-safe repositories, and then browse or otherwise discover the Extensions available therein (without loading all those extensions, some of which are quite large). This implies the need to obtain from the repo, manage, and present, METADATA and descriptive info about the extensions separately from the extensions themselves.
3. The community desires to share Templates. Templates are XML files with dependencies on Processors and other resources they may specify. The extension and repository model for NiFi should support sharing Templates and their dependencies in a natural and secure way.
4. One of us is looking at adding an Apache Camel Processor pair to NiFi (NIFI-1842). Camel supports connecting to over 200 end-point types not yet supported natively in NiFi, and a Camel Processor could give us access to many of them with minimal additional work. Each of those end-points is already packaged as a dynamically-loadable extension to Camel. The packaging is as a JAR. We desire to enable NiFi to manage Camel extensions as NiFi plug-ins, without having to repackage each as a NAR, and without needing separate non-Camel repositories for them. In the future, other new NiFi extensions may also have need for custom sub-extensions of their own.
We would like the set of extension types itself to be extensible. Each extension type needs its own integration with core NiFi: besides loading classes and other resources, those resources must be registered with different parts of the GUI and other components of NiFi. The extensibility mechanism should recognize and serve this need as well as possible. However, to reduce the complexity of this project, we propose NOT to include extensibility of either the set of extension types or the set of repository types. Extending these will require core code changes, for now.
The two principal abstractions are ExtensionSpec and ExtensionRepository.
An ExtensionRepository instance represents a repository serving extension packages, and provides APIs for NiFi to access such repositories in a standard way. It is recommended that there always be at least one trusted ExtensionRepository, designated the System Repository, installed as part of the NiFi deployment, and serving the set of standard NiFi extensions published as part of NiFi releases. ExtensionRepository implementations provide the following functionality:
ExtensionRepositories don't know anything about the internals of the extension package, or the packaging itself. Their responsibility ends after delivering, to an agreed location in the local filesystem, the concrete package file that can be loaded by some other mechanism appropriate to the extension type.
Our initial implementation will support as sub-classes:
In this implementation, the set of supported repository types (ExtensionRepository sub-classes) can only be extended via core code changes. We choose not to make ExtensionRepository an extension type, to assure community review of highly security-sensitive code, and to control the scope of this project.
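The repository contract described above can be sketched roughly as follows. This is only an illustration under our own assumptions: the method names `listPackages` and `deliver`, and the `FileSystemExtensionRepository` class, are hypothetical and are not existing NiFi APIs. The sketch shows the one responsibility the text assigns to a repository: placing the concrete package file at an agreed local location without inspecting its contents.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical contract; method names are assumptions made for illustration.
interface ExtensionRepository {
    // Names of the package files this repository can serve.
    List<String> listPackages() throws IOException;

    // Copies the package file into targetDir and returns the delivered path.
    Path deliver(String packageName, Path targetDir) throws IOException;
}

// Hypothetical filesystem-backed repository: a flat directory of package files.
class FileSystemExtensionRepository implements ExtensionRepository {
    private final Path root;

    FileSystemExtensionRepository(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    @Override
    public List<String> listPackages() throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<Path> entries = Files.list(root)) {
            entries.filter(Files::isRegularFile)
                   .forEach(p -> names.add(p.getFileName().toString()));
        }
        Collections.sort(names);
        return names;
    }

    @Override
    public Path deliver(String packageName, Path targetDir) throws IOException {
        Path source = root.resolve(packageName).normalize();
        // Refuse names such as "../secret" that would escape the repository root.
        if (!source.startsWith(root)) {
            throw new IllegalArgumentException("Package outside repository: " + packageName);
        }
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(source.getFileName());
        // The repository never opens the package; it only moves bytes.
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Signature verification is deliberately left out of this sketch; in a real implementation it would presumably happen before the delivered path is handed to the type-specific loader.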
An ExtensionSpec instance is an envelope holding the metadata needed to support identification, discovery, delivery, and loading of a single extension package and its dependencies. This includes such metadata as extension name, extension type, packaging type, version, repository containing the extension, address or locator information within that repository, human-readable description, and dependency list. If more than a couple of lines of descriptive text are required to assist human-interactive discovery, the metadata may include a pointer to a supplementary documentation file (see sub-section 'B' below). When an ExtensionRepository presents metadata about an extension, it constructs an ExtensionSpec instance. When the GUI presents extension info to the user prior to loading that extension, it obtains that info from an ExtensionSpec instance. The ExtensionSpec class is subclassed as needed, when different kinds of extension require different sets of metadata. However, many different kinds of extension may share a common ExtensionSpec subclass, being distinguished by the extension type and packaging type member fields. (In other words, don't expect a different ExtensionSpec subclass for every different kind of extension, as we found no need for this.)
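As a rough illustration of the envelope described above, the metadata could be carried by an immutable class along the following lines. The field names follow the list in the text, but the class shape, the accessor names, and the `coordinate()` helper are our assumptions for the sake of the sketch, not a settled design.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical metadata envelope; field and method names are assumptions.
class ExtensionSpec {
    private final String name;
    private final String extensionType;  // kind of extension, e.g. a processor bundle or a template
    private final String packagingType;  // file format of the package, e.g. "nar", "jar", "xml"
    private final String version;
    private final String repositoryId;   // which repository serves this extension
    private final String locator;        // address of the package within that repository
    private final String description;    // short human-readable text shown during discovery
    private final List<String> dependencies;

    ExtensionSpec(String name, String extensionType, String packagingType,
                  String version, String repositoryId, String locator,
                  String description, List<String> dependencies) {
        this.name = name;
        this.extensionType = extensionType;
        this.packagingType = packagingType;
        this.version = version;
        this.repositoryId = repositoryId;
        this.locator = locator;
        this.description = description;
        // Defensive copy so the spec stays immutable after construction.
        this.dependencies = Collections.unmodifiableList(new ArrayList<>(dependencies));
    }

    String getName() { return name; }
    String getExtensionType() { return extensionType; }
    String getPackagingType() { return packagingType; }
    String getVersion() { return version; }
    String getRepositoryId() { return repositoryId; }
    String getLocator() { return locator; }
    String getDescription() { return description; }
    List<String> getDependencies() { return dependencies; }

    // Convenience identifier of the form "name:version" (illustrative only).
    String coordinate() { return name + ":" + version; }
}
```

Because extension type and packaging type are ordinary fields, a NAR bundle and a Camel JAR could share this one class, with a subclass introduced only when a kind of extension needs extra metadata.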
Our initial implementation will support:
In this implementation, the set of supported extension types and ExtensionSpec sub-classes can only be extended via core code changes. Furthermore, we choose to leave the extension loader code (load and registration process) for each extension type in core code, rather than try to incorporate this highly extension-type-specific code into this project. This is partly to reduce the complexity of the implementation by not having to refactor lots of existing loader code, and partly because we didn't see the need since adding new extension types will require core code changes anyway.
For both the filesystem-based repository and the Maven repository, we propose to require that each extension be distributed as a set of six files:
These requirements mirror the requirements for a normal Maven repository deployment, and are not unreasonably burdensome for a filesystem-based repository deployment. Other repository types may share the same means of managing the metadata and security signature requirements, or they may choose to manage this info in other ways.
Some types of extension may need more than a brief textual description; for instance, a user exploring Templates would benefit from a rendered flow diagram for each Template. Such info is not appropriate in a POM file, so in addition to the above, we propose, optionally and only when needed:
To dynamically load an extension, it is not sufficient to simply ingest the code or content of the extension into the JVM. It is also necessary that the clients of the code or content become "aware" that the new material is available. This may be done in a variety of ways, depending on the structure of the client code, and may be referred to as registration, triggering, alerting, or re-scanning. Today's NAR loader supports registration lists, via the ExtensionMapping class, for processors, controller services, and reporting tasks. We haven't yet determined if the NAR loader provides for the registration needs of any other extension types. As noted above, we have declined to refactor existing loading code as part of the scope of this project, so loading and registration will remain an extension-type-specific part of core code.
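To make the distinction between loading and registration concrete, here is a minimal sketch of the registration half. It is not the ExtensionMapping class and does not reflect its API; `ComponentRegistry`, its methods, and the kind strings are hypothetical names we introduce for illustration. The idea shown is that clients subscribe once and are then alerted whenever a newly loaded component is registered, so they need not re-scan.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Hypothetical registry; not part of NiFi. Kinds are free-form strings here.
class ComponentRegistry {
    private final Map<String, Set<String>> classNamesByKind = new HashMap<>();
    private final List<BiConsumer<String, String>> listeners = new CopyOnWriteArrayList<>();

    // Clients (e.g. a GUI palette) subscribe to learn about new components.
    void addListener(BiConsumer<String, String> listener) {
        listeners.add(listener);
    }

    // Records a component class under its kind and alerts listeners.
    // Returns false, and notifies nobody, if it was already registered.
    synchronized boolean register(String kind, String className) {
        boolean added = classNamesByKind
                .computeIfAbsent(kind, k -> new TreeSet<>())
                .add(className);
        if (added) {
            for (BiConsumer<String, String> listener : listeners) {
                listener.accept(kind, className);
            }
        }
        return added;
    }

    // Snapshot of the class names currently registered for a kind.
    synchronized Set<String> registered(String kind) {
        Set<String> names = classNamesByKind.get(kind);
        return names == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(new TreeSet<>(names));
    }
}
```

An extension-type-specific loader would call `register` after the classes are available in the JVM; what each listener then does (refresh a palette, rebuild a lookup table) remains specific to that client.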