1. Purpose

This page captures design ideas for an Apache NiFi registry for extensions and templates, allowing them to be easily pulled into a NiFi instance and thus reducing the number of bundles distributed with an Apache NiFi release.

2. Problem

Today NiFi provides a mechanism to introduce new components (packaged as NARs) and to share flow templates (pre-configured flows). For components (Processors, ControllerServices, etc.), the current mechanism assumes that the packages (NARs) containing them exist in a predefined location on the local file system, as they are typically shipped with the NiFi distribution. For templates, there is UI-based import/export functionality, giving the user a little more control over what is available to NiFi.

While fairly simple and well understood, the current mechanism does pose certain challenges:

  1. Because NiFi is distributed with all supported components, its distribution has become quite large, housing a series of extension sets that may or may not be of use to a given flow. As more components are introduced, the current mechanism will become unsustainable.
  2. Regardless of whether there is any intent to use it, every component is loaded into the JVM (classes are loaded to produce documentation, and multiple instances are created due to the pre-existing bug NIFI-1318).
  3. Management and sharing of artifacts (NARs, templates, etc.) is a manual process.
    1. On top of that, exported templates contain flow/component state (running/stopped), and some identifiers are tied to the identifier scheme of the system on which the template was made, which may present a security risk.
  4. No versioning support for components and/or templates.
  5. No ability to introduce new components once NiFi is started. Components must exist in NiFi prior to startup and are loaded at startup (see #2).
  6. Third-party vendors are locked out of participation due to the legal constraints on including a component with a non-ASF-compatible license, or components that depend on other components with non-ASF-compatible licenses.
  7. Repetitive library distribution. While each component is designed to perform a very specific task, most of them still depend on common libraries (e.g., log4j, Spring, etc.). Such libraries are included in individual NARs, creating the possibility that several NARs in the NiFi distribution contain the same library.

By providing a configurable and centralized extension registry, similar to those used by other products that expose a plug-and-play extension model and thus rely heavily on community participation (e.g., Grails plug-ins), most (if not all) of the challenges described above could be addressed, facilitating even greater NiFi adoption.

3. Proposed Extension Registry Features

Below is the list of features that we may want to consider 

  1. Publish NAR/Template. 
  2. Version, presumably following a well-known and recognized convention (e.g., Maven coordinates: groupId:artifactId:version).
    1. This will expose ability to have multiple versions of the same component available to the user.
    2. Additionally, NAR/Template may be tied to a specific version (or range of versions) of NiFi
  3. Access component documentation regardless of whether such components are available locally.
  4. Pull NAR/Template. This essentially implies bringing NAR/Template from the remote location to a running instance of NiFi. However, I don't believe this implies that user has to know about local/remote availability of the component. In fact in best case scenario user experience must not change. For example, user would have a familiar browser window for Processors as they do today, but such window would be populated with what is available to the user from the extension registry. If a particular component is already available locally then that is where it will be loaded from and if not then NiFi should transparently pull such component, essentially caching NAR/Templates locally (similar to local Maven cache). 
  5. Browse and/or search for NARs or templates within and outside of NiFi UI. This implies that the extension registry must have the ability to be accessed with the conventional web browser.
  6. Configure location of extension registry. Similar to Maven, users must be able to configure the location of the extension registry.
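
As an illustration of the versioning and local-caching features above, the following is a minimal sketch of how an extension could be identified by Maven-style coordinates and mapped to a path in a Maven-layout cache. The class name `ExtensionCoordinates` and its methods are hypothetical and are not part of NiFi; only the `groupId:artifactId:version` convention and the standard Maven directory layout are taken as given.

```java
import java.util.Objects;

/** Hypothetical identifier for a registry extension, modeled on Maven coordinates. */
final class ExtensionCoordinates {
    final String groupId;
    final String artifactId;
    final String version;

    ExtensionCoordinates(String groupId, String artifactId, String version) {
        this.groupId = Objects.requireNonNull(groupId);
        this.artifactId = Objects.requireNonNull(artifactId);
        this.version = Objects.requireNonNull(version);
    }

    /** Parses "groupId:artifactId:version" and rejects any other shape. */
    static ExtensionCoordinates parse(String gav) {
        String[] parts = gav.split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected groupId:artifactId:version, got: " + gav);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty coordinate element in: " + gav);
            }
        }
        return new ExtensionCoordinates(parts[0], parts[1], parts[2]);
    }

    /** Relative path of the package file in a Maven-layout repository or local cache. */
    String repositoryPath(String packaging) {
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version + "/"
                + artifactId + "-" + version + "." + packaging;
    }

    @Override
    public String toString() {
        return groupId + ":" + artifactId + ":" + version;
    }
}

class ExtensionCoordinatesDemo {
    public static void main(String[] args) {
        // "org.example" and "nifi-demo-nar" are made-up values for illustration.
        ExtensionCoordinates c = ExtensionCoordinates.parse("org.example:nifi-demo-nar:1.0.0");
        System.out.println(c + " -> " + c.repositoryPath("nar"));
    }
}
```

Keeping several versions of the same component side by side then amounts to keeping several version directories under the same artifactId.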

4. Registry Examples

This section will describe how some other projects are tackling the same type of problem.

JFrog Bintray 

Quoting their main page: "Bintray gives developers full control over how they store, publish, download, promote and distribute software with advanced features that fully automate the software distribution process."

Some of the key features:

  • Free hosting for open-source artifacts
  • Multiple repository/packaging styles including Maven, Docker, RPM, etc., essentially supporting the versioning feature described in "Proposed Features"
  • Browsing and search capabilities described in "Proposed Features"
  • Documentation features described in "Proposed Features"
  • REST API to integrate with a running NiFi instance to expose browse and search and to provide access to documentation
  • Build-tool plug-ins (Maven, Gradle, etc.) for automated publishing of the artifacts

Spark-Packages

  • Requires authentication with a GitHub account.
  • Registering a package requires pointing to a public GitHub repo owned by the given user, with a LICENSE and README. 
  • A name and description are also provided during registration.
  • There is a command line tool to help start new packages and publish them.
  • The web site lets a user search for packages and view details of each package.
  • The details of a package show how to use the given package with various tools such as the spark-shell, sbt, and/or Maven, see example
  • For Maven it shows a dependency and repository snippet that could be added to a pom, and there seems to be a repository.
  • Packages can also be voted on and tagged.

5. Use Cases

NOTE:  Hereafter, we use the term "extension" to mean all types of extension including both NAR packages and Templates, as well as future types not yet defined.  (We need some generic term, and it didn't seem necessary to limit "extension" to just NAR packages.)

We are guided by the following use cases:

1. To keep NiFi small and lightweight, both in distribution and deployment, we want to separate the 40 current NARs from the core distro and support dynamically loading them, both from a local filesystem repository and a remote web-based repository.  Appropriate security support must be part of the implementation.

2. A User, using the NiFi GUI to construct a new Workflow, needs the ability to DISCOVER new Processors and other extensions they can use.  This suggests interactions that allow them to point NiFi at known-safe repositories, and then browse or otherwise discover the Extensions available therein (without loading all those extensions, some of which are quite large).  This implies the need to obtain from the repo, manage, and present, METADATA and descriptive info about the extensions separately from the extensions themselves.

3. The community desires to share Templates.  Templates are XML files with dependencies on Processors and other resources they may specify.  The extension and repository model for NiFi should support sharing Templates and their dependencies in a natural and secure way.

4. One of us is looking at adding an Apache Camel Processor pair to NiFi (NIFI-1842).  Camel supports connecting to over 200 end-point types not yet supported natively in NiFi, and a Camel Processor could give us access to many of them with minimal additional work.  Each of those end-points is already packaged as a dynamically-loadable extension to Camel.  The packaging is as a JAR. We desire to enable NiFi to manage Camel extensions as NiFi plug-ins, without having to repackage each as a NAR, and without needing separate non-Camel repositories for them.  In the future, other new NiFi extensions may also have need for custom sub-extensions of their own.

5. We would like the set of extension types to be extensible, itself.  Each extension type needs its own integration with core NiFi.  Besides loading classes and other resources, those resources must be registered with different parts of the GUI and other components of NiFi.  The extensibility mechanism should recognize and serve this need as well as possible.  To reduce the complexity of this project, we propose to NOT include extensibility of the set of extension types, nor repository types.  Extending these will require core code changes, for now.

6. Design Proposal

A. Principal Abstractions

The two principal abstractions are ExtensionSpec and ExtensionRepository.

An ExtensionRepository instance represents a repository serving extension packages, and provides APIs for NiFi to access such repositories in a standard way.  It is recommended that there always be at least one trusted ExtensionRepository, designated the System Repository, installed as part of the NiFi deployment, and serving the set of standard NiFi extensions published as part of NiFi releases.  ExtensionRepository implementations provide the following functionality:

  • Present metadata about some or all of the extension packages in the repo, to facilitate human-interactive discovery.
  • Manage information about the security signature of extension packages, and only deliver packages that are properly signed, by a signing authority recognized and approved at the system level. 
  • Manage information about the dependencies of the extension, which are declared in metadata, and only deliver packages whose dependencies are either:
    • already available in the system, or System Repository,
    • or are available from the same repository instance.
    • (For dependencies which are not available in the system nor in the same repository, delivery of the specified package will fail until the dependencies are resolved by User action. This avoids recursive calls to repository code, and issues of cross-access between repositories without human approval. Note extensions in the System Repository should not have outside dependencies; the System Repo should have closure.)
  • Deliver individual extension package files, and where possible their dependencies, on demand.

ExtensionRepositories don't know anything about the internals of the extension package, or the packaging itself. Their responsibility ends after delivering, to an agreed location in the local filesystem, the concrete package file that can be loaded by some other mechanism appropriate to the extension type.  
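
The dependency rule in the list above can be sketched as a simple set check. This is only an illustration under stated assumptions: the class `DeliveryCheck`, its method, and the use of plain strings as extension identifiers are hypothetical, and the real repository API has not been defined on this page.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper illustrating the "deliver only if dependencies resolve" rule. */
final class DeliveryCheck {

    /**
     * Returns the dependencies that would block delivery: those found neither in the
     * running system, nor in the System Repository, nor in the repository that holds
     * the requested package. An empty result means the package may be delivered.
     */
    static Set<String> unresolved(Set<String> dependencies,
                                  Set<String> installed,
                                  Set<String> systemRepository,
                                  Set<String> sameRepository) {
        Set<String> missing = new TreeSet<>();
        for (String dependency : dependencies) {
            boolean available = installed.contains(dependency)
                    || systemRepository.contains(dependency)
                    || sameRepository.contains(dependency);
            if (!available) {
                missing.add(dependency);
            }
        }
        return missing;
    }
}

class DeliveryCheckDemo {
    public static void main(String[] args) {
        // All identifiers below are made-up examples.
        Set<String> deps = new HashSet<>(Arrays.asList("lib-a", "lib-b", "lib-c"));
        Set<String> installed = new HashSet<>(Arrays.asList("lib-a"));
        Set<String> systemRepo = new HashSet<>(Arrays.asList("lib-b"));
        Set<String> sameRepo = new HashSet<>();
        Set<String> missing = DeliveryCheck.unresolved(deps, installed, systemRepo, sameRepo);
        System.out.println(missing.isEmpty() ? "deliverable" : "blocked by: " + missing);
    }
}
```

Note that the check never consults any other repository, which matches the stated goal of avoiding recursive repository calls and cross-repository access without human approval.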

Our initial implementation will support as sub-classes:

  • Filesystem directory-based repositories mounted locally on the NiFi server
  • Maven repositories enabled in the NiFi server's maven configuration.

In this implementation, the set of supported repository types (ExtensionRepository sub-classes) can only be extended via core code changes.  We choose not to make ExtensionRepository an extension type, to assure community review of highly security-sensitive code, and to control the scope of this project.

An ExtensionSpec instance is an envelope holding the metadata needed to support identification, discovery, delivery, and loading of a single extension package and its dependencies. This includes such metadata as extension name, extension type, packaging type, version, repository containing the extension, address or locator information within that repository, human-readable description, and dependency list.  If more than a couple lines of descriptive text are required to assist human-interactive discovery, the metadata may include a pointer to a supplementary documentation file (see sub-section 'B' below).  When an ExtensionRepository presents metadata about an extension, it constructs an ExtensionSpec instance.  When the GUI presents extension info to the user prior to loading that extension, it obtains that info from an ExtensionSpec instance.  The ExtensionSpec class is sub-classed at need, when different sets of metadata are needed for the purposes of different kinds of extension.  However, many different kinds of extension may share a common ExtensionSpec subclass, being distinguished by the extension type and packaging type member fields.  (In other words, don't expect a different ExtensionSpec subclass for every different kind of extension, as we found no need for this.)
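
To make the envelope concrete, here is a minimal sketch of the metadata listed above as an immutable Java class. The class name `ExtensionSpecSketch`, the `ExtensionType` enum, and all field names are assumptions made for illustration; the proposal itself only enumerates the kinds of metadata, not their representation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Extension kinds named in this proposal; modeling them as an enum is an assumption. */
enum ExtensionType { PROCESSOR, CONTROLLER_SERVICE, REPORTING_TASK, TEMPLATE }

/** Illustrative metadata envelope for one extension package; field names are hypothetical. */
class ExtensionSpecSketch {
    final String name;
    final ExtensionType extensionType;
    final String packagingType;        // e.g. "nar" or "xml"
    final String version;
    final String repositoryId;         // which repository holds the package
    final String locator;              // address of the package within that repository
    final String description;          // short human-readable text
    final List<String> dependencies;   // identifiers of required extensions
    final String documentationLocator; // optional supplementary docs; null when absent

    ExtensionSpecSketch(String name, ExtensionType extensionType, String packagingType,
                        String version, String repositoryId, String locator,
                        String description, List<String> dependencies,
                        String documentationLocator) {
        this.name = name;
        this.extensionType = extensionType;
        this.packagingType = packagingType;
        this.version = version;
        this.repositoryId = repositoryId;
        this.locator = locator;
        this.description = description;
        // Defensive copy so the spec cannot be changed after construction.
        this.dependencies = Collections.unmodifiableList(new ArrayList<>(dependencies));
        this.documentationLocator = documentationLocator;
    }

    boolean hasSupplementaryDocs() {
        return documentationLocator != null;
    }
}

class ExtensionSpecDemo {
    public static void main(String[] args) {
        // A processor and a template can share one spec class, distinguished only by
        // the extension type and packaging type fields. All values are made up.
        ExtensionSpecSketch processor = new ExtensionSpecSketch("DemoProcessor",
                ExtensionType.PROCESSOR, "nar", "1.0.0", "system", "demo/processor",
                "Example processor", new ArrayList<String>(), null);
        ExtensionSpecSketch template = new ExtensionSpecSketch("DemoTemplate",
                ExtensionType.TEMPLATE, "xml", "1.0.0", "system", "demo/template",
                "Example template", Collections.singletonList("DemoProcessor"), "demo/template.adoc");
        System.out.println(processor.name + " docs=" + processor.hasSupplementaryDocs());
        System.out.println(template.name + " docs=" + template.hasSupplementaryDocs());
    }
}
```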

Our initial implementation will support:

  • All extension types already supported by the NAR sideloader, specifically including Processor, Controller Service, and Reporting Task extensions, all packaged as NARs
  • Templates packaged as XMLs

In this implementation, the set of supported extension types and ExtensionSpec sub-classes can only be extended via core code changes.  Furthermore, we choose to leave the extension loader code (load and registration process) for each extension type in core code, rather than try to incorporate this highly extension-type-specific code into this project.  This is partly to reduce the complexity of the implementation by not having to refactor lots of existing loader code, and partly because we didn't see the need since adding new extension types will require core code changes anyway.

B. Metadata, Descriptive Documentation, and Security Signatures

For both the filesystem-based repository and the maven repository, we propose to require that each extension be distributed as a set of 6 files:

  • a single-file package containing the extension itself, as a NAR or other file format appropriate to the extension type
  • a maven POM file for the extension, containing certain mandatory XML elements, from which the extension's metadata and dependencies are obtained
  • MD5 and SHA1 signature files for each of the package file and POM file.

These requirements mirror the requirements for a normal Maven repository deployment, and are not unreasonably burdensome for a filesystem-based repository deployment.  Other repository types may share the same means of managing the metadata and security signature requirements, or they may choose to manage this info in other ways.
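
A minimal sketch of checking a package against its MD5 and SHA1 companion files follows, using the JDK's `java.security.MessageDigest`. The class and method names are hypothetical, and the assumption that a companion file holds the hex digest as its first whitespace-delimited token is ours. Note that MD5 and SHA-1 digests detect corruption of a file; establishing who published it would additionally require a cryptographic signature, which this sketch does not cover.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that verifies a file's bytes against a checksum companion file. */
final class ChecksumVerifier {

    /** Computes the lowercase hex digest of the data, e.g. with "MD5" or "SHA-1". */
    static String hexDigest(String algorithm, byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm).digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Digest algorithm not available: " + algorithm, e);
        }
    }

    /**
     * Returns true when the first token of the companion file content equals the
     * digest of the data. Comparison ignores case; trailing file names are ignored.
     */
    static boolean matches(String algorithm, byte[] data, String companionContent) {
        String expected = companionContent.trim().split("\\s+")[0];
        return expected.equalsIgnoreCase(hexDigest(algorithm, data));
    }
}

class ChecksumVerifierDemo {
    public static void main(String[] args) {
        // In practice the bytes would be read from the package or POM file.
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        String sha1File = ChecksumVerifier.hexDigest("SHA-1", data) + "  demo.nar";
        System.out.println("SHA-1 ok: " + ChecksumVerifier.matches("SHA-1", data, sha1File));
    }
}
```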

Some types of extension may need more than a brief textual description; for instance, a user exploring Templates would benefit from a rendered flow diagram for each Template.  Such info is not appropriate in a POM file, so in addition to the above, we propose, optionally and only when needed:

  • a documentation file (in adoc format), including the signature of the described extension
  • and MD5 and SHA1 signature files for the adoc file

C. Note about Registration as part of Loading

To dynamically load an extension, it is not sufficient to simply ingest the code or content of the extension into the JVM. It is also necessary that the clients of the code or content become "aware" that the new material is available. This may be done in a variety of ways, depending on the structure of the client code, and may be referred to as registration, triggering, alerting, or re-scanning.  Today's NAR loader supports registration lists, via the ExtensionMapping class, for processors, controller services, and reporting tasks.  We haven't yet determined if the NAR loader provides for the registration needs of any other extension types.  As noted above, we have declined to refactor existing loading code as part of the scope of this project, so loading and registration will remain an extension-type-specific part of core code.
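
The distinction between loading and registration can be sketched as follows. This is not the API of NiFi's ExtensionMapping class; `ExtensionRegistrations` and its methods are hypothetical names, and the sketch shows only the general idea that clients are notified when newly loaded material is recorded.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical registry: records loaded extension classes and alerts interested clients. */
final class ExtensionRegistrations {
    private final Map<String, List<String>> classNamesByType = new LinkedHashMap<>();
    private final List<Consumer<String>> listeners = new ArrayList<>();

    /** Clients (for example, a GUI component list) subscribe to learn about new extensions. */
    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    /** Called after the extension's classes are loaded; makes clients aware of them. */
    void register(String extensionType, String className) {
        classNamesByType.computeIfAbsent(extensionType, k -> new ArrayList<>()).add(className);
        for (Consumer<String> listener : listeners) {
            listener.accept(className);
        }
    }

    /** Returns the class names registered so far for the given extension type. */
    List<String> registered(String extensionType) {
        List<String> names = classNamesByType.get(extensionType);
        return names == null ? Collections.<String>emptyList() : Collections.unmodifiableList(names);
    }
}

class ExtensionRegistrationsDemo {
    public static void main(String[] args) {
        ExtensionRegistrations registrations = new ExtensionRegistrations();
        registrations.addListener(name -> System.out.println("now available: " + name));
        // "org.example.MyProcessor" is a made-up class name.
        registrations.register("processor", "org.example.MyProcessor");
        System.out.println(registrations.registered("processor"));
    }
}
```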

 


24 Comments

  1. Mark,

        I think you've captured the core features I'd like to see.  Under Implementation considerations I'd like to see something like "A Template should specify a schema with a version that indicates the version of the flow/template model it is compatible with."  I know the flow model doesn't change much, but in the event a new feature is added, you might want to prevent that flow from being loaded in an older version of NiFi.

     

        I was thinking that we should be able to include templates in nars.  Since that is how users plug extensions into NiFi.  I understand that templates aren't really extensions, but I wonder if the registry would be made more simple if it only had to "register" one type of thing(a nar).  Thoughts?

  2. Dan,

    I could easily see a template leveraging components in multiple nars.  It could be cool to let someone embed templates within a nar though as it would be a cool way to make it for a user to get tips on how to compose that processor with others.

     

    Thanks

    Joe

    1. Joe,

         Good point.  I didn't think about the multiple nar thing, but that's a great point.

       

         Since you bring that up, should a template reflect what nars it needs?  Currently if you try and instantiate a template with processors that are not loaded in your NiFi instance it tells you that it cannot find certain processors.  Might be helpful to say "you need ___, ___, ___ nars"  Not sure whose responsibility that is.  The apps' or the registry's.

       

      Dan

      1. I do believe that if we get this right when a template is imported all necessary nars would be pulled in automatically as well.

  3. If we have a cluster of NiFi Servers, how can we push the artifact (_nar_) to other instances ? Even the same challenge exists with current Spring Integration implementation unless we go to every box and copy dependencies?

    Please put your thoughts to brainstorm.

  4. A node in the cluster is no different than any other node, in that same cluster or in no cluster, and therefore will use the same mechanism (whatever that mechanism will be) to pull the extensions as all other nodes.

    Keep in mind; Extension Registry proposes an enhancement to the core design of the deployment model for the NiFi extensions (i.e., Processors, ControllerServices, Reporting Tasks, templates etc.), decoupling them from the NiFi core.
    Primary purpose is to address the NiFi footprint that is expected to grow, versioning and other things that are outlined above.

    We are in the early stages of these discussions and implementation details still need to be discussed and prototyped, so not sure about the Spring Integration comment here.

    1. Agreed, we are at very early stage of this discussion, still my intent here was to address existing issues that can be  linked together and addressed in a same manner. 

      1. Yep.  Good idea Puspendu.

        1. Based on the

          1. number of files those are being affected,
          2. to avoid flooding mainline PR space,
          3. allowing everyone to contribute before it's too late and 
          4. to avoid any well-amounted throwaway effort 

          it will be better to have a separate branch with parallel to 1.0.0 branch.

          Another thing is that I have prepared a file based Reference Implementation for registry and wanted to get the same reviewed. So if you create a branch & a ticket, I can check in there, else shall create a PR.

          for the time being the mentioned RI is available at: https://github.com/PuspenduBanerjee/nifi/tree/NIFI-SimpleExtRegistry

          To test: 

          1. create/download a custom processor nifi-blahbase-nar-1.0-SNAPSHOT.nar at your home directory at <PATH_TO_NAR>.
          2. run nifi 
          3. access the design page at http://<server>:<port>/nifi
          4. drag a processor and search for type 'MyProcessor' – you should find nothing
          5. open another tab in browser http://<server>:<port>/nifi-api/controller/sideload?narURI=<PATH_TO_NAR>
          6. Now, redo step #3 & #4 you should be able to create an instance of "MyProcessor"

           

  5. Joe Witt, Matthew Foley, or anyone: got a chance to look at the extension registry implementation yet?

     

  6. Added Maven based Artifact(nar) Resolver support. So technically one can pass GAV of a nar to nifi-api/controller/sideload?groupId=aa&artifactId=bb&version=cc to load a nar from maven repo.

  7. Hi Puspendu Banerjee, I'm new to NiFi, but I'll share the thoughts I have, and hope they're of use.
    First, thank you for doing a concrete implementation! It's wonderful how much more concrete one's own thoughts get when reviewing it :-)
    I have focused on the diff at https://github.com/apache/nifi/compare/master...PuspenduBanerjee:NIFI-ExtRegistry
    If that caused me to miss stuff, my apologies.

    Comment 1.
    We will have multiple repositories of extensions, published by multiple different owners. That being the case, the GAV model of ExtensionSpec as you propose, works very well, where the Group probably corresponds with the repository owner via a structured name. Rather than focus on Nar extensions, would it make sense to provide immediately for all kinds of extensions, so the "packaging" parameter of the spec (maybe "extensionType" instead?) could be nar, template, or any other sort of resource we want to make dynamically loadable? In other words, let's specify the super-class ExtensionSpec.

    Comment 2.
    It seems to me that the concept of a repository needs to be more fleshed out as a first-class entity, because we need to be able to do the following operations on one:

    1. register a trusted repository as okay for this NiFi site to draw from (via top-level URI, or perhaps implied from Group structured name)
    2. load hierarchy of extensionTypes and artifactIds available from a repository (just identifiers, without taking up a lot of memory or time)
    3. dynamically load only documentation of an artifact or group of artifacts, from a repository, for presentation in the GUI (so user can see whether a new extension is desirable) - related to NIFI-1601
    4. dynamically fully load ("resolve") an artifact or group of artifacts, from a repository
    5. for non-local repositories, resolving probably also involves copy to local cache, unless the resolve mechanism (such as Maven) already does so

    Now considering method #4 above, the loader/resolver, this clearly needs to be different for different extensionTypes. It isn't clear whether it needs to be different for different repository types, or whether different repository accessors can be layered over a single stream processor for the particular extensionType loader. I think that should work, because we probably want to reduce all artifact loads into "first copy from repo to local file, then load the file".

    Your two example ExtensionSpec's are different for file repository vs maven repository, i.e., sub-classed on repository type. Considering the above, I would suggest instead having:

    • RepositoryResolver, sub-classed for different repository types, that can do the five operations listed above
    • ExtensionSpec, sub-classed for different extensionTypes, and including repository-independent methods for #3 and #4, called after the RepositoryResolver obtains the stream in a repository-type-specific manner.

    If you agree with this, we can edit the proposal to include first-class Registry entities with the above actions.

    Comment 3.
    I am inclined to specify that repository URIs and structured names should explicitly include the extensionType at a specific location immediately below the top level of a repository. Does that make sense? E.g., MyFooBar repository would have a top level URI and/or structured name, immediately below which would be nar, template, etc, sub-repositories. It may not be necessary to do this, but it would structure the world in an easily understood manner. It also lets the RepositoryResolver know what ExtensionSpec to invoke.

    Comment 4.
    As far as I can tell, this RI does not address removing the pre-load of all "built-in" resources. Maybe that's okay; it is after all much easier to remove excess built-in resources themselves (via a single .pom file edit), rather than remove the logic that assumes built-ins are pre-loaded. Is that the intent? This implies bug NIFI-1318 will be dealt with separately, but we'll still want to fix it, or at least assure it does not affect dynamically loaded resources.

    Please confirm, however, that resources in other repositories will not be pre-loaded. I am somewhat confused about ExtensionManager:discoverSideloadeExtensions(), which seems to load all things in a known repository rather than discover what's in it.

    That's all I have for now. Again, thanks for progressing this.

    1. Upon re-reading this, I see that I am using "nar" as synonymous with "Processor resource", which is de facto true now, but already proposed above to be extended.  So we probably need both extensionType and packagingType as attributes of ExtensionSpec.

    2. Hi Matthew Foley!
      Thanks for reviewing.
      Let me address your valuable comments/concerns:
      Comment 1
      Packaging can be overridden by using constructor parameter packaging, even though we don't have support for loading non-nar packaging readily available. Still it makes perfect sense to rename that class to ExtensionSpec instead of NarExtensionSpec.
      To support all other packaging there will likely be a lot of modification in core classes. So can we please park that for now?

      Comment 2
      It's pretty easily achievable via:
      1. register a trusted repository
      1.1. for maven compliant repositories with Shrinkwrap resolver configuration :
      a. Maven.configureResolver().fromFile("/path/to/settings.xml").resolve("G:A:V").withTransitivity().asFile();
      b. Maven.configureResolver().withRemoteRepo("my-repository-id", "url://to/my/repository", "layout").resolve("G:A:V").withTransitivity().asFile();
      We can implement any of the above immediately.

      1.2. for maven non-compliant repositories it will be pretty implementation specific. Let's leave that for now.

      2. load hierarchy of extensionTypes and artifactIds available from a repository (just identifiers, without taking up a lot of memory or time) -- It will not be very straight forward as Definitions lives inside a jar. Not sure how many artifacts we will be caching and how frequently we will be updating our cache.

      3. May be, we can try to parse a pom for a NAR/Artifact which will talk about the capabilities that it can offer and display that in UI. No doubt, that will be a very attractive feature.

      4. I preferred to offload the responsibility of resolving to ExtensionSpec because an ExtensionSpec is highly cohesive with a particular type of repository. So, I would request you to think about it once more. And please feel comfortable to put a patch in if your time permits.

      5. Intentionally avoided that part for now, so that on restart NiFi will not find that Nar and delete working copy and I shall be able to load my one and only dummy extension :). For Maven Resolver, anyway it get stored in your local maven repository.

      Comment 3
      1. I am afraid that in that case we will be dictating the repository layout. In reality we would have little or no control over that.
      2. It does not conform to the Maven repository layout.

      Comment 4
      1. No it does not have any capability to remove pre-loaded artifacts and if we ever become able to do that successfully that will be great. So, for the time being can we just use the de-facto overriding of classloader?

      ExtensionManager#discoverSideloadeExtensions() loads only the new artifact that one is trying to side-load. It uses NarClassLoaders.getSideLoadedExtensionClassLoaders(), which returns a filtered collection of sideloaded NarClassLoaders using NarClassLoaders.MergeMapOperator<T, U>.

      Please keep brainstorming and let me know if I have missed to address anything.

      And again, thanks a lot for reviewing and helping to get it moving because sometimes it becomes too late to get a reply or incorporate a change that all of the effort gets spoiled.

       


      1. Puspendu Banerjee, thanks for your very fast response!  As you suggested, I'm working on some sample code to offer to you, but in the meantime:

        1. Park other forms of packaging - agreed.

        2.1 stick with just file and maven based example repositories for now - agreed.

        2.2 for file based repositories, we can do as yum does and require the file names to contain enough info that a directory listing provides the artifact Ids.  For maven this info is available through maven apis, isn't it?  Or could be, if it were included in the build.

        2.3 Being able to access specification/documentation without loading the artifact, seems to me is a required feature for non-built-in repositories to work.  This may require an enhancement of the nar format, as https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#nars does not seem to deal with the issue at all.  Alternatively, we can require that extension repositories provide parallel documentation repositories.

        2.4 "an ExtensionSpec is highly cohesive with a particular type of repository" - we have different opinions about this.  I think we would benefit from orthogonalizing extensions from repositories, and just say that a repository is a way to deliver a file resource, which is then consumed the same way any file resource is consumed.  There is no need to allow new repository types to define new resource encodings.  Let's keep control of that in NiFi.

        But I think perhaps I've made this sound more complicated than I intended.  It's actually intended to be a simplifying approach, by having only one encoding, in file form, for each resource type.  I'll get some sample code put together.

        2.5 agreed this goes away for the Maven repository, and is a non-issue for file repository.  Per 2.1, we can table this for now.

        3. I'm happy using Yum as guidance for file-based repositories and Maven as guidance for remote repositories.

        4. This is probably a misunderstanding on my part, so maybe we can talk about it offline.

        1. Matthew Foley ,Thanks for your passion to get it going.

          The only challenge with yum / rpm based repo is that more than one package can generate conflicting Shared Object [ read  Class File with Same FQDN] because they don't support the concept of Namespace. 

          The same can happen with maven based repo , but for any package at maven central or any other global maven repo goes through stringent process/check-points, which mitigates that risk.

          using m2 repo format we can be assured that if an user is not specifically modifying m2 settings (s)he should be good.

          Overall, it's good to see that we are coming towards a mutual consensus [ without help of paxos or zookeeper (smile) ]

          1. Ok, those are good reasons to question use of yum as an example file-based repo format, but I think we should support some sort of filesystem directory based repo that doesn't have to depend on maven.  And that directory structure has to support metadata such as artifact names and versions and signatures.  Yum seemed a convenient way to avoid reinventing the wheel.  Is there some other structured directory standard that would work?

  8. Matthew Foley, Joe Witt: I have created a sample registry repo [Maven layout] at https://github.com/PuspenduBanerjee/nifi-extension-registry .

    The intent there is to allow more nifi nars to get migrated there and get release via standard maven central repo, so that we can release a slim version of nifi itself.

    Now, as that project produces nars with a groupId starting with org.apache.nifi.extension, it will be more meaningful to migrate it under https://github.com/apache , so I need your help.

    So, need your guidance, vision, direction:

    1. Shall we keep it as a separate apache project?
    2. Shall we integrate it inside existing nifi tree?
    3. We don't need it right now..
    4. Any other idea that you may have.

     

    1. As you point out, we need an "official" Apache NiFi repository, where all the extension resources will have the same ownership by the NiFi PMC as the existing nars.  Therefore it may as well be integrated inside the existing nifi tree.  It will also present an example of how others can do additional repositories if they so choose.  If we also want to create a repository for 3rd-party resources not owned by the NiFi PMC, that's a different question.  I'm not sure that's needed.  There are issues of trust and security; as we discuss those, our position on 3rd-party repositories will probably clarify.

       

      1. I think starting it under the current NiFi tree will reduce some paperwork, but binaries will not be available at Maven Central until a release is published.

        The trust factor can be addressed using signature verification, e.g. PGP or X.509, both of which are already well established.

        1. Yes, we need signatures and signing certificates.  Are NARs currently signed?  I think, to be pluggable resources, they need to be signed separately from the NiFi release.

          And I agree we need an official repository from the Apache NiFi project, which would wait on official releases from the NiFi PMC but not necessarily on a release of NiFi itself.  At the same time, unofficial repos based on file directories or Maven pushes to the local m2 repo with a private structured name (not starting with org.apache.nifi) should be supported and work for testing.
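
To make the idea of signing a NAR separately from the release concrete, here is a minimal sketch of creating and checking a detached signature with the JDK's `java.security.Signature` API. It uses a plain RSA key pair as a stand-in for the PGP or X.509 mechanisms named above, and it is not how NiFi or the prototype handles signatures; the class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

/** Hypothetical sketch: detached signature over the bytes of an artifact. */
final class DetachedSignatureSketch {
    private static final String ALGORITHM = "SHA256withRSA";

    private DetachedSignatureSketch() {
    }

    /** Generates a throwaway RSA key pair; a real repository would use managed keys. */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Produces a signature that is stored next to the artifact, not inside it. */
    static byte[] sign(byte[] artifact, PrivateKey key) {
        try {
            Signature signer = Signature.getInstance(ALGORITHM);
            signer.initSign(key);
            signer.update(artifact);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true only if the signature matches the artifact bytes and the public key. */
    static boolean verify(byte[] artifact, byte[] signature, PublicKey key) {
        try {
            Signature verifier = Signature.getInstance(ALGORITHM);
            verifier.initVerify(key);
            verifier.update(artifact);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the signature lives outside the artifact, a NAR could in principle be signed and published on its own schedule, which is the property asked for above.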

  9. Looking at Puspendu Banerjee's very useful reference implementation in context, particularly how it is implemented under "nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-nar-utils/src.main.java.org.apache.nifi/nar/ext/", and package "org.apache.nifi.nar.ext", made me understand a basic disconnect between the way I had been thinking about external resources, and the way this RI thinks of them.  I request input on this issue:

    To me, resources are things:  processors, services, templates, or other entities with a purpose.  These purposes relate strongly to why we want to exchange these resources with each other through external repositories.  The NAR, on the other hand, is a packaging for a pile of classes.  It is indeed a very useful packaging for pluggable resources, since it provides class-loader independence for each NAR.  But two questions:

    1. Are NARs the only kind of package we want to be able to use as external/pluggable resources, at least for now?  NARs are good packages for processors and services, but are they good packages for templates, or other kinds of useful resources that we already know we want to exchange?
    2. Besides packaging, what metadata do we expect a repository to provide about the resources it offers?  Is there value for users who use and manage external resources, in categorizing the resources as to what type of resource they are?

    It seems to me that packaging should not constrain our thinking about useful resources, and that we do need metadata about those resources, including their type. (Metadata clearly supports resource discovery, among other uses.) So I think the class structure should start with the various types of resource, and view NAR as the packaging of some types of resources; rather than start with the packaging (NAR) at the top of the class hierarchy.  But if there is established thinking in the NiFi community on this issue, I will conform to it.  Thanks.
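
As a rough illustration of the ordering argued for here, the sketch below puts the resource type first and treats packaging as one attribute of a resource. Every name in it (`ResourceType`, `Packaging`, `ResourceDescriptor`, and the fields) is hypothetical and does not come from the reference implementation.

```java
import java.util.Objects;

/** Hypothetical resource kinds, taken from the list in the comment above. */
enum ResourceType { PROCESSOR, CONTROLLER_SERVICE, TEMPLATE, OTHER }

/** Hypothetical packaging formats; NAR is one option, not the root of the model. */
enum Packaging { NAR, TEMPLATE_FILE }

/** Repository metadata for one resource; all field names are assumptions. */
final class ResourceDescriptor {
    final String name;
    final String version;
    final ResourceType type;
    final Packaging packaging;

    ResourceDescriptor(String name, String version, ResourceType type, Packaging packaging) {
        this.name = Objects.requireNonNull(name, "name");
        this.version = Objects.requireNonNull(version, "version");
        this.type = Objects.requireNonNull(type, "type");
        this.packaging = Objects.requireNonNull(packaging, "packaging");
    }

    /** True when the resource is delivered as a NAR. */
    boolean isNarPackaged() {
        return packaging == Packaging.NAR;
    }
}
```

With this shape, a repository could list or filter resources by `type` for discovery without knowing anything about how each one is packaged.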

  10. Puspendu Banerjee and I have been exchanging ideas about design, and experimenting with stub implementations.  I've added Use Cases and a Design Proposal above (sections 5 and 6), but it may be modified tomorrow.  We'll send an email to 'dev' when it is ready for review.

  11. The Use Cases and Design Proposal (above, in the main body of this wiki page) are ready for review, and the prototype code has been refactored to match (which simplified it a lot).

    Please see https://github.com/PuspenduBanerjee/nifi/tree/NIFI-ExtRegistry for the prototype code, which is a solid beginning for a complete implementation.

    Both the design and implementation are joint work of Puspendu Banerjee and Matthew Foley.  Thanks in advance for review and comments.