Develop a mechanism for users to view, create and modify variable/value pairs within NiFi.
Ensure those variables can be referenced by components within a process group and secured within the context of that group.
Background and strategic fit
NiFi flow configuration allows users to refer to environment and system variables, flow file attributes, and custom properties using expression language (EL) in certain component properties. This gives users some flexibility to populate property values dynamically, which not only simplifies configuration through value reuse but also improves the portability of templates to new environments. However, managing these variables is challenging, especially in the case of custom properties: they cannot be updated at NiFi runtime, and in a clustered configuration admins need additional coordination to ensure that the custom properties files exist on each node. Security may also be an issue when using custom properties files, since they may contain sensitive values, such as passwords, in clear text.
Also, the UI does not provide a means for users to see which variables are available for configuration. Knowledge of available variables has to be obtained by reviewing server configurations or from a NiFi administrator. Users are also unable to restrict access to any variables (all are considered globally available).
With this mechanism we would look to improve the creation, management and security of custom variables and their use within NiFi components and templates. This will aid in simplifying the use and configuration of flows with variables and help support the value proposition of templates, by improving their portability and supporting the move towards configuration management.
A Variable Registry could be a component that stores variable information, including a description of the variable (type, sensitive, etc.) and its value. A registry can provide access to variables in one of two scopes, which we could define as node and cluster. Node-scoped variables are key/value pairs that will be available only on a specific host/node. These variables can include existing environment (OS) variables, Java system properties, and file-based custom variables. Initially, node-scoped variables would be captured at NiFi startup and treated as static and immutable during runtime (only visible in the UI). However, this can expand to allow node-scoped variables to be configurable at runtime (except for those that are set at the OS environment or Java system level). Cluster-scoped variables are key/value pairs that will be available throughout the cluster. These variables can be created and modified at runtime.
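As a rough illustration of the two scopes, the following Python sketch models a variable record and a startup capture step. The names `Scope`, `Variable`, `capture_node_variables`, `update_value` and the `mutable` flag are all hypothetical and not part of NiFi; the sketch only assumes that variables captured at startup are read-only at runtime.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from enum import Enum


class Scope(Enum):
    NODE = "node"        # available only on one host/node
    CLUSTER = "cluster"  # available throughout the cluster


@dataclass(frozen=True)
class Variable:
    name: str
    value: str
    scope: Scope
    sensitive: bool = False
    description: str = ""
    mutable: bool = True  # hypothetical flag: False for values captured at startup


def capture_node_variables(pairs: dict[str, str]) -> dict[str, Variable]:
    """Capture environment-style pairs as immutable, node-scoped variables."""
    return {
        name: Variable(name, value, Scope.NODE, mutable=False)
        for name, value in pairs.items()
    }


def update_value(registry: dict[str, Variable], name: str, value: str) -> None:
    """Replace a variable's value, rejecting variables captured at startup."""
    current = registry[name]
    if not current.mutable:
        raise PermissionError(f"{name} is read-only at runtime")
    registry[name] = replace(current, value=value)
```

In this sketch a cluster-scoped variable added at runtime can be updated through `update_value`, while anything produced by `capture_node_variables` raises `PermissionError`.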
A single registry instance could be an attribute of a process group (for processors) and of the controller (for reporting tasks and controller services). Being an entity at these levels could help support communication of variable registry configuration within a cluster (via flow.xml) and also allow the registry to leverage multi-tenancy functionality for access (see Authorization). Each process group would contain a registry that can override a parent process group’s registry; therefore, a variable referenced by a component can be resolved by searching the registry of that component’s process group first. If the variable is not found there, the search could continue up the hierarchy to the root process group’s variable registry.
The variable registry at the root level process group could be preloaded with system variables such that they can maintain their global availability while custom variables can be configured within the root group or a child/descendant process group.
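The lookup order described above (nearest process group first, then each ancestor up to the root) can be sketched as follows. `ProcessGroup`, `resolve` and `make_root` are hypothetical names used only for illustration, and preloading the root from `os.environ` is an assumption about what "system variables" would include.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional


class ProcessGroup:
    """Hypothetical stand-in for a process group that owns one variable registry."""

    def __init__(self, name: str, parent: Optional["ProcessGroup"] = None,
                 variables: Optional[Mapping[str, str]] = None) -> None:
        self.name = name
        self.parent = parent
        self.variables = dict(variables or {})

    def resolve(self, key: str) -> Optional[str]:
        """Return the nearest definition of key, walking up towards the root."""
        group: Optional[ProcessGroup] = self
        while group is not None:
            if key in group.variables:
                return group.variables[key]
            group = group.parent
        return None  # not defined anywhere in the hierarchy


def make_root(system_variables: Optional[Mapping[str, str]] = None) -> ProcessGroup:
    """Create a root group preloaded with system variables (defaults to os.environ)."""
    source = os.environ if system_variables is None else system_variables
    return ProcessGroup("root", variables=source)
```

A child group that defines `timeout` shadows the root's `timeout` for its own components and descendants, while variables defined only at the root remain visible everywhere.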
The ability for users to view/modify registries would be secured via policy. To be consistent with the multi-tenancy perspective for access control, the access policy for the variable registry can be secured at the process group level, where the system could require that users have the ability to modify a process group as well as modify a variable registry in order to make registry changes.
Users should consider the implications of using variables of a sensitive nature within certain processors, especially those with Restricted Access or logging capabilities.
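A minimal sketch of the combined check, assuming policies are represented as a set of granted actions per process group. The action names `MODIFY_GROUP` and `MODIFY_REGISTRY` are placeholders, not real NiFi policy identifiers.

```python
from __future__ import annotations

# Hypothetical action names; this document does not define real policy identifiers.
MODIFY_GROUP = "modify-process-group"
MODIFY_REGISTRY = "modify-variable-registry"


def can_modify_registry(user_policies: dict[str, set[str]], group_id: str) -> bool:
    """Allow registry changes only when both required actions are granted on the group."""
    granted = user_policies.get(group_id, set())
    return {MODIFY_GROUP, MODIFY_REGISTRY} <= granted
```

Requiring both grants means that a user who may edit a process group but has not been given access to its registry still cannot change variable values.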
Components could obtain a snapshot of the variables within their scope from the registry of their process group during the OnScheduled step. If variables are updated during flow execution, the components that reference a “dirty” variable would need to be restarted so that they receive the most recent values (similar to how connected components are restarted on controller service updates). This keeps downtime within the flow to a minimum while ensuring that the latest version of each variable is available.
Variable registry instances would be exported as entities in the flow schema according to their hierarchical relationships (to the controller/process groups). Only custom variables should be included in the actual export, with values included only for non-sensitive variables (sensitive variables will not have values available). On import of a template, the system could notify users that variables are being imported and provide access to review/edit values if the user is authorized to access the registry. If the system already has variables of the same name with populated values, it could prompt the user to either change the existing value or update it with the value included in the template.
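The snapshot and restart selection could look roughly like the sketch below. It assumes component properties are plain strings and uses a deliberately simplified pattern for `${name}` references; real expression language parsing (quoted names, nested expressions, escapes) is more involved. All function names are hypothetical.

```python
from __future__ import annotations

import re

# Simplification: captures the subject of ${name} or ${name:function()} only.
REFERENCE = re.compile(r"\$\{([^}:]+)")


def referenced_variables(properties: dict[str, str]) -> set[str]:
    """Collect the variable names referenced by a component's property values."""
    names: set[str] = set()
    for value in properties.values():
        names.update(match.strip() for match in REFERENCE.findall(value))
    return names


def snapshot(registry: dict[str, str], properties: dict[str, str]) -> dict[str, str]:
    """Copy the referenced values, as a component might when it is scheduled."""
    return {
        name: registry[name]
        for name in referenced_variables(properties)
        if name in registry
    }


def components_to_restart(components: dict[str, dict[str, str]],
                          dirty: set[str]) -> list[str]:
    """Return ids of components that reference at least one changed variable."""
    return sorted(
        component_id
        for component_id, properties in components.items()
        if referenced_variables(properties) & dirty
    )
```

Only the components returned by `components_to_restart` would need a restart; components that do not reference a changed variable keep running on their existing snapshot.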
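The export rule (custom variables only, with sensitive values withheld) can be sketched as below. The dictionary keys `custom`, `sensitive`, `name` and `value` are an assumed in-memory representation chosen for illustration, not the flow schema itself.

```python
from __future__ import annotations


def export_variables(variables: list[dict]) -> list[dict]:
    """Build export entries: custom variables only, no values for sensitive ones."""
    exported = []
    for var in variables:
        if not var.get("custom", False):
            continue  # system and environment variables stay out of the export
        entry = {"name": var["name"], "sensitive": var.get("sensitive", False)}
        if not entry["sensitive"]:
            entry["value"] = var["value"]
        exported.append(entry)
    return exported
```

A sensitive variable still appears in the export by name, so that the importing user can see that a value needs to be supplied.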
Security of Properties
Values of sensitive variables would not be visible through the UI nor exportable in a template. These values would be encrypted on disk within the flow, similarly to other sensitive properties.
An important assumption is that variables in the variable registry would only be accessible through expression language initially. Although use cases exist to potentially support other attributes of processors, such as concurrent tasks, this could be considered out of scope for this initial effort.
| 1 | Runtime update of custom property values | As a user I want to ensure I have the latest variable value available for my flow with limited downtime, so that I can ensure that I'm processing with up-to-date information | NIFI-2767 |
User interaction and design
Users will need a place to access the variable registry in NiFi. The main access point will likely be available as an option from the current global menu accessible via an icon in the top right corner of the application. The registry should also be accessible from other areas of the application, such as from a process group's context menu or configuration dialog, where the registry would present a subset of variables scoped to that specific process group.
From within the variable registry UI, users should be able to manage variables that are being used in the NiFi instance or across instances if in a cluster. The ability to perform certain actions will depend on any access policies that may have been set. Users should be able to:
- view a list of all variables
- view the components (by name or ID depending on access policies) that reference each variable
- view variable metadata like variable name, variable type, variable value, a description of the variable, created on/by, and last updated on/by
- filter the view to only show variables available or currently referenced by a particular component or components
- filter the view by other relevant parameters (TBD)
- create new variables, apply scope from which the variable will be available, and provide associated metadata
- import other variables from a property file
- edit and delete existing variables
It will be especially important to effectively communicate with the user when importing, editing, and deleting variables. For example, consider these scenarios: (1.) a variable is imported that has the same name as an existing variable, or (2.) a variable's value referenced by multiple components is changed. NiFi should be able to check for these conditions, present a meaningful summary, and provide guidance and/or follow-on action to resolve the conflict. Additionally, NiFi should provide relevant actions (e.g., restart) for a user when modifying variables that affect components that are currently scheduled to run.
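For the first scenario, the summary presented to the user could be built by classifying each incoming variable against the existing registry. This is a minimal sketch; `plan_import` and its three categories are hypothetical and assume that both sides are simple name/value mappings.

```python
from __future__ import annotations


def plan_import(existing: dict[str, str],
                incoming: dict[str, str]) -> dict[str, list[str]]:
    """Classify incoming variables as new, unchanged, or conflicting by name."""
    plan: dict[str, list[str]] = {"new": [], "unchanged": [], "conflict": []}
    for name, value in sorted(incoming.items()):
        if name not in existing:
            plan["new"].append(name)
        elif existing[name] == value:
            plan["unchanged"].append(name)
        else:
            plan["conflict"].append(name)  # same name, different value
    return plan
```

Only the names under `conflict` would need a decision from the user; the other two groups could be applied or skipped without a prompt.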
Current configuration dialogs will need to accommodate new functionality for users to see and choose available variables. Depending on a user's experience level and knowledge of the data flow, a tiered experience for variable selection would be ideal. For example, a clearly labeled, clickable element to see a list of and choose existing variables would help a less experienced user, whereas the ability to simply type into a form field with autocomplete assistance would help an experienced user work more efficiently. A way to quickly edit variables from configuration dialogs will likely be desired as well. This way, a user would not need to interrupt their workflow by accessing the full variable registry UI to make only a single change.
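The autocomplete assistance could be backed by something as simple as a case-insensitive prefix match over the variable names visible to the component. The function `suggest` and its `limit` parameter are illustrative assumptions, not a proposed API.

```python
from __future__ import annotations


def suggest(prefix: str, names: list[str], limit: int = 10) -> list[str]:
    """Return up to `limit` variable names starting with `prefix`, ignoring case."""
    needle = prefix.casefold()
    matches = [name for name in names if name.casefold().startswith(needle)]
    return sorted(matches, key=str.casefold)[:limit]
```

An empty prefix returns the first `limit` names in alphabetical order, which could also serve the clickable list for less experienced users.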
Below is a list of questions to be addressed as a result of this requirements document: