Variable Registry

Target release
Epic
Document status	DRAFT
Document owner	Joe Witt, Yolanda M. Davis
Designer
Developers
QA

Goals

Develop a mechanism for users to view, create and modify variable/value pairs within NiFi.
Ensure those variables can be referenced by components within a process group and secured within the context of that group.

Background and strategic fit

NiFi flow configuration allows users to refer to environment variables, system variables, flow file attributes and custom properties using expression language (EL) in certain component properties. This gives users some flexibility to populate property values dynamically, which not only simplifies configuration through value reuse but also improves the portability of templates to new environments. However, managing these variables is challenging, especially in the case of custom properties: they cannot be updated at NiFi runtime, and in a clustered configuration admins must coordinate to ensure that the custom properties files exist on each node. Security may also be an issue when using custom properties files, since they may contain sensitive values, such as passwords, in clear text.

Also, the UI does not provide a means for users to see which variables are available for configuration. Knowledge of available variables has to be obtained by reviewing server configurations or from a NiFi administrator. Users are also unable to restrict access to any variables (all are considered globally available).

With this mechanism we would look to improve the creation, management and security of custom variables and their use within NiFi components and templates. This would simplify the use and configuration of flows with variables and help support the value proposition of templates by improving their portability and supporting the move towards configuration management.

Entity-Relationship Design

A Variable Registry could be a component that stores variable information, including a description of the variable (type, sensitivity, etc.) and its value. A registry can provide access to variables with one of two scopes, which we could define as node and cluster. Node scoped variables are key/value pairs that are available only on a specific host/node. These variables can include existing environment (OS) variables, Java system properties and file-based custom variables. Initially, node scoped variables would be captured at NiFi startup and treated as static and immutable during runtime (only visible in the UI). This could later expand to allow node scoped variables to be configurable at runtime (except for those set at the OS environment or Java system level). Cluster scoped variables are key/value pairs that are available throughout the cluster. These variables can be created and modified at runtime.

A single registry instance could be an attribute of a process group (for processors) and of the controller (for reporting tasks and controller services). Being an entity at these levels could help support communication of variable registry configuration within a cluster (via flow.xml) and also allow the registry to leverage multi-tenancy functionality for access (see Authorization). Each process group would contain a registry which can override a parent process group’s registry; a variable referenced by a component can therefore be resolved by searching the registry of that component’s process group first. If needed, the search could continue up the hierarchy to the root process group’s variable registry to discover referenced variables.
The variable registry at the root level process group could be preloaded with system variables such that they can maintain their global availability while custom variables can be configured within the root group or a child/descendant process group.
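The lookup order described above (the component's own process group first, then each ancestor up to the root) could be sketched as follows. This is a minimal illustration only; the class name ScopedVariableRegistry and its methods are hypothetical and are not part of any existing NiFi API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: one registry per process group, chained to its parent group's registry.
final class ScopedVariableRegistry {
    private final ScopedVariableRegistry parent; // null for the root process group
    private final Map<String, String> variables = new HashMap<>();

    ScopedVariableRegistry(ScopedVariableRegistry parent) {
        this.parent = parent;
    }

    void set(String name, String value) {
        variables.put(name, value);
    }

    // Search this group's registry first, then walk up the hierarchy toward the root.
    Optional<String> resolve(String name) {
        for (ScopedVariableRegistry r = this; r != null; r = r.parent) {
            String value = r.variables.get(name);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

With this shape, a variable defined in a child group shadows a variable of the same name in the root group, while variables defined only at the root (for example preloaded system variables) remain visible to every descendant.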

Authorization

The ability for users to view/modify registries would be secured via policy. To be consistent with the multi-tenancy approach to access control, the access policy for the variable registry can be secured at the process group level, where the system could require that users have permission to modify the process group as well as the variable registry in order to make registry changes.

Users should consider the implications of using variables of a sensitive nature within certain processors, especially those with Restricted Access or logging capabilities.
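The combined requirement described in this section (permission to modify the process group and permission to modify its variable registry) could be expressed as a simple check. The enum and class below are hypothetical names chosen for illustration and do not correspond to NiFi's actual authorizer interfaces.

```java
import java.util.Set;

// Hypothetical permissions; real policy names would come from the authorization design.
enum RegistryPermission { MODIFY_PROCESS_GROUP, MODIFY_VARIABLE_REGISTRY }

final class RegistryAccessCheck {
    private RegistryAccessCheck() {}

    // A registry change is allowed only when both permissions have been granted.
    static boolean canModifyRegistry(Set<RegistryPermission> granted) {
        return granted.contains(RegistryPermission.MODIFY_PROCESS_GROUP)
                && granted.contains(RegistryPermission.MODIFY_VARIABLE_REGISTRY);
    }
}
```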

Lifecycle Management

Components could obtain a snapshot of the variables within their scope from the registry of their process group during the OnScheduled step. If variables are updated during flow execution, the components that hold a reference to a “dirty” variable would need to be restarted so that they receive the most recent version of the variables (similar to controller service updates and how connected components are restarted). This would keep downtime within the flow to a minimum while ensuring that the latest version of the variables is available.
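One way to picture the snapshot and "dirty" variable handling described above is the sketch below. It assumes the registry knows which variable names each component references; the class VariableSnapshotTracker and its methods are hypothetical, and the actual restart of components is left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of snapshotting variables and finding components affected by an update.
final class VariableSnapshotTracker {
    private final Map<String, String> current = new HashMap<>();
    // Component id -> names of the variables that component references.
    private final Map<String, Set<String>> references = new HashMap<>();

    void register(String componentId, Set<String> referencedVariables) {
        references.put(componentId, new HashSet<>(referencedVariables));
    }

    // Read-only copy that a component would receive when it is scheduled.
    Map<String, String> snapshot() {
        return Collections.unmodifiableMap(new HashMap<>(current));
    }

    // Applies an update and returns the ids of components that reference the changed variable.
    Set<String> update(String name, String value) {
        String previous = current.put(name, value);
        Set<String> toRestart = new TreeSet<>();
        if (Objects.equals(previous, value)) {
            return toRestart; // value unchanged, nothing is dirty
        }
        for (Map.Entry<String, Set<String>> entry : references.entrySet()) {
            if (entry.getValue().contains(name)) {
                toRestart.add(entry.getKey());
            }
        }
        return toRestart;
    }
}
```

Because a snapshot is a copy, a running component keeps seeing the values it was scheduled with until it is restarted, which is why only the components returned by update would need a restart.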

Template Management

Variable registry instances would be exported as entities in the flow schema according to their hierarchical relationships (to the controller/process groups). Only custom variables should be included in the export, and only non-sensitive variables would carry their values (values of sensitive variables will not be available). On import of a template, the system could notify users that variables are being imported and provide access to review/edit values if the user is authorized to access the registry. If the system already has variables of the same name with populated values, it could prompt the user to either edit the existing value or replace it with the value included in the template.
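The export rule described above (custom variables only, with sensitive values withheld) could look roughly like the following. The types ExportableVariable and TemplateVariableExporter are hypothetical, and representing a withheld value as null is an assumption made for this sketch rather than a decision of this document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variable description used only for this sketch.
final class ExportableVariable {
    final String name;
    final String value;      // null when the value is withheld
    final boolean sensitive;
    final boolean custom;    // false for environment/system variables

    ExportableVariable(String name, String value, boolean sensitive, boolean custom) {
        this.name = name;
        this.value = value;
        this.sensitive = sensitive;
        this.custom = custom;
    }
}

final class TemplateVariableExporter {
    private TemplateVariableExporter() {}

    // Keeps custom variables only and drops the values of sensitive ones.
    static List<ExportableVariable> export(List<ExportableVariable> all) {
        List<ExportableVariable> exported = new ArrayList<>();
        for (ExportableVariable v : all) {
            if (!v.custom) {
                continue; // environment and system variables stay out of the template
            }
            String value = v.sensitive ? null : v.value;
            exported.add(new ExportableVariable(v.name, value, v.sensitive, true));
        }
        return exported;
    }
}
```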

Security of Properties

Values of sensitive variables would not be visible through the UI, nor would they be exportable in a template. These values would be encrypted on disk within the flow, similarly to other sensitive properties.

Assumptions

An important assumption is that variables in the variable registry would initially be accessible only through expression language. Although use cases exist for supporting other attributes of processors, such as concurrent tasks, these could be considered out of scope for this initial effort.

Requirements

#	Title	User Story	Importance	Notes
1	Runtime update of custom property values	As a user I want to ensure I have the latest variable value available for my flow with limited downtime, so that I can ensure that I'm processing with up-to-date information	NIFI-2767	This JIRA focused specifically on periodic reloading of file-based properties. However, some considerations discussed on the JIRA may influence the design
2

User interaction and design

For users who have authorization, registries can be viewed/modified by selecting a process group (root or child level) or the controller level and viewing its configuration. A tab for the registry would be available, and upon selection the system could list variables with their scope (node or cluster) and their values. Initially node scoped variables could be read-only, whereas cluster scoped variables would be editable. Future iterations could allow node scoped variables currently managed in properties files to be created here as well. To create a variable, users could provide key/value information, choose a scope (initially cluster), indicate whether it is sensitive, and provide a description of the variable.

The ability to restart running processors when changes are made could also be available from the variable registry view (see Lifecycle Management). A convenience feature could also be included to support uploading property files via the UI to import new variables. This can provide a migration option for the custom property file support that exists today.
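The property file upload mentioned above could, under the assumption that uploaded files use the standard Java properties format, be parsed with java.util.Properties. The class PropertyFileImporter is a hypothetical name, and how the resulting pairs are stored in a registry is left open.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical sketch: turn an uploaded properties file into variable name/value pairs.
final class PropertyFileImporter {
    private PropertyFileImporter() {}

    static Map<String, String> importVariables(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source); // standard .properties parsing (comments, key=value lines)
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> variables = new TreeMap<>(); // sorted by name for stable listing
        for (String name : props.stringPropertyNames()) {
            variables.put(name, props.getProperty(name));
        }
        return variables;
    }
}
```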

Variables would need to be referenced within component properties that support expression language statements. To show users the variables that are within the scope of a particular processor, a convenience feature for code completion could be provided to support variable lookup.

Questions

Below is a list of questions to be addressed as a result of this requirements document:

Question	Outcome
