Summary

Apache NiFi is a large project with numerous direct and transitive dependencies. Maintaining security is essential and providing compatibility is an important element of successful releases. Understanding the usage profile and component impact of dependencies is key to managing and upgrading libraries throughout the project. This document outlines a dependency management strategy and recommends steps to follow when preparing an upgrade or reviewing upgrade requests.

Background

Apache Maven provides an Introduction to the Dependency Mechanism that covers the basic structure, scope, and approach to managing project libraries.

Defining the correct set of dependencies and specifying the appropriate scope is just as important as how the code uses external libraries.

All dependencies must adhere to the Apache Software Foundation License Policy for inclusion.

Beyond legal requirements, understanding the complete set of transitive dependencies is essential when introducing or upgrading libraries. The Apache Maven Dependency Plugin has a number of helpful goals for enumerating and analyzing dependencies.

Project Structure

Apache NiFi uses a monorepo strategy for managing project resources. This approach enables streamlined dependency management for numerous components, providing version guarantees and standardization where necessary.

As a large project with many external integrations, Apache NiFi depends on hundreds of libraries, each with various transitive dependencies. Runtime ClassLoader isolation allows different extension components to depend on different versions of the same library, which brings both flexibility and complexity to the dependency management process. Knowing the scope of library usage is important when evaluating whether to upgrade, enforce, or override dependency versions.

Project Modules

Note where a module sits in the subproject hierarchy when evaluating dependency changes, because its position determines which parent configurations it inherits.

Managed Dependencies

The Apache Maven dependencyManagement element enables parent projects to set dependency versions for child projects. Custom project properties enable a single version to be set and referenced. Many libraries include a shared set of transitive or sibling dependencies, which can be managed through a single Bill of Materials dependency. Selective use of these features provides a consistent and maintainable strategy.

Project properties for version numbers should be used for ease of reference and avoiding duplication.

Bill of Materials dependencies should be imported whenever there is more than one supporting library for a particular dependency. Bill of Materials dependencies also ensure consistency for direct and transitive dependencies, which helps avoid runtime errors due to mismatched minor versions.
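As a minimal sketch of these two recommendations, the fragment below combines a version property with a Bill of Materials import. The property name and version number are illustrative assumptions, not values taken from the Apache NiFi build; the Jackson BOM is used only as a familiar example of a library family with several sibling modules.

```xml
<!-- Sketch of a parent pom.xml fragment; the property name and version are illustrative -->
<properties>
    <!-- Single property referenced wherever the version is needed -->
    <jackson.bom.version>2.17.2</jackson.bom.version>
</properties>

<dependencyManagement>
    <dependencies>
        <!-- Importing the BOM aligns all modules of the library family on one version -->
        <dependency>
            <groupId>com.fasterxml.jackson</groupId>
            <artifactId>jackson-bom</artifactId>
            <version>${jackson.bom.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```

With a configuration along these lines, child modules can declare individual modules from the family without a version element, and transitive references to the same modules resolve to the managed version.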

Banned Dependencies

The Apache Maven Enforcer Plugin is an important part of dependency standardization as it supports disallowing selected artifacts and versions from project modules. The Banned Dependencies rule should be used to ensure that specified libraries will not be included through transitive references. This is an important part of maintaining project security as it guarantees that a direct dependency reference cannot introduce a new transitive dependency marked as banned.
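A Banned Dependencies rule could look roughly like the following sketch. The execution identifier and the excluded artifacts are assumptions chosen for illustration and do not reflect the actual list maintained in the Apache NiFi build.

```xml
<!-- Sketch of an Enforcer Plugin configuration; the banned artifacts are illustrative -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-enforcer-plugin</artifactId>
    <executions>
        <execution>
            <id>ban-dependencies</id>
            <goals>
                <goal>enforce</goal>
            </goals>
            <configuration>
                <rules>
                    <bannedDependencies>
                        <excludes>
                            <!-- Patterns follow groupId:artifactId[:version] -->
                            <exclude>log4j:log4j</exclude>
                            <exclude>commons-logging:commons-logging</exclude>
                        </excludes>
                    </bannedDependencies>
                </rules>
            </configuration>
        </execution>
    </executions>
</plugin>
```

Because the rule also inspects transitive dependencies by default, a build using this configuration fails when any module resolves one of the listed artifacts, whether directly or through another library.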

Root Project Configuration

The Apache NiFi root Maven configuration uses shared dependency management concepts to set versions across the repository. Changing versions at this level has a broad impact for framework features, subprojects, and extension components. Promoting shared dependency management to the root Maven configuration should be considered when multiple subprojects share a dependency.

The root Maven configuration sets both the default version and scope for managed dependencies. In some cases, it may be necessary to override the scope of certain dependencies, so it is important to set values that apply to the largest number of modules.

In addition to managing dependency versions, the root Maven configuration also defines a set of default dependencies for all child modules. These dependencies should be limited to the test scope to avoid runtime impacts. Having JUnit and Mockito libraries defined in the list of default dependencies also avoids the need to declare them in child projects.
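The default dependency list might be sketched as follows. The specific artifacts shown are assumptions for illustration, and the fragment assumes their versions are already set through dependencyManagement or an imported Bill of Materials.

```xml
<!-- Sketch of default dependencies inherited by all child modules -->
<!-- Versions are assumed to be managed elsewhere in the root configuration -->
<dependencies>
    <dependency>
        <groupId>org.junit.jupiter</groupId>
        <artifactId>junit-jupiter-api</artifactId>
        <!-- Test scope keeps the library off the runtime classpath -->
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.mockito</groupId>
        <artifactId>mockito-core</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>
```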

Bundle Project Configuration

Apache NiFi extension components follow standard structural conventions for Maven modules. A parent bundle project defines several child projects, with separate modules for interfaces, implementations, and NAR packages.

The bundle project provides a natural place for defining common versions of shared libraries. Rather than defining the same version number or even using a shared version property across child modules, the dependencyManagement element should be used in the bundle project to set consistent versions for child projects. The bundle project is the best place to define shared dependency version information when the usage of the dependency is limited to a particular extension.
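The following sketch shows the shape of a bundle project that manages a library version for its children. Every name here is hypothetical, including the bundle, its modules, and the `org.example:example-client` library.

```xml
<!-- Sketch of a hypothetical bundle pom.xml; all identifiers are placeholders -->
<artifactId>nifi-example-bundle</artifactId>
<packaging>pom</packaging>

<modules>
    <module>nifi-example-processors</module>
    <module>nifi-example-nar</module>
</modules>

<dependencyManagement>
    <dependencies>
        <!-- Version is set once here; child modules omit the version element -->
        <dependency>
            <groupId>org.example</groupId>
            <artifactId>example-client</artifactId>
            <version>1.2.3</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```

Keeping the version in the bundle project means an upgrade touches one file and its impact stays within that extension.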

Upgrade Process

Upgrading dependencies is essential for maintaining security and compatibility. The specific level of effort varies from library to library, but there are several steps that should be followed for all dependency upgrades.

Standard Steps

  1. Evaluate project versioning strategy
  2. Review project release notes
  3. Determine level of changes
  4. Review direct dependency references
  5. Review transitive dependency references
  6. Run local build
  7. Run GitHub automated builds in forked repository

Evaluate Project Versioning Strategy

Projects follow various versioning strategies, which often complicates the upgrade process. Understanding project versioning strategy is important when evaluating changes.

Many projects adhere to Semantic Versioning for project releases. Semantic Versioning aims to provide a common approach to version numbering in a way that communicates the type of changes included. Major versions can include some number of breaking changes, whereas minor and patch versions should never include breaking changes.

Not all projects follow Semantic Versioning, and even those that do can introduce unintended breaking changes.

Having a basic understanding of what to expect in various version upgrades helps to evaluate declared and undeclared changes.

Review Project Release Notes

Release notes are an important means of communicating changes. Noting breaking changes is important when reviewing release notes, as those changes may or may not impact current use cases.

For projects that follow Semantic Versioning, reviewing deprecated features is an important part of the upgrade process. An upgrade between minor versions may not break existing code, but it may introduce deprecations that should be addressed.

Determine Level of Changes

Release notes may not always provide sufficient detail to understand the scope of changes in a particular version. For projects that do not follow Semantic Versioning, it may be necessary to review commit logs to get a general understanding of what changed from version to version.

Determining whether a version upgrade is limited to bug fixes, or whether it introduces breaking changes and feature deprecation, is important before moving forward. Limited bug fixes and improvements should not require direct code changes, whereas more significant changes may involve both test and runtime adjustments.

Review Direct Dependency References

Reviewing direct dependency references can be done through a simple search or using the Maven Dependency plugin. This review should highlight whether the upgrade impact is limited to a particular extension component, or whether it applies to multiple modules across the repository.

Direct dependency references that are limited to extension components help focus the amount of testing necessary. When multiple subprojects or components use the same dependency, more runtime testing may be necessary to ensure compatibility after upgrading.

Review Transitive Dependency References

Understanding the scope of transitive dependency usage can be difficult because it is not identified explicitly in Maven module configurations. The Maven Dependency plugin includes goals to analyze dependencies and produce usage reports. For upgrades that involve major version changes, running and reviewing automated reports can help evaluate the scope of impacted modules.
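One way to produce such a report during a build is to bind the plugin's `analyze-only` goal to an execution, as in the sketch below. The execution identifier is an assumption, and this fragment is not presented as part of the Apache NiFi build configuration.

```xml
<!-- Sketch of a Dependency Plugin execution that reports dependency usage -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-dependency-plugin</artifactId>
    <executions>
        <execution>
            <id>analyze-dependencies</id>
            <phase>verify</phase>
            <goals>
                <goal>analyze-only</goal>
            </goals>
            <configuration>
                <!-- Report findings as warnings without failing the build -->
                <failOnWarning>false</failOnWarning>
            </configuration>
        </execution>
    </executions>
</plugin>
```

The report lists dependencies that are used but not declared, which indicates code relying on a transitive reference, along with dependencies that are declared but unused.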

Some components include transitive dependency exclusions or overrides that should be evaluated during the upgrade process. Transitive version overrides require very careful evaluation because the change is outside of the scope of the dependency's expected configuration. This may not be a problem for minor changes or patch versions, but changes to transitive dependency overrides should include runtime testing to avoid unexpected errors.
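For reference, an exclusion and a version override typically take the following form. The library coordinates are hypothetical placeholders, and the versions are illustrative.

```xml
<!-- Sketch of a module pom.xml fragment; coordinates and versions are placeholders -->
<dependencyManagement>
    <dependencies>
        <!-- Override: forces a transitive library to a version its parent did not declare -->
        <dependency>
            <groupId>org.example</groupId>
            <artifactId>example-codec</artifactId>
            <version>2.0.1</version>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.example</groupId>
        <artifactId>example-client</artifactId>
        <version>1.2.3</version>
        <exclusions>
            <!-- Exclusion: removes a transitive library from this dependency's tree -->
            <exclusion>
                <groupId>commons-logging</groupId>
                <artifactId>commons-logging</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
```

When the direct dependency is upgraded, both entries should be checked again: the excluded library may no longer be referenced, and the overridden version may be older than the one the new release expects.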

Run Local Build

Running a local build is the first step to determining the real impact of a dependency upgrade. The default build process runs unit tests for the majority of subprojects and extensions, but does not include every module in the repository.

The Maven Profiles section of the Apache NiFi Developer Guide lists optional profiles that can be specified for complete evaluation.

Differences in operating system and Java version influence the behavior of certain tests, so a successful local build is not a complete guarantee of compatibility.

Run GitHub Automated Builds in Forked Repository

GitHub Actions provides automated workflow execution for project builds across multiple platforms. The Continuous Integration Workflow defines different operating system and Java versions for automated execution, which helps avoid introducing problems specific to a particular platform.

Automated builds should pass on all platforms when upgrading dependencies.

Additional Steps

Although some dependencies may be limited to unit testing, most upgrades involve changing runtime behavior. For this reason, additional testing is often necessary when upgrading dependencies.

  1. Run impacted projects
  2. Run system tests
  3. Run functional tests

Run Impacted Projects

For dependencies that support core framework features, starting and running the associated project provides an important verification of behavior.

This includes Apache NiFi as well as NiFi Registry, NiFi Stateless, and MiNiFi, along with toolkit components.

Run System Tests

The nifi-system-tests module includes a number of integration tests that exercise various features of standalone and clustered instances of Apache NiFi. The module also includes integration tests for NiFi Stateless. These tests provide a helpful set of checks beyond basic startup and shutdown of the system.

Run Functional Tests

In some instances, there is no substitute for exercising components with a running configuration. Understanding the scope of dependency usage can help tailor particular functional tests, identifying whether specific properties are necessary to exercise new features or fixes.
