Status

Current state: Under discussion

Vote thread: 

Discussion thread: https://lists.apache.org/thread/cwhkf26np3kjt33yhlsvp3k3jf7ofgf3

JIRA: KAFKA-19104

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

The "metadata version" of a Kafka cluster is an integer that reflects which features can be expressed in the current cluster metadata format. It is described in more detail in KIP-778. Currently, clusters can be created with any supported metadata.version, and can be upgraded at runtime from an earlier metadata.version to a later one. However, we do not support downgrades.

Newer versions of Kafka support more metadata versions than older versions do. Therefore, upgrading to a new metadata version closes off the possibility of downgrading to an older version of the software.

We would like to support metadata version downgrade in order to give users more flexibility. For example, they could downgrade their metadata version to avoid a bug specific to a newer metadata version, or as a first step toward downgrading their software to an older version.

Public Interfaces

updateFeatures

The AdminClient updateFeatures API already specifies how the user can request a metadata version downgrade. Currently it returns an error saying that downgrade is unsupported.

We do not need to change the interface at all for updateFeatures. We just need to make it work.

kafka-features.sh

The kafka-features.sh command already specifies a way to downgrade the metadata version, so no changes are needed here either.

New metadata version

There is a new metadata version associated with KIP-1155. Every broker or controller that supports this metadata version is assumed to support metadata version downgrade.

This is true whether or not the new metadata version is active. Merely supporting the new metadata version is enough to give us the signal we need.

Proposed Changes

Initiating the downgrade

When the administrator wants to downgrade the metadata version, they invoke the updateFeatures API, perhaps via a tool like kafka-features.sh, or perhaps through a program of their own that uses the AdminClient.

When the active controller receives the updateFeatures RPC, it will check whether all the controllers and brokers in the cluster support the new metadata version described above. If any of them does not, it will return INVALID_UPDATE_VERSION, with an error message identifying the server that doesn't support downgrade. It will also check that the requested metadata version is compatible with all the other KIP-584 features that are enabled.
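The validation step above can be sketched as follows. This is an illustrative Python model, not Kafka's actual controller code: the Registration type, the feature-level constant DOWNGRADE_SUPPORT_MV, and the return convention standing in for INVALID_UPDATE_VERSION are all assumptions made for the sketch.

```python
from dataclasses import dataclass

# Hypothetical feature level introduced by KIP-1155; the real value is
# whatever MV the KIP ends up reserving.
DOWNGRADE_SUPPORT_MV = 25

@dataclass
class Registration:
    """Simplified stand-in for a broker/controller registration's MV range."""
    node_id: int
    min_supported_mv: int
    max_supported_mv: int

def validate_downgrade(target_mv: int, registrations: list[Registration]):
    """Return (True, None) if every node can handle the downgrade, otherwise
    (False, error_message) mirroring an INVALID_UPDATE_VERSION response."""
    for reg in registrations:
        # A node signals downgrade support by merely advertising the KIP-1155
        # metadata version, whether or not that MV is currently active.
        if reg.max_supported_mv < DOWNGRADE_SUPPORT_MV:
            return False, f"node {reg.node_id} does not support metadata version downgrade"
        # The target MV must also fall inside every node's supported range.
        if not (reg.min_supported_mv <= target_mv <= reg.max_supported_mv):
            return False, f"node {reg.node_id} does not support metadata version {target_mv}"
    return True, None
```

In the real system, the check against other enabled KIP-584 features would happen alongside this range check; it is omitted here for brevity.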

Downgrading to a metadata.version level older than 3.7 will not be supported. The reason is that 3.7 is the first version that supported KIP-919 controller registrations, and we rely on those registrations to check that a proposed new MV or other feature is supported by all controllers. Since 3.7 is a fairly old MV at this point, this shouldn't be a significant restriction.

If all of these checks succeed, the active controller will handle the downgrade operation by emitting a FeatureLevelRecord for metadata.version with the new feature level. If the updateFeatures RPC specified multiple features, the metadata version downgrade record will be emitted last.
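The ordering requirement above can be sketched as a small Python function. The (name, level) tuple is an illustrative stand-in for a real FeatureLevelRecord, not Kafka's record format.

```python
def feature_level_records(updates: dict[str, int]) -> list[tuple[str, int]]:
    """Order the FeatureLevelRecords for one updateFeatures call so that the
    metadata.version record, if present, is emitted last."""
    records = [(name, level) for name, level in updates.items()
               if name != "metadata.version"]
    if "metadata.version" in updates:
        # The metadata.version downgrade record goes last, so the snapshot
        # triggered by its replay already reflects the other feature changes.
        records.append(("metadata.version", updates["metadata.version"]))
    return records
```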

Handling the downgrade

When a broker or controller replays the new FeatureLevelRecord telling it to change the metadata.version, it will immediately trigger the writing of a __cluster_metadata snapshot at the new metadata version. Any metadata lost during this process will be logged.

The MetadataImage pipeline will then load this snapshot. The MetadataDelta should contain only the things that changed since the previous snapshot. (There are a few cases where things appear in the MetadataDelta even though they haven't changed since the last snapshot; we should fix this.)

On the controller side, we will clear the current state and reload all records from the new snapshot.
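The replay handling described above can be modeled with a short Python sketch. The (payload, min_mv_required) record shape is an assumption made for illustration; real records don't carry an explicit minimum MV, but each record type and field is only expressible at certain metadata versions.

```python
def handle_mv_change(records: list[tuple[str, int]], new_mv: int):
    """Model writing a snapshot at a lower metadata version.

    Each record is (payload, min_mv_required). Records that cannot be
    expressed at new_mv are dropped from the snapshot and logged.
    Returns (snapshot, lost)."""
    snapshot = [r for r in records if r[1] <= new_mv]
    lost = [r for r in records if r[1] > new_mv]
    for payload, _ in lost:
        # In the real system this would go through the normal logging framework.
        print(f"WARN: record {payload!r} cannot be expressed at metadata version {new_mv}")
    # A controller would now clear its in-memory state and reload from
    # `snapshot`; a broker would feed it through the MetadataImage pipeline.
    return snapshot, lost
```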

Lossy versus lossless downgrades

If the downgrade was specified by the administrator as FeatureUpdate.UpgradeType.SAFE_DOWNGRADE, we will check whether anything was lost when writing the image at the lower metadata version. If anything was, we will abort the downgrade process. (Note that this obviates the current need to specify a didMetadataChange boolean for each new metadata version in MetadataVersion.java.)

Downgrades specified with type FeatureUpdate.UpgradeType.UNSAFE_DOWNGRADE will proceed even if metadata is lost during the downgrade.
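The safe-versus-unsafe decision can be sketched as follows. The UpgradeType values mirror FeatureUpdate.UpgradeType from the AdminClient; the loss check itself, like the (payload, min_mv_required) record shape, is an illustrative model rather than Kafka's implementation.

```python
from enum import Enum

class UpgradeType(Enum):
    # Same numeric codes as FeatureUpdate.UpgradeType in the AdminClient.
    SAFE_DOWNGRADE = 2
    UNSAFE_DOWNGRADE = 3

def apply_downgrade(records, new_mv, upgrade_type):
    """records: (payload, min_mv_required) pairs.

    Returns the records that survive at new_mv. Raises if a SAFE_DOWNGRADE
    would lose metadata, modeling the abort described above."""
    kept = [r for r in records if r[1] <= new_mv]
    lost = [r for r in records if r[1] > new_mv]
    if lost and upgrade_type is UpgradeType.SAFE_DOWNGRADE:
        # Loss detected while writing the lower-MV image: abort the downgrade.
        raise ValueError(
            f"{len(lost)} record(s) would be lost at MV {new_mv}; aborting safe downgrade")
    return kept
```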

Monitoring

KIP-938 already added the CurrentMetadataVersion metric, which administrators can monitor if they want to observe the metadata version change taking place. No additional metrics are needed. It is also of course possible to use describeFeatures (or kafka-features.sh) to query the current metadata version.

Compatibility, Deprecation, and Migration Plan

As described above, we can use the advertised metadata version range to detect pre-KIP-1155 servers that don't support metadata version downgrade. If there are any of those, we will abort the downgrade. This ensures that we will not create an unsafe situation in clusters with mixed software versions.

Test Plan

This feature will be tested with unit and integration tests in JUnit, as well as some end-to-end ducktape tests.

Rejected Alternatives

New fields in broker and controller registrations

Rather than using a new metadata version to indicate support for downgrades, we could add a field to the broker registration. However, this runs into the problem that older metadata versions would not support the new field. Therefore controllers operating at these older metadata versions would not be able to check this information. Adding a new RPC field is also more work than just adding a new metadata version.

Avoiding reloading the metadata image

You could imagine a system for downgrading the metadata version that didn't rely on reloading the metadata image. Instead, we could just make whatever modifications were needed to the in-memory data structures when downgrading from some metadata version X to an older version Y. Indeed, this was the original system that we envisioned in KIP-778.

The problem is that such a system becomes too difficult to maintain as the number of metadata versions grows. Because this code would be exercised so rarely, it would also be a very likely place for bugs to hide.

The proposal above avoids these problems. It only requires that we have a way of writing a metadata image at a given metadata version, and a means of loading the image. In big-O terms, rather than requiring O(N^2) work to support converting between each pair of versions, we only require O(N) work.

Another reason to create a snapshot as soon as the downgrade is done is that if the user wants to downgrade the software version, they will need this. If the user restarts with an older software version but no snapshot at this older version, the older software may choke on the new records which are still present in the latest snapshot.
