Status
Current state: Voting
Discussion thread: https://lists.apache.org/thread/kc87jkgyvf9x7nwmgwrhx6fs6w0tqymj
Vote thread: https://lists.apache.org/thread/rdpjmmqdxzog2m555r2wrncfn40zjf54
JIRA: KAFKA-20395
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Kafka cluster membership is managed by the metadata layer, and consists mainly of two things: broker and controller registrations. Registrations are sent by brokers and controllers to the active controller via the BrokerRegistrationRequest and ControllerRegistrationRequest RPCs, and are persisted to the metadata log via the BrokerRegistrationRecord and ControllerRegistrationRecord.
The active controller uses registrations to determine the current members of the Kafka cluster, and is aware of each member's advertised listeners and supported feature ranges. During upgrades, registration information is used to determine whether all cluster members support a feature level.
However, there is currently no way to unregister a controller, like there is for brokers via UnregisterBrokerRequest and UnregisterBrokerRecord. This means stale controller registrations can block feature upgrades. This KIP proposes adding support for operators to manually unregister controllers like they can with brokers.
One important use case for this KIP is to remove controller registrations from KRaft observers (i.e. nodes that replicate the log but do not participate in leader election or committing data) from the metadata log. This means operators can remove these stale registrations to unblock feature upgrades on their cluster.
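To make the blocking behavior concrete, the following is a minimal sketch of a feature-level check over registrations. It is an illustrative model only: the class, method names, and feature levels are hypothetical and do not correspond to Kafka's controller code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative model only; none of these names are Kafka classes. */
class FeatureUpgradeCheckSketch {
    /** Inclusive range [min, max] of levels a node supports for one feature. */
    static final class SupportedRange {
        final int min;
        final int max;

        SupportedRange(int min, int max) {
            this.min = min;
            this.max = max;
        }

        boolean contains(int level) {
            return level >= min && level <= max;
        }
    }

    // nodeId -> supported range for a single feature, taken from its registration.
    private final Map<Integer, SupportedRange> registrations = new LinkedHashMap<>();

    void register(int nodeId, SupportedRange range) {
        registrations.put(nodeId, range);
    }

    void unregister(int nodeId) {
        registrations.remove(nodeId);
    }

    /** Returns the id of the first registered node that blocks the upgrade, or -1 if none does. */
    int firstBlockingNode(int targetLevel) {
        for (Map.Entry<Integer, SupportedRange> entry : registrations.entrySet()) {
            if (!entry.getValue().contains(targetLevel)) {
                return entry.getKey();
            }
        }
        return -1;
    }
}
```

In this model, a registration left behind by a stopped node keeps vetoing the upgrade until it is removed, which is the situation this KIP addresses.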
Public Interfaces
New RPC
UnregisterControllerRequest
{
"apiKey": 93,
"type": "request",
"listeners": ["broker", "controller"],
"name": "UnregisterControllerRequest",
"validVersions": "0",
"flexibleVersions": "0+",
"fields": [
{ "name": "ControllerId", "type": "int32", "versions": "0+",
"about": "The controller ID to unregister." }
]
}
This request can return the following errors:
- an UNSUPPORTED_VERSION error if the cluster's MetadataVersion does not support UnregisterControllerRecord
- a BROKER_ID_NOT_REGISTERED error if no registration exists for the requested controller ID
- a NOT_CONTROLLER error if the request does not arrive at the active controller
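The error conditions above could be checked roughly as follows. This is a sketch under assumptions: the class and parameter names are hypothetical, and the order in which the checks run is not specified by this KIP.

```java
import java.util.Set;

/** Illustrative model only; not the controller's actual request handler. */
class UnregisterControllerValidationSketch {
    enum ErrorCode { NONE, NOT_CONTROLLER, UNSUPPORTED_VERSION, BROKER_ID_NOT_REGISTERED }

    /**
     * Maps the state seen by the receiving node to one of the documented errors.
     * The check order here is an assumption made for illustration.
     */
    static ErrorCode validate(boolean isActiveController,
                              boolean metadataVersionSupportsRecord,
                              Set<Integer> registeredControllerIds,
                              int controllerId) {
        if (!isActiveController) {
            return ErrorCode.NOT_CONTROLLER;
        }
        if (!metadataVersionSupportsRecord) {
            return ErrorCode.UNSUPPORTED_VERSION;
        }
        if (!registeredControllerIds.contains(controllerId)) {
            return ErrorCode.BROKER_ID_NOT_REGISTERED;
        }
        return ErrorCode.NONE;
    }
}
```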
UnregisterControllerResponse
{
"apiKey": 93,
"type": "response",
"name": "UnregisterControllerResponse",
"validVersions": "0",
"flexibleVersions": "0+",
"fields": [
{ "name": "ThrottleTimeMs", "type": "int32", "versions": "0+",
"about": "Duration in milliseconds for which the request was throttled due to a quota violation, or zero if the request did not violate any quota." },
{ "name": "ErrorCode", "type": "int16", "versions": "0+",
"about": "The error code, or 0 if there was no error." },
{ "name": "ErrorMessage", "type": "string", "versions": "0+", "nullableVersions": "0+",
"about": "The top-level error message, or `null` if there was no top-level error." }
]
}
Public APIs
Admin.java
/**
* Unregister a controller.
*
* This is a convenience method for {@link #unregisterController(int, UnregisterControllerOptions)}
*
* @param controllerId the controller id to unregister.
*
* @return the {@link UnregisterControllerResult} containing the result
*/
default UnregisterControllerResult unregisterController(int controllerId) {
return unregisterController(controllerId, new UnregisterControllerOptions());
}
/**
* Unregister a controller.
*
* The following exceptions can be anticipated when calling {@code get()} on the future from the
* returned {@link UnregisterControllerResult}:
* <ul>
* <li>{@link org.apache.kafka.common.errors.TimeoutException}
* If the request timed out before the unregister operation could finish.</li>
* <li>{@link org.apache.kafka.common.errors.UnsupportedVersionException}
* If the software is too old to support the unregistration API.</li>
* </ul>
* <p>
*
* @param controllerId the controller id to unregister.
* @param options the options to use.
*
* @return the {@link UnregisterControllerResult} containing the result
*/
UnregisterControllerResult unregisterController(int controllerId, UnregisterControllerOptions options);
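A caller would typically block on the future from the result and inspect the cause of any failure. The sketch below shows that pattern in isolation. It does not use the Kafka client: the `IntFunction` parameter is a hypothetical stand-in for the proposed `unregisterController` call, and `CompletableFuture` stands in for whatever future `UnregisterControllerResult` exposes, which this KIP does not spell out.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.function.IntFunction;

/** Illustrative caller-side pattern; stand-in types only, no Kafka classes. */
class UnregisterControllerCallerSketch {
    /**
     * Invokes the (stand-in) unregister call, waits for completion, and
     * returns a human-readable outcome instead of throwing.
     */
    static String unregisterAndDescribe(IntFunction<CompletableFuture<Void>> unregisterCall,
                                        int controllerId) {
        try {
            unregisterCall.apply(controllerId).get();
            return "controller " + controllerId + " unregistered";
        } catch (ExecutionException e) {
            // The real failure (e.g. a timeout or unsupported version) is the cause.
            Throwable cause = e.getCause();
            return "could not unregister controller " + controllerId + ": " + cause.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return "interrupted while unregistering controller " + controllerId;
        }
    }
}
```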
New Metadata Record
UnregisterControllerRecord
{
"apiKey": 29,
"type": "metadata",
"name": "UnregisterControllerRecord",
"validVersions": "0",
"flexibleVersions": "0+",
"fields": [
{ "name": "ControllerId", "type": "int32", "versions": "0+",
"about": "The controller id." }
]
}
This KIP will also introduce a new MetadataVersion to support this new metadata record.
CLI changes
kafka-cluster
Add an unregister-controller subcommand for manually unregistering controllers. This command is similar to how the unregister command works for brokers. Below is an example invocation.
kafka-cluster unregister-controller --controller-id 9990
When the user executes this command to unregister controller 9990:
1. UnregisterControllerRequest is sent to the active controller
2. The active controller writes an UnregisterControllerRecord to the metadata log
3. When this record is committed, return a response to the user for unregistering the controller
4. The active controller's state machine removes the registration for controller 9990, meaning feature upgrades no longer consider node 9990's supported features
5. The registration from controller 9990 is removed from the metadata image
kafka-cluster unregister-controller is a command for users when they want to unregister a controller from the cluster. This command should only be run after a controller is stopped, and the operator does not intend to bring it back. kafka-cluster unregister-controller works irrespective of the quorum mode.
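The unregistration flow described above can be modeled as appending an entry to a log and replaying it into an in-memory set of registrations. The sketch below is a simplified model with hypothetical names: it treats every appended entry as immediately committed and is not the controller's actual replay code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative model of durable unregistration; not Kafka code. */
class ControllerRegistrationReplaySketch {
    /** Stand-in for a registration or unregistration metadata record. */
    static final class LogEntry {
        final boolean isUnregistration;
        final int controllerId;

        private LogEntry(boolean isUnregistration, int controllerId) {
            this.isUnregistration = isUnregistration;
            this.controllerId = controllerId;
        }

        static LogEntry register(int controllerId) {
            return new LogEntry(false, controllerId);
        }

        static LogEntry unregister(int controllerId) {
            return new LogEntry(true, controllerId);
        }
    }

    private final List<LogEntry> log = new ArrayList<>();   // stand-in for the metadata log
    private final Set<Integer> image = new TreeSet<>();     // registered controller ids

    /** Appends an entry (assumed committed immediately) and applies it to the image. */
    void append(LogEntry entry) {
        log.add(entry);
        replay(entry, image);
    }

    private static void replay(LogEntry entry, Set<Integer> target) {
        if (entry.isUnregistration) {
            target.remove(entry.controllerId);
        } else {
            target.add(entry.controllerId);
        }
    }

    /** Returns a copy of the current set of registered controller ids. */
    Set<Integer> image() {
        return new TreeSet<>(image);
    }

    /** Rebuilds the registrations from the log alone, as a restarted node would. */
    Set<Integer> rebuildFromLog() {
        Set<Integer> rebuilt = new TreeSet<>();
        for (LogEntry entry : log) {
            replay(entry, rebuilt);
        }
        return rebuilt;
    }
}
```

Because the unregistration is itself a log entry, a node that rebuilds its state from the log reaches the same set of registrations. A later registration entry for the same id adds the controller back, which is why the command should only be run once the controller is stopped for good.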
kafka-metadata-quorum
Add the --unregister flag to the kafka-metadata-quorum remove-controller command. When this flag is set, the command removes the controller as a KRaft voter and then unregisters it. Below is an example invocation.
kafka-metadata-quorum remove-controller --controller-id 9990 --controller-directory-id EXAMPLE_UUID --unregister
When the user executes this command, Kafka tries to remove 9990 as a voter AND unregister it:
1. RemoveRaftVoterRequest is sent to the active controller
2. The KRaft leader writes a VotersRecord without voter 9990 to the metadata log
3. When this record is committed, return a response to the user for removing the voter
4. UnregisterControllerRequest is sent to the active controller if steps 1-3 were successful
   - If steps 1-3 were not successful, return the error and direct the user to use kafka-cluster unregister-controller instead.
5. The active controller writes an UnregisterControllerRecord to the metadata log
6. When this record is committed, return a response to the user for unregistering the controller
7. The active controller's state machine removes the registration for controller 9990, meaning feature upgrades no longer consider node 9990's supported features
8. The registration from controller 9990 is removed from the metadata image
The main use case for this command is to remove a node from the voter set in a dynamic KRaft quorum AND unregister it from the cluster, all within the same CLI command. Since this is a common usage pattern, running this command with --unregister can be thought of as a "built-in" script that provides a smooth UX for decommissioning voters in a dynamic quorum. To be consistent with the behavior of the command when --unregister is not set, running it with --unregister will fail when the cluster does not support dynamic quorum.
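The two-RPC sequencing of remove-controller --unregister can be sketched as follows. This is a minimal model under assumptions: the two `IntFunction` parameters are hypothetical stand-ins for sending RemoveRaftVoterRequest and UnregisterControllerRequest, and the message wording is illustrative rather than the tool's actual output.

```java
import java.util.Optional;
import java.util.function.IntFunction;

/** Illustrative sequencing of voter removal followed by unregistration. */
class RemoveAndUnregisterSketch {
    /**
     * Each step takes the controller id and returns an empty Optional on
     * success, or an error description on failure.
     */
    static String removeAndUnregister(int controllerId,
                                      IntFunction<Optional<String>> removeVoter,
                                      IntFunction<Optional<String>> unregisterController) {
        Optional<String> removeError = removeVoter.apply(controllerId);
        if (removeError.isPresent()) {
            // Voter removal failed: do not attempt the unregistration.
            return "Failed to remove voter " + controllerId + ": " + removeError.get()
                + ". Use kafka-cluster unregister-controller instead.";
        }
        Optional<String> unregisterError = unregisterController.apply(controllerId);
        if (unregisterError.isPresent()) {
            return "Removed voter " + controllerId + " but failed to unregister it: "
                + unregisterError.get();
        }
        return "Removed and unregistered controller " + controllerId;
    }
}
```

Keeping the second request conditional on the first means a failed voter removal never leaves an active voter without a registration.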
User Experience
The main use cases of these CLI tools are listed below. When trying to unregister a controller, it is assumed that the operator has stopped a node before unregistering it and does not intend to bring that node back in the near future. This is because after unregistering a node, the active controller no longer checks its supported feature levels when validating a feature upgrade.
Remove and unregister a KRaft voter in a dynamic quorum
- Stop the voter
- Run kafka-metadata-quorum remove-controller with the --unregister flag
Remove a KRaft voter in a dynamic quorum and keep it registered as an observer controller
- Run kafka-metadata-quorum remove-controller without the --unregister flag
Unregister an observer controller in a static or dynamic quorum
- Stop the observer
- Run kafka-cluster unregister-controller
Unregister a voter in a static KRaft quorum when the static voter set is mistakenly configured
- Stop the voter that was mistakenly put in controller.quorum.voters
- Run kafka-cluster unregister-controller
- Ensure the stopped voter is not part of controller.quorum.voters on every Kafka node
Proposed Changes
Controller Changes
The active controller will handle unregistering a controller in the same way it handles unregistering brokers. After a broker/controller is unregistered, it will no longer be part of the metadata image, so its registration will no longer be a part of subsequent snapshots.
The registration manager of an unregistered controller already attempts to re-register with the active controller. This is to prevent accidental unregistrations. The intention is for unregistration to occur after the operator decommissions a controller node. This means the node is no longer part of the Kafka cluster, and the controller process associated with this node is not expected to come back.
Metadata Quorum Tool Changes
The new --unregister flag means that the remove-controller command could send two RPCs to the active controller: one to remove the node from the KRaft voter set, and another to remove the node's registration. To maintain consistency with this command only being supported with dynamic quorum, running remove-controller --unregister will fail if the cluster does not support dynamic quorum reconfiguration. Instead, the user should be directed to use the kafka-cluster unregister-controller command.
Compatibility, Deprecation, and Migration Plan
Because this KIP introduces a new metadata record alongside a new MetadataVersion, existing clusters that have a stale controller registration will not be able to unregister it, and therefore cannot use this feature to unblock feature upgrades. The main reason for not supporting this in existing clusters is that in many environments, operators can simply bring up another controller node with the same node ID to "refresh" its registration. Additionally, in the interest of keeping this design simple, some of the potential workarounds for existing clusters have been moved to the Rejected Alternatives section.
Test Plan
Add an integration test for unregistering a controller in both static quorum and dynamic quorum clusters.
Rejected Alternatives
Use KRaft to manage controller registration
Although KRaft has access to the necessary information to manage controller registrations (i.e. advertised endpoints, and feature versions which are negotiated as part of establishing a connection with another node), this kind of design poses a couple of issues, mainly stemming from the fact that logical cluster membership of brokers and controllers is really a metadata layer responsibility of the active controller. Managing members of the larger Kafka cluster is not the responsibility of the KRaft leader. The KRaft leader is responsible for replicating a log, and thus should not be aware of metadata-level feature records or registration records. This approach would require leaking metadata layer state and records, such as feature versions (other than kraft.version), into the KRaft layer.
Brokers and controllers are concepts of the metadata layer, whereas KRaft has the concepts of leader, voters, and observers. This distinction is noteworthy because both brokers and controllers can be observers in KRaft, but broker registrations are not something that should be managed by the KRaft layer.
Do not durably persist observer controller registrations
Although this feature would be nice, since we would couple controller registration with being a KRaft voter, the main issue with this is that it will prevent the proposed improvements from KIP-1141: Simplifying Add/Remove Voter in MetadataQuorumCommand from being implemented. The proposed changes from KIP-1141 improve the UX around adding a controller to the KRaft voter set, but they rely on observer controllers having persisted a registration to retrieve their endpoints.
Use UnregisterBrokerRecord to unregister controllers
This approach would be one way existing clusters could support unregistering stale controller registrations without updating the MV. However, the main issue with this approach is that it is unsafe. A user who unregisters a controller before updating the software versions on all controllers to support this feature would crash the controllers with an older software version.
Non-durably unregister controllers
This would be a workaround for existing clusters with a stale registration, where the active controller's state machine could unregister a controller, but the on-disk data does not change. The main reason against implementing this would be a confusing UX during scenarios with node restarts. Additionally, it is confusing for operators to run the unregister CLI command again after updating the MV to support the UnregisterControllerRecord.
Allow the active controller to unregister observer controllers as part of a MV upgrade
Another approach to clear stale registrations from existing clusters is to allow the active controller to unregister observer controllers that the user specifies as part of a MV upgrade to support UnregisterControllerRecord. The main issue with this approach is that it makes the UX for kafka-features upgrade very complex. Additionally, this approach may be overkill in non-managed deployments, where operators can simply provision a controller to update its stale registration.