Status
Current state: "Voting"
Vote thread: here
Discussion thread: here
JIRA: https://issues.apache.org/jira/browse/KAFKA-18775
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Currently, when using MetadataQuorumCommand to add a controller, users must provide a controller.properties configuration file. This file is required for the command to retrieve the metadata local path and endpoints needed to add voters. However, this approach has several limitations:
- Limited Accessibility: The node executing the tool must have direct access to the metadata path of the node being added or removed. This restricts the ability to use node A to manage node B, as node A may not have access to the metadata folder on node B.
- Dependency on Node Configuration: The tool requires access to the configuration of the node being managed.
However, the essential information for these operations — the directory UUID and endpoints — is already available from the active controller’s in-memory state and the ClusterImage.
Leveraging these sources allows us to simplify voter addition and removal, enabling the command to run without direct access to the target node’s metadata directory.
Public Interfaces
CLI
Adding a controller
To add a controller, this KIP introduces a new option, --controller-id, for the add-controller subcommand.
bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 add-controller --controller-id <id>
bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:9093 add-controller --controller-id <id>
Removing a controller
For removing a controller, the --controller-directory-id option is no longer required.
bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 remove-controller --controller-id <id>
bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:9093 remove-controller --controller-id <id>
Public APIs
Admin.java
/**
 * Add a new voter node to the KRaft metadata quorum.
 *
 * Note that this is a convenience method and is not idempotent.
 * In a complicated scenario, e.g., a node disk failure, there might be
 * observers with different directory UUIDs but the same node ID.
 * In this scenario, please use {@link #addRaftVoter(int, Uuid, Set)}.
 *
 * @param voterId The node ID of the voter.
 */
default AddRaftVoterResult addRaftVoter(int voterId) {
    return addRaftVoter(voterId, Uuid.ZERO_UUID, Set.of(), new AddRaftVoterOptions());
}

/**
 * Remove a voter node from the KRaft metadata quorum.
 *
 * @param voterId The node ID of the voter.
 */
default RemoveRaftVoterResult removeRaftVoter(int voterId) {
    return removeRaftVoter(voterId, Uuid.ZERO_UUID, new RemoveRaftVoterOptions());
}
RPC Changes
AddRaftVoterRequest.json
diff --git a/clients/src/main/resources/common/message/AddRaftVoterRequest.json b/clients/src/main/resources/common/message/AddRaftVoterRequest.json
index 74b7638ea2..27a6e5face 100644
--- a/clients/src/main/resources/common/message/AddRaftVoterRequest.json
+++ b/clients/src/main/resources/common/message/AddRaftVoterRequest.json
@@ -18,7 +18,7 @@
"type": "request",
"listeners": ["controller", "broker"],
"name": "AddRaftVoterRequest",
- "validVersions": "0-1",
+ "validVersions": "0-2",
"flexibleVersions": "0+",
"fields": [
{ "name": "ClusterId", "type": "string", "versions": "0+", "nullableVersions": "0+",
RemoveRaftVoterRequest.json
diff --git a/clients/src/main/resources/common/message/RemoveRaftVoterRequest.json b/clients/src/main/resources/common/message/RemoveRaftVoterRequest.json
index 7d11086e53..2181ecd9ff 100644
--- a/clients/src/main/resources/common/message/RemoveRaftVoterRequest.json
+++ b/clients/src/main/resources/common/message/RemoveRaftVoterRequest.json
@@ -18,14 +18,14 @@
"type": "request",
"listeners": ["controller", "broker"],
"name": "RemoveRaftVoterRequest",
- "validVersions": "0",
+ "validVersions": "0-1",
"flexibleVersions": "0+",
"fields": [
Proposed Changes
Server side changes
- During AddRaftVoterRequest handling, if the API version is >= 2:
  - the voter directory ID is derived from the in-memory LeaderState when the value is Uuid.ZERO_UUID;
  - the controller endpoints are derived from ClusterImage if the endpoint set is empty. Note that the ClusterImage may lag behind the actual state, so endpoints are not strictly idempotent.
- During AddRaftVoterRequest handling, if multiple observers share the same node ID, the request is rejected with an IllegalStateException that indicates the duplicate node ID and instructs the user to resolve the conflict.
- During RemoveRaftVoterRequest handling, if the API version is >= 1, the voter directory ID is derived from the in-memory LeaderState when the value is Uuid.ZERO_UUID.
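The directory ID rules for adding a voter can be sketched as below. This is a simplified, self-contained model and not the KRaft implementation: `VoterResolutionSketch`, `Observer`, and `resolveDirectoryId` are hypothetical names, a plain string stands in for `Uuid`, and a list stands in for the observer state held in `LeaderState`. How an unknown node ID is reported is not specified by this KIP, so that branch is an assumption.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified model of the proposed resolution rules; not Kafka code.
final class VoterResolutionSketch {
    // Stand-in for Uuid.ZERO_UUID: the client did not supply a directory ID.
    static final String ZERO_UUID = "ZERO";

    // An observer as tracked by the leader: node ID plus metadata directory ID.
    static final class Observer {
        final int nodeId;
        final String directoryId;

        Observer(int nodeId, String directoryId) {
            this.nodeId = nodeId;
            this.directoryId = directoryId;
        }
    }

    // Returns the directory ID to use when adding the voter with the given node ID.
    static String resolveDirectoryId(int nodeId, String requested, List<Observer> observers) {
        if (!ZERO_UUID.equals(requested)) {
            return requested; // An explicitly provided directory ID is used as-is.
        }
        List<String> matches = observers.stream()
            .filter(o -> o.nodeId == nodeId)
            .map(o -> o.directoryId)
            .collect(Collectors.toList());
        if (matches.size() > 1) {
            // Ambiguous: several observers claim the same node ID.
            throw new IllegalStateException("Multiple observers share node ID " + nodeId
                + " (directory IDs " + matches + "); resolve the conflict or pass a directory ID");
        }
        if (matches.isEmpty()) {
            // Assumption: the KIP does not say how an unknown node ID is reported.
            throw new IllegalArgumentException("No observer found with node ID " + nodeId);
        }
        return matches.get(0);
    }

    public static void main(String[] args) {
        List<Observer> observers = List.of(new Observer(4, "dir-a"), new Observer(5, "dir-b"));
        System.out.println(resolveDirectoryId(4, ZERO_UUID, observers));
    }
}
```

Rejecting the ambiguous case rather than picking one observer matches the disk-failure scenario in the Javadoc above, where the same node ID can appear with an old and a new directory ID.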
Client side changes
- Two convenience methods, addRaftVoter(int) and removeRaftVoter(int), are introduced in Admin.java. addRaftVoter(int) is documented with a Javadoc warning about its idempotency risks and is intended for use only when the user understands and accepts those risks.
MetadataQuorumCommand add-controller changes
Add a new option --controller-id to the add-controller subcommand.
addControllerParser
    .addArgument("--controller-id", "-i")
    .help("The id of the controller to add. This option should be used with bootstrap controller.")
    .type(Integer.class)
    .action(Arguments.store());
- If --controller-id is provided, invoke the new method Admin#addRaftVoter(int).
- If --command-config and --controller-id are both provided, the config file provided by --command-config will only be applied in Admin client initialization.
  - The description for --command-config will be changed to "Property file containing configs to be passed to Admin Client. For add-controller, the file is used to specify the controller properties as well unless --controller-id is provided."
- If neither --command-config nor --controller-id is provided, an exception will be thrown: throw new TerseException("You must use --command-config or --controller-id option to add a controller.");
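The option precedence above can be sketched as a small decision function. This is an illustrative model only: `AddControllerOptionsSketch`, `Mode`, and `chooseMode` are hypothetical names, and `IllegalArgumentException` stands in for the tool's `TerseException` so that the example compiles without Kafka on the classpath.

```java
// Hypothetical sketch of the add-controller option handling; not the actual tool code.
final class AddControllerOptionsSketch {
    // Which source supplies the directory ID and endpoints of the new controller.
    enum Mode { CONTROLLER_ID, PROPERTIES_FILE }

    // controllerId models --controller-id; commandConfig models --command-config.
    // A null argument means the option was not given on the command line.
    static Mode chooseMode(Integer controllerId, String commandConfig) {
        if (controllerId != null) {
            // --controller-id wins; any --command-config file only configures the Admin client.
            return Mode.CONTROLLER_ID;
        }
        if (commandConfig != null) {
            // Existing behavior: read the controller properties from the file.
            return Mode.PROPERTIES_FILE;
        }
        // Stand-in for TerseException in the real tool.
        throw new IllegalArgumentException(
            "You must use --command-config or --controller-id option to add a controller.");
    }

    public static void main(String[] args) {
        System.out.println(chooseMode(3, "controller.properties"));
        System.out.println(chooseMode(null, "controller.properties"));
    }
}
```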
MetadataQuorumCommand remove-controller changes
diff --git a/tools/src/main/java/org/apache/kafka/tools/MetadataQuorumCommand.java b/tools/src/main/java/org/apache/kafka/tools/MetadataQuorumCommand.java
index dba7951aa4..f3bdbbeffa 100644
--- a/tools/src/main/java/org/apache/kafka/tools/MetadataQuorumCommand.java
+++ b/tools/src/main/java/org/apache/kafka/tools/MetadataQuorumCommand.java
@@ -471,7 +471,6 @@ public class MetadataQuorumCommand {
removeControllerParser
.addArgument("--controller-directory-id", "-d")
.help("The directory ID of the controller to remove.")
- .required(true)
.action(Arguments.store());
- The --controller-directory-id option is no longer required; when it is omitted, the command uses the new method Admin#removeRaftVoter(int).
- If --controller-directory-id is explicitly provided, invoke Admin#removeRaftVoter(int, Uuid).
Compatibility, Deprecation, and Migration Plan
This KIP introduces new methods in Admin.java and makes the directory UUID and endpoints fields optional in the two RPCs, with no breaking changes.
The CLI changes are also backward compatible:
- The --command-config option remains available in add-controller.
- The --controller-directory-id option in remove-controller is now optional but still supported.
Test Plan
New test cases will be added to MetadataQuorumCommandTest.java to validate:
- Adding a controller with --controller-id.
- Removing a controller without explicitly providing --controller-directory-id.
Integration tests will be added for the two new methods in Admin.java.
Rejected Alternatives
- Deprecate the --command-config option in add-controller and the --controller-directory-id option in remove-controller. The main reason not to deprecate these two parameters is that they were only just introduced in 4.0, so deprecating them in a 4.x release feels a bit too soon. Also, --command-config can be used in a different user scenario, where the user can still provide the configuration file to add-controller if they already have it locally.
- Using admin APIs to retrieve the directory UUID and controller endpoints, but this brings extra network communication overhead.
  - The Admin#describeMetadataQuorum method can provide the directory UUID.
  - The Admin#describeConfigs method, utilizing the bootstrap.controller address, can be used to retrieve the necessary endpoints.