Status
Current state: Accepted
Discussion thread: here
Vote thread: here
JIRA:
PR(draft): https://github.com/apache/kafka/pull/7310
Released: 2.4.0
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
...
The newly-proposed connect-protocol JMX metric can be used to monitor whether internal request verification is enabled for a cluster; if its value is sessioned (or, presumably, a later protocol), then request verification should be enabled.
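As a rough illustration of the monitoring check described above, the following Java sketch interprets the value of the connect-protocol metric. The MBean name `kafka.connect:type=connect-worker-rebalance-metrics` is an assumption about where the metric is registered and should be confirmed against a running worker; the class and method names are hypothetical and not part of this proposal.

```java
import java.io.IOException;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

final class ConnectProtocolCheck {

    // Protocol name that implies internal request verification, per this KIP.
    static final String SESSIONED = "sessioned";

    // Assumed MBean name for the worker rebalance metrics group; verify it
    // against a running worker before relying on it.
    static final String ASSUMED_MBEAN = "kafka.connect:type=connect-worker-rebalance-metrics";

    /**
     * Returns true if the reported protocol implies request verification.
     * Only "sessioned" is recognized here; any later protocol that also
     * verifies requests would have to be added explicitly.
     */
    static boolean verificationEnabled(String connectProtocol) {
        return SESSIONED.equals(connectProtocol);
    }

    /** Reads the connect-protocol attribute over an existing JMX connection. */
    static String readConnectProtocol(MBeanServerConnection server)
            throws JMException, IOException {
        ObjectName name = new ObjectName(ASSUMED_MBEAN);
        return String.valueOf(server.getAttribute(name, "connect-protocol"));
    }
}
```

A monitoring job could call `readConnectProtocol` on each worker and raise an alert whenever `verificationEnabled` returns false for a cluster that is expected to be secured.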
Reverting an upgrade
Via connect.protocol config
The group coordination protocol will be used to ensure that all workers in a cluster support verification of internal requests before this behavior is enabled; therefore, a rolling upgrade of the cluster will be possible. In line with the regression plan for KIP-415: Incremental Cooperative Rebalancing in Kafka Connect, if it is desirable to disable this behavior for some reason, the connect.protocol configuration can be set to compatible or default for one (or more) workers, and it will automatically be disabled.
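The all-or-nothing rule above can be modeled with a small Java sketch. It is a deliberate simplification: the real decision is made by the group coordination protocol, not by comparing configuration strings, and the class and method names are hypothetical.

```java
import java.util.List;

final class RollbackSketch {

    /**
     * Simplified model: verification is active only if every worker in the
     * group is configured with the sessioned protocol. A single worker set
     * to any other value disables it for the whole cluster.
     */
    static boolean verificationEnabled(List<String> workerProtocols) {
        if (workerProtocols.isEmpty()) {
            return false; // no workers, nothing to verify
        }
        for (String protocol : workerProtocols) {
            if (!"sessioned".equals(protocol)) {
                return false;
            }
        }
        return true;
    }
}
```

Under this model, switching one worker to compatible is enough to turn verification off, which matches the rollback path described above.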
Via worker version downgrade
It should also be noted that the above will occur automatically if a worker is downgraded to a prior release of Kafka Connect that does not support the sessioned protocol. If this occurs, the worker will begin emitting error-level log messages when it reads session keys from the config topic. However, the worker will be otherwise unaffected and will continue to function properly (but without the security benefit of internal request verification).
Migrating to a new request signature algorithm
If a new signature algorithm should be used, a rolling upgrade will be possible with the following steps (assuming a new algorithm of HmacSHA489):
- Add HmacSHA489 to the internal.key.verification.algorithms list for each worker, and restart them one-by-one
- Change the internal.key.signature.algorithm property for each worker to HmacSHA489, and restart them one-by-one
- (Optional) Remove the old algorithm from the internal.key.verification.algorithms list for each worker, and restart them one-by-one
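The ordering of the steps above matters because a worker should only accept a request whose signature algorithm appears in its own verification list. The following Java sketch illustrates this with the standard `javax.crypto.Mac` API. Since HmacSHA489 is a placeholder name that the JDK does not provide, the sketch uses HmacSHA256 as the old algorithm and HmacSHA512 as the new one. The `WorkerConfig` class and the `sign` and `accepts` helpers are hypothetical and do not reflect the actual worker implementation.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.List;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class SignatureMigrationSketch {

    /** Hypothetical stand-in for a worker's two algorithm settings. */
    static final class WorkerConfig {
        final String signatureAlgorithm;           // internal.key.signature.algorithm
        final List<String> verificationAlgorithms; // internal.key.verification.algorithms

        WorkerConfig(String signatureAlgorithm, List<String> verificationAlgorithms) {
            this.signatureAlgorithm = signatureAlgorithm;
            this.verificationAlgorithms = verificationAlgorithms;
        }
    }

    /** Computes a MAC over the request body with the shared session key. */
    static byte[] sign(String algorithm, byte[] sessionKey, byte[] body) {
        try {
            Mac mac = Mac.getInstance(algorithm);
            mac.init(new SecretKeySpec(sessionKey, algorithm));
            return mac.doFinal(body);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Cannot sign with " + algorithm, e);
        }
    }

    /**
     * A receiving worker accepts a request only if the algorithm named by the
     * sender is in its verification list and the recomputed MAC matches.
     */
    static boolean accepts(WorkerConfig receiver, String claimedAlgorithm,
                           byte[] sessionKey, byte[] body, byte[] signature) {
        if (!receiver.verificationAlgorithms.contains(claimedAlgorithm)) {
            return false;
        }
        byte[] expected = sign(claimedAlgorithm, sessionKey, body);
        return MessageDigest.isEqual(expected, signature); // constant-time comparison
    }
}
```

After the first step, every worker verifies both algorithms, so it no longer matters which workers have already switched their signing algorithm in the second step. A worker that skipped the first step would reject requests signed with the new algorithm, which is why the verification list is widened before any worker starts signing with it.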
Rejected Alternatives
Configurable inter-worker headers
Summary: A new worker configuration would be added that would control auth headers used by workers when making requests to the internal endpoint.
Rejected because: The additional complexity of another required configuration would be negative for users; security already isn't simple to implement with Kafka Connect, and requiring just one more thing for them to add should be avoided if possible. Also, the use of static headers isn't guaranteed to cover all potential auth mechanisms, and would require manual rotation by reconfiguring the worker.
Replace endpoint with Kafka topic
Summary: The REST endpoint could be removed entirely and replaced with a Kafka topic. Either an existing internal Connect topic (such as the configs topic) could be used, or a new topic could be added to handle all non-forwarded follower-to-leader communication.
Rejected because: Achieving consensus in a Connect cluster about whether to begin engaging in this new topic-based protocol would require either reworking the Connect group coordination protocol or installing several new configurations and a multi-stage rolling upgrade in order to enable it. Requiring new configurations and a multi-stage rolling upgrade for the default use case of a simple version bump for a cluster would be a much worse user experience, and if the group coordination protocol is going to be reworked, we might as well just use the group coordination protocol to distribute keys instead. Additionally, the added complexity of switching from a synchronous to an asynchronous means of communication for relaying task configurations to the leader would complicate the implementation enough that reworking the group coordination protocol might even be a simpler approach with smaller changes required.
Distribute session key via Connect protocol
Summary: Instead of distributing a session key via the config topic, include the session key as part of the worker assignment handed out during rebalance via the Connect protocol. Periodically force a rebalance in order to rotate session keys.
Rejected because: The implementation complexity of adding a session key to the rebalance protocol would be quite high, and the additional API would complicate the code base significantly. Additionally, there are few, if any, advantages compared to distributing the keys via the config topic.