Status
Current state: Accepted
Discussion thread: here
JIRA: KAFKA-16513
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
The WriteTxnMarkers API was introduced in KIP-98 - Exactly Once Delivery and Transactional Messaging, and was used only for inter-broker communication: the transaction coordinator sent it to the leader brokers when committing or aborting a transaction. Invoking it requires the ClusterAction permission on the Cluster resource.
In KIP-664: Provide tooling to detect and abort hanging transactions, we modified the WriteTxnMarkers API so that it could be invoked externally from the Kafka AdminClient to safely abort a hanging transaction. The permission to call WriteTxnMarkers was not changed.
This means that whilst all the other APIs allowed with the ClusterAction permission on the Cluster [1] are for inter-broker communication only, WriteTxnMarkers is an outlier that can be invoked externally by a Kafka AdminClient. Such usage is more aligned with the Alter permission on the Cluster resource, which covers other administrative actions invoked from the Kafka AdminClient, e.g. CreateAcls and DeleteAcls.
We propose allowing the WriteTxnMarkers API to be invoked with the Alter permission on the Cluster to reflect that this API is an admin operation that can be called from the Kafka AdminClient. There is a precedent for a similar change in KIP-320: Allow fetchers to detect and handle log truncation.
Public Interfaces
As part of this KIP, we will allow the WriteTxnMarkers API to be called with the Alter permission on the Cluster resource. For backwards compatibility, we will continue to allow the existing ClusterAction authorization.
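The resulting rule can be illustrated with a minimal, self-contained sketch. It is not the broker's authorization code: the class name WriteTxnMarkersAuthSketch, the Operation enum and the canWriteTxnMarkers method are hypothetical names introduced here only to show that either permission on the Cluster resource is sufficient.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model for illustration only; these are NOT Kafka's authorizer types.
public class WriteTxnMarkersAuthSketch {

    // Subset of ACL operations relevant to this sketch.
    public enum Operation { CLUSTER_ACTION, ALTER, DESCRIBE }

    // Returns true if the operations granted to a principal on the Cluster
    // resource permit a WriteTxnMarkers request under the proposed rule.
    public static boolean canWriteTxnMarkers(Set<Operation> grantedOnCluster) {
        // Existing rule, kept for backwards compatibility.
        if (grantedOnCluster.contains(Operation.CLUSTER_ACTION)) {
            return true;
        }
        // Additional rule proposed by this KIP.
        return grantedOnCluster.contains(Operation.ALTER);
    }

    public static void main(String[] args) {
        // Inter-broker principal holding ClusterAction.
        System.out.println(canWriteTxnMarkers(EnumSet.of(Operation.CLUSTER_ACTION)));
        // Admin principal holding Alter.
        System.out.println(canWriteTxnMarkers(EnumSet.of(Operation.ALTER)));
        // Principal holding only Describe.
        System.out.println(canWriteTxnMarkers(EnumSet.of(Operation.DESCRIBE)));
    }
}
```

In this model a principal holding only Describe on the Cluster is still rejected, while existing inter-broker principals that hold ClusterAction are unaffected.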
Migration Plan and Compatibility
- The ACL change is backwards compatible because the API will be allowed with both ClusterAction and Alter permissions on the Cluster resource.
- Since backwards compatibility is maintained in this KIP, users have no need to migrate.
Test Plan
We will add unit and integration tests where appropriate to demonstrate that the API can be called with Alter permission on the Cluster resource.
Rejected Alternatives
- Allow this API to be called with Transactional ID ACL - These permissions are already used by existing producers. A new operation type could be added, but since aborting a hanging transaction is an administrative action, the proposed approach is more suitable.
Footnotes
[1] Allowed APIs with ClusterAction permission on the Cluster: Fetch (for replication only), LeaderAndIsr, OffsetForLeaderEpoch, StopReplica, UpdateMetadata, ControlledShutdown, WriteTxnMarkers