Current state: Under Discussion
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Operators of Apache Kafka clusters have no information about the client software connected to their clusters. Basic information about the connected clients, such as their name and version, would help operators to 1) troubleshoot misbehaving clients and 2) understand the impact of a broker upgrade on their clients and reach out to the client owners proactively.
ApiVersionsRequest is bumped to version 3 with two new fields.
ApiVersionsResponse is bumped to version 3 but does not have any changes in the schema.
We will add a few metrics in the broker to surface information about the connected clients.
|Metric||Type||Description||Can be plotted?|
|kafka.server:type=ClientMetrics,name=ConnectedClients||Gauge<Integer>||The total number of connected clients.||Yes|
|kafka.server:type=ClientMetrics,name=ConnectedClients,clientname=([-.\w]+),clientversion=([-.\w]+)||Gauge<Integer>||The number of connected clients, broken down by clientname and clientversion. It gives an overview of the clients.|
The metric will be removed when it goes back to zero, that is, when all the clients with a given name and version are disconnected.
The clients connected to the broker where each Map represents a connection with the following metadata:
|No - Operators can get the active connections via JMX by using a tool such as jmxterm.|
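To illustrate the lifecycle of the per-client gauge described above, here is a minimal Java sketch that counts connections per name and version and drops the entry once the count returns to zero. The class `ConnectedClientsRegistry` and its methods are hypothetical and not part of the Kafka code base; a real implementation would presumably register and unregister gauges with the broker's metrics registry rather than keep a plain map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical helper (not a Kafka class): tracks connected clients by name and version. */
final class ConnectedClientsRegistry {
    // One entry per (clientname, clientversion) pair; an entry exists only while its count is > 0.
    private final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();

    // Builds a key that mirrors the metric tags shown in the table.
    private static String key(String clientName, String clientVersion) {
        return "clientname=" + clientName + ",clientversion=" + clientVersion;
    }

    /** Called when a connection announces its name and version. */
    void onConnect(String clientName, String clientVersion) {
        counts.merge(key(clientName, clientVersion), 1, Integer::sum);
    }

    /** Called when a connection closes; returning null removes the entry at zero. */
    void onDisconnect(String clientName, String clientVersion) {
        counts.computeIfPresent(key(clientName, clientVersion),
                (k, v) -> v > 1 ? v - 1 : null);
    }

    /** Value of the per-client gauge, or 0 if the gauge does not exist. */
    int count(String clientName, String clientVersion) {
        return counts.getOrDefault(key(clientName, clientVersion), 0);
    }

    /** True while a gauge exists for the given name and version. */
    boolean isRegistered(String clientName, String clientVersion) {
        return counts.containsKey(key(clientName, clientVersion));
    }

    /** Value of the untagged ConnectedClients gauge. */
    int total() {
        return counts.values().stream().mapToInt(Integer::intValue).sum();
    }
}
```

In this sketch, two connections from the same client name and version share one entry, and the entry disappears as soon as the last of them disconnects.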
While the Request Log is not a public interface, it is worth mentioning that we will enrich it with the Client Name and the Client Version.
The idea is to re-use the existing ApiVersions Request to provide the name and the version of the client to the broker.
The client is responsible for providing its name and its version. For the Java client, we will rely on the metadata defined in the `kafka/kafka-version.properties` properties file, which is shipped in the jar.
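As a rough sketch of how a Java client could obtain its version from that file, the code below loads the classpath resource and falls back to a placeholder when the file or the key is missing. The class `ClientSoftwareInfo`, the `version` property key, the `unknown` fallback, and passing the client name as a parameter are all assumptions made for illustration; the proposal does not specify them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

/** Hypothetical holder for the name and version a client would announce. */
final class ClientSoftwareInfo {
    // Resource path taken from the proposal; the leading slash makes it classpath-absolute.
    static final String RESOURCE = "/kafka/kafka-version.properties";
    // Assumed fallback when the file or the key is missing.
    static final String UNKNOWN = "unknown";

    final String name;
    final String version;

    private ClientSoftwareInfo(String name, String version) {
        this.name = name;
        this.version = version;
    }

    /** Reads the version from already-loaded properties; "version" is an assumed key. */
    static ClientSoftwareInfo fromProperties(Properties props, String name) {
        String version = props.getProperty("version", UNKNOWN).trim();
        return new ClientSoftwareInfo(name, version);
    }

    /** Loads the properties file from the classpath, tolerating its absence. */
    static ClientSoftwareInfo load(String name) {
        Properties props = new Properties();
        try (InputStream in = ClientSoftwareInfo.class.getResourceAsStream(RESOURCE)) {
            if (in != null) {
                props.load(in);
            }
        } catch (IOException e) {
            // Ignore and fall back to UNKNOWN: version reporting should never break the client.
        }
        return fromProperties(props, name);
    }
}
```

Separating `fromProperties` from `load` keeps the parsing testable without a jar on the classpath.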
In the broker, the name and the version of the client will be attached to the connection alongside existing metadata such as the principal. This makes them reusable for later purposes (e.g. extending existing metrics).
Metrics will be added and the request log will be extended to make the metadata available to the operators.
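The broker-side idea of attaching the client name and version to the connection can be sketched as a small immutable value object. Everything here is hypothetical: the class `ConnectionMetadata`, its method names, the `unknown` and `ANONYMOUS` placeholders, and the log fragment layout are illustrative assumptions, not the actual broker classes or the real Request Log format.

```java
/** Hypothetical per-connection metadata; not an actual broker class. */
final class ConnectionMetadata {
    final String principal;
    final String clientName;
    final String clientVersion;

    private ConnectionMetadata(String principal, String clientName, String clientVersion) {
        this.principal = principal;
        this.clientName = clientName;
        this.clientVersion = clientVersion;
    }

    /** State of a fresh connection, before ApiVersionsRequest and authentication. */
    static ConnectionMetadata initial() {
        return new ConnectionMetadata("ANONYMOUS", "unknown", "unknown");
    }

    /** Returns a copy carrying the name and version received in the ApiVersionsRequest. */
    ConnectionMetadata withClientInfo(String name, String version) {
        return new ConnectionMetadata(principal, name, version);
    }

    /** Returns a copy carrying the principal established by authentication. */
    ConnectionMetadata withPrincipal(String newPrincipal) {
        return new ConnectionMetadata(newPrincipal, clientName, clientVersion);
    }

    /** Illustrative fragment that a request log line or a metric tag could reuse. */
    String toLogFragment() {
        return "principal=" + principal
                + ", clientName=" + clientName
                + ", clientVersion=" + clientVersion;
    }
}
```

Because the metadata lives on the connection rather than on each request, any later consumer (metrics, logging) can read it without the client resending it.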
Compatibility, Deprecation, and Migration Plan
What impact (if any) will there be on existing users?
Existing users extracting and parsing the Request Log may have to update their parsing logic to accommodate the new fields.
Put clientName and clientVersion in the RequestHeader
clientName and clientVersion could be sent in every request alongside the clientId in the header. While this would be fairly simple to implement once KIP-482 is implemented, it would make adding more metadata in the future hard and would waste a few bytes in every request for something which does not change within a session.
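To give a rough sense of the per-request cost mentioned above, the following sketch estimates the extra header bytes for two strings. It assumes the classic Kafka protocol string encoding (a 2-byte length prefix followed by the UTF-8 bytes); the exact prefix size depends on the string encoding actually used for the header, and the class `HeaderOverhead` and the sample values are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical estimator for the bytes two extra header strings would add per request. */
final class HeaderOverhead {
    // Assumed encoding: INT16 length prefix followed by the UTF-8 bytes of the string.
    static int stringSize(String s) {
        return 2 + s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Extra bytes added to every request header by clientName and clientVersion. */
    static int perRequestOverhead(String clientName, String clientVersion) {
        return stringSize(clientName) + stringSize(clientVersion);
    }
}
```

For the made-up values `example-client` and `2.4.0`, this gives (2 + 14) + (2 + 5) = 23 bytes on every request, for data that stays constant over the lifetime of the connection.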
Put clientName and clientVersion in the RequestHeader but provide it only once
clientName and clientVersion could be added to the RequestHeader but sent only in the first request to save bytes in the subsequent requests. Concretely, it means sending them in the header of the ApiVersionsRequest in order to have the information as soon as possible in the broker. In that case, why not put them in the ApiVersionsRequest directly? Moreover, it would make the implementation of a client ambiguous.
Add a new request to communicate the client metadata to the broker
Instead of piggybacking on the ApiVersionsRequest, we could implement a new Request/Response only for this purpose. This request would need to be sent as early as possible when the connection is established in order to have the information in the broker. Concretely, it means that it would be sent right after the ApiVersionsRequest/Response round trip and before any other request is sent. It would add another round trip to the broker before the client can proceed with its regular requests. It would also need to be sent before authentication (TLS AuthN aside) and would thus require specific treatment, similarly to the ApiVersionsRequest.