Current state: Accepted
Discussion thread: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
When Kafka is configured to use SSL, the broker will typically support multiple cipher suites. During the SSL handshake, the client and the server negotiate which cipher suite gets used by the connection.
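As a rough illustration of where this information lives on the JVM, the sketch below uses only the standard JSSE API. The class name CipherSuiteInspection and its helper methods are hypothetical and are not part of Kafka. It lists the cipher suites a default SSLEngine would offer, and shows how the negotiated protocol and cipher suite can be read from an SSLSession once a handshake has completed.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLSession;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Kafka.
public class CipherSuiteInspection {

    // Cipher suites that a default JSSE engine would offer during the handshake.
    public static List<String> enabledCipherSuites() {
        try {
            SSLEngine engine = SSLContext.getDefault().createSSLEngine();
            return Arrays.asList(engine.getEnabledCipherSuites());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    // Cipher suites the JSSE provider supports, whether or not they are enabled.
    public static List<String> supportedCipherSuites() {
        try {
            SSLEngine engine = SSLContext.getDefault().createSSLEngine();
            return Arrays.asList(engine.getSupportedCipherSuites());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    // After a completed handshake, the session reports what was negotiated.
    public static String describe(SSLSession session) {
        return session.getProtocol() + "/" + session.getCipherSuite();
    }

    public static void main(String[] args) {
        for (String suite : enabledCipherSuites()) {
            System.out.println(suite);
        }
    }
}
```

The enabled list only shows what could be negotiated. Which suite a given connection actually ends up with depends on the peer, which is why a per-connection view is needed.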
The system administrator might want to know which SSL cipher suites are in use. Some cipher suites are not very secure, and it would be good to verify that they have not been accidentally enabled. Conversely, disabling cipher suites that use a lot of system resources could improve system performance.
However, it is currently difficult to tell which cipher suite is actually in use. There are log messages that describe it, but they are logged only at DEBUG or TRACE level, and the information is not aggregated anywhere. We would like to add a metric to close this gap.
We will add a new metric in the Selector to surface information about what cipher suites are in use. The mbean will be:
It will contain a single value named "connections". This value is the number of currently open connections using the given SSL cipher suite and protocol. If the number of connections drops to 0, the mbean will be removed.
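The counting behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Selector code: the class CipherConnectionCounts and its methods are hypothetical, a map entry stands in for the mbean, and the map is unsynchronized on the assumption that each selector is driven by a single thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical per-selector bookkeeping; a map entry stands in for one mbean.
public class CipherConnectionCounts {

    // Key identifying one (cipher suite, protocol) combination.
    static final class CipherInfo {
        final String cipher;
        final String protocol;

        CipherInfo(String cipher, String protocol) {
            this.cipher = cipher;
            this.protocol = protocol;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof CipherInfo)) return false;
            CipherInfo other = (CipherInfo) o;
            return cipher.equals(other.cipher) && protocol.equals(other.protocol);
        }

        @Override
        public int hashCode() {
            return Objects.hash(cipher, protocol);
        }
    }

    // Assumption: accessed by one thread only, so no synchronization is used.
    private final Map<CipherInfo, Integer> counts = new HashMap<>();

    // Called once the handshake has completed and the suite is known.
    public void connectionOpened(String cipher, String protocol) {
        counts.merge(new CipherInfo(cipher, protocol), 1, Integer::sum);
    }

    // Decrement; returning null from the remapping function removes the entry,
    // mirroring the removal of the mbean when the count reaches 0.
    public void connectionClosed(String cipher, String protocol) {
        counts.computeIfPresent(new CipherInfo(cipher, protocol),
                (key, value) -> value > 1 ? value - 1 : null);
    }

    // Current value of "connections" for the given combination (0 if absent).
    public int connections(String cipher, String protocol) {
        return counts.getOrDefault(new CipherInfo(cipher, protocol), 0);
    }

    // Whether an entry (the stand-in for the mbean) currently exists.
    public boolean isTracked(String cipher, String protocol) {
        return counts.containsKey(new CipherInfo(cipher, protocol));
    }
}
```

Keeping one such structure per selector avoids any cross-thread coordination, at the cost of the administrator having to sum across selectors to get a broker-wide total.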
A typical example of how this will look:
Note that listeners that don't use SSL will not get any additional metrics.
Compatibility, Deprecation, and Migration Plan
There should be no compatibility impact, since this is only adding a new metric.
Rather than creating a metric per Selector, we could create a global metric describing how many connections used a given SSL cipher suite. This might result in fewer metrics to track.
However, on reflection, it seemed better to continue the existing pattern of tracking connection metrics per selector. It is more consistent with the current set of metrics. There may also be cases where the system administrator wants to know how the cipher suites in use differ between listeners, or even between selectors. Aggregating everything would remove this information. Finally, aggregating the metrics would require more synchronization between threads in Kafka, potentially degrading performance.