Current state: Under Discussion
Discussion thread: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Kafka exposes many pluggable APIs that let users provide custom plugins. For complex and critical plugins it is important to have metrics to monitor their behavior. Plugins that want to emit metrics can use the Metrics class from the Kafka API, but a newly created Metrics instance does not inherit the tags from the component it depends on (for example, from a producer in the case of a custom partitioner), nor the registered metrics reporters. As most plugins are configurable, a workaround is to reimplement the metrics reporter logic, and in some cases the tag logic too, but that is cumbersome. In addition, because a separate Metrics instance is created, these metrics are disconnected from the client's, and when multiple clients run in the same JVM, for example multiple producers, it can be hard to identify which client is associated with a given plugin's metrics.
This issue also applies to connectors and tasks in Kafka Connect. For example, MirrorMaker2 creates its own Metrics object and has logic to add the metrics reporters from the configuration.
In this proposal, a "plugin" is an interface users can implement and that is instantiated by Kafka. For example, a class implementing org.apache.kafka.server.policy.CreateTopicPolicy is considered a plugin as it is instantiated by brokers. On the other hand, a class implementing org.apache.kafka.clients.producer.Callback is not considered a plugin as it is instantiated in user logic.
I propose introducing a new interface: Monitorable. If a plugin implements this interface, its withPluginMetrics() method will be called when the plugin is instantiated (after configure() if the plugin also implements Configurable). This allows plugins to add their own metrics to the component (producer, consumer, etc.) that instantiated them.
The PluginMetrics interface has methods to add and remove metrics and sensors. Plugins will only be able to remove metrics and sensors they created. Metrics created via this class will have their group set to "plugins" and include tags that uniquely identify the plugin.
The PluginMetrics interface implements Closeable. Calling close() removes all metrics and sensors created by the plugin. Plugins that create metrics are responsible for calling close() on their PluginMetrics instance to remove them.
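The two interfaces described above could look like the following. This is a minimal self-contained sketch: the names Monitorable, withPluginMetrics() and PluginMetrics come from this proposal, but the method signatures shown here (the real API would use MetricName and Measurable from the Kafka metrics library) and the in-memory implementation are illustrative assumptions, not the final API.

```java
import java.io.Closeable;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the proposed PluginMetrics interface. Metrics registered through
// it go into the "plugins" group with tags identifying the plugin instance.
interface PluginMetrics extends Closeable {
    // Register a metric under the "plugins" group; the implementation adds
    // the tags that uniquely identify the plugin instance.
    void addMetric(String name, double value);

    void removeMetric(String name);

    // Removes every metric and sensor created through this instance.
    @Override
    void close();
}

// Sketch of the proposed Monitorable interface.
interface Monitorable {
    // Called once when the plugin is instantiated, after configure() if the
    // plugin also implements Configurable.
    void withPluginMetrics(PluginMetrics metrics);
}

// Minimal in-memory implementation used only to illustrate the lifecycle.
class InMemoryPluginMetrics implements PluginMetrics {
    private final Map<String, Double> metrics = new LinkedHashMap<>();

    @Override
    public void addMetric(String name, double value) {
        metrics.put(name, value);
    }

    @Override
    public void removeMetric(String name) {
        metrics.remove(name);
    }

    @Override
    public void close() {
        // Remove all metrics registered by the owning plugin.
        metrics.clear();
    }

    public int size() {
        return metrics.size();
    }
}

// A plugin opting into metrics by implementing Monitorable.
class MyCountingPlugin implements Monitorable {
    @Override
    public void withPluginMetrics(PluginMetrics metrics) {
        metrics.addMetric("instantiations", 1.0);
    }
}
```

The key lifecycle point is that close() tears down everything the plugin registered, so the runtime never leaks metrics from released plugin instances.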
New methods will be added to AbstractConfig so new plugin instances implementing Monitorable can get a PluginMetrics instance.
The Converter interface will extend Closeable (like HeaderConverter already does); the logic of the Converter class stays untouched. The Connect runtime will be updated to call close() when converter instances can be released.
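The Converter change can be illustrated with a simplified sketch. The real Converter has conversion methods using Schema and SchemaAndValue types, which are abbreviated here with placeholder signatures; the point is that a no-op default close() keeps existing implementations compiling and working unchanged.

```java
import java.io.Closeable;

// Simplified sketch of the Converter change: only the Closeable addition is
// shown; the real interface's conversion methods are abbreviated.
interface Converter extends Closeable {
    // Placeholder for the existing conversion methods, which are unchanged.
    byte[] fromConnectData(String topic, Object schema, Object value);

    // No-op default so existing Converter implementations keep compiling
    // and working without changes.
    @Override
    default void close() {
    }
}
```

Implementations that do hold resources can override close() to release them when the Connect runtime retires the converter.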
When instantiating a class, if it implements Monitorable, its withPluginMetrics() method will be called. If the class also implements Configurable, withPluginMetrics() will always be called after configure(). Metrics registered by plugins will inherit the prefix/namespace from the current Metrics instance, that is: kafka.producer, kafka.consumer, kafka.connect, kafka.streams or kafka.server. Tags will be added to metrics and sensors created via the PluginMetrics interface to uniquely identify each instance.
For all plugins apart from Connectors, Tasks, Converters, Transformations and Predicates, a tag containing the configuration name (config) will be added. For example, metrics registered by a custom Serializer named MySerializer configured via key.serializer will have the following name: kafka.producer:type=plugins,client-id=producer-1,config=key.serializer,class=MySerializer
For Connectors and Converters, the name of the connector (connector) will be added as a tag. Tasks will have the connector name and the task id (task) added as tags. Transformations and Predicates will have the connector name, the task id and their alias (alias) added as tags. For example, for a task: kafka.connect:type=plugins,class=MyTask,connector=my-sink,task=0
For configurations that accept a list of classes, for example interceptor.classes, if the same class is provided multiple times, their metrics may collide. This is deemed highly unlikely to occur as there is no known use case for providing the same class more than once.
This proposal supersedes KIP-608 which only aimed at providing this functionality to Authorizer implementations. KIP-608 was adopted in 2020 but never implemented.
The goal is to allow this feature to be used by all plugins that are Closeable, AutoCloseable or have a close() method, apart from MetricsReporter, since its instances are created before the Metrics instance. All Connect connector plugins will also be supported. The javadoc of all supported plugins will be updated to mention that they can implement the Monitorable interface to define their own metrics.
- ConfigProvider: config.providers
- AuthenticateCallbackHandler: sasl.client.callback.handler.class, sasl.login.callback.handler.class, sasl.server.callback.handler.class
- Login: sasl.login.class
- SslEngineFactory: ssl.engine.factory.class
- KafkaPrincipalBuilder: principal.builder.class
- ReplicaSelector: replica.selector.class
- AlterConfigPolicy: alter.config.policy.class.name
- Authorizer: authorizer.class.name
- ClientQuotaCallback: client.quota.callback.class
- CreateTopicPolicy: create.topic.policy.class.name
- RemoteLogMetadataManager: remote.log.metadata.manager.class.name
- RemoteStorageManager: remote.log.storage.manager.class.name
- Serializer: key.serializer, value.serializer
- Partitioner: partitioner.class
- ProducerInterceptor: interceptor.classes
- Deserializer: key.deserializer, value.deserializer
- ConsumerInterceptor: interceptor.classes
- ConnectorClientConfigOverridePolicy: connector.client.config.override.policy
- Converter: key.converter, value.converter
- HeaderConverter: header.converter
- ConnectRestExtension: rest.extension.classes
- Serde: default.key.serde, default.list.key.serde.inner, default.list.key.serde.type, default.list.value.serde.inner, default.list.value.serde.type, windowed.inner.class.serde
The following plugins will not support this feature:
| Plugin | Configuration | Reason |
| --- | --- | --- |
| KafkaMetricsReporter | kafka.metrics.reporters | This interface is technically not part of the public API. Since MetricsReporter will not support this feature, it makes sense to not add it to this one either. |
| ConsumerPartitionAssignor | partition.assignment.strategy | This interface does not have a close() method. |
| DeserializationExceptionHandler | default.deserialization.exception.handler | This interface does not have a close() method. |
| ProductionExceptionHandler | default.production.exception.handler | This interface does not have a close() method. |
| TimestampExtractor | default.timestamp.extractor | This interface does not have a close() method. |
| RocksDBConfigSetter | rocksdb.config.setter | This interface does not have a close() method. |
| SecurityProviderCreator | security.providers | This interface does not have a close() method. The instance is passed to java.security.Security and we don't control its lifecycle. |
| ReplicationPolicy | replication.policy.class | MirrorMaker currently uses its own mechanism to emit metrics. This interface does not have a close() method. |
| ConfigPropertyFilter | config.property.filter.class | MirrorMaker currently uses its own mechanism to emit metrics. |
| GroupFilter | group.filter.class | MirrorMaker currently uses its own mechanism to emit metrics. |
| TopicFilter | topic.filter.class | MirrorMaker currently uses its own mechanism to emit metrics. |
| ForwardingAdmin | forwarding.admin.class | MirrorMaker currently uses its own mechanism to emit metrics. |
If we decide that some of these plugins would benefit from being able to emit metrics, we could make the necessary API changes in the future to support them.
For example, consider a custom ProducerInterceptor, MyInterceptor, that implements Monitorable and registers a sensor measuring calls to onSend().
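Such an interceptor could look like the following sketch. This assumes PluginMetrics exposes sensor() and metricName() helpers mirroring the equivalent methods on the existing Metrics class; since the PluginMetrics API is still being defined by this proposal, the exact signatures may differ.

```java
import java.util.Collections;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.CumulativeSum;
import org.apache.kafka.common.metrics.stats.Rate;

public class MyInterceptor<K, V> implements ProducerInterceptor<K, V>, Monitorable {

    private Sensor sensor;

    @Override
    public void withPluginMetrics(PluginMetrics metrics) {
        // Hypothetical helpers: sensor() and metricName() are assumed to
        // mirror the equivalent methods on the existing Metrics class.
        sensor = metrics.sensor("onSend");
        sensor.add(metrics.metricName("rate",
                "Average number of calls per second.", Collections.emptyMap()), new Rate());
        sensor.add(metrics.metricName("total",
                "Total number of calls.", Collections.emptyMap()), new CumulativeSum());
    }

    @Override
    public ProducerRecord<K, V> onSend(ProducerRecord<K, V> record) {
        // Record each send so the rate and total attributes are updated.
        sensor.record();
        return record;
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {}

    @Override
    public void close() {}

    @Override
    public void configure(Map<String, ?> configs) {}
}
```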
If the producer using this plugin has its client-id set to producer-1, the metrics created by this plugin will have the following name: kafka.producer:type=plugins,client-id=producer-1,config=interceptor.classes,class=MyInterceptor and these attributes: rate and total.
Compatibility, Deprecation, and Migration Plan
This is a new feature so it has no deprecation impact. The only significant API change is to Converter, which will now be Closeable. The default close() method ensures that existing implementations keep working.
This feature will be tested using unit and integration tests. For each supported plugin, we will verify that it can implement withPluginMetrics() and that its metrics have the correct tags associated.
- Create a dedicated Metrics instance for plugins: A dedicated instance could have its own prefix/namespace (for example kafka.consumer.plugins). This would allow grouping metrics from all plugins but it requires instantiating another Metrics instance and new metrics reporters.
- Let plugins create their own Metrics instance: Instead of passing the Metrics instance to plugins, we could pass all the values necessary (metrics reporters, configs, etc.) to create and configure a Metrics instance. This is impractical as it requires passing many values around, and plugins would still need logic to use them.
- Provide the Metrics instance to Kafka Connect Connectors and Tasks via their context: We would have two different mechanisms for the same feature, one for Connectors/Tasks and one for all other plugins. Also, using the Connector and Task contexts has an impact on compatibility.
- Update MirrorMaker to use this new mechanism for creating its metrics. This will cause metrics to have different names. If needed this should be tackled in a separate KIP.