Status

Current state: Under Discussion

Discussion thread: here

JIRA KAFKA-15995 - Getting issue details... STATUS

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Kafka exposes many pluggable API for users to bring their custom plugins. For complex and critical plugins it's important to have metrics to monitor their behavior. Plugins wanting to emit metrics can use the Metrics class from the Kafka API but when creating a new Metrics instance it does not inherits the tags from the component it depends on (for example from a producer for a custom partitioner), or the registered metrics reporters. As most plugins are configurable, a workaround is to reimplement the metric reporters logic and in some case for tags too but that is cumbersome. But this is very brittle as some metrics reporters cause issues when instantiated multiple times. This is the case for example with CruiseControlMetricsReporter. Also by creating a separate Metrics instance, these metrics are separate from the client's, and in case multiple clients are running in the same JVM, for example multiple producers, it can be hard to identify the specific client that is associated with some plugin metrics.

This issue also applies to plugins in Kafka Connect. For example the source and checkpoint MirrorMaker2 connectors create their own Metrics objects and have logic to add the metric reporters from the configuration.

In this proposal, a "plugin" is an interface users can implement and that is instantiated by Kafka. For example, a class implementing org.apache.kafka.server.policy.CreateTopicPolicy is considered a plugin as it's instantiated by brokers. On the other hand a class implementing org.apache.kafka.clients.producer.Callback is not considered a plugin as it's instantiated in user logic.

Public Interfaces

I propose introducing a new interface: Monitorable. If a plugin implements this interface, the withPluginMetrics() method will be called when the plugin is instantiated (after configure() if the plugin also implements Configurable). This will allow plugins to add their own metrics to the component (broker, producer, consumer, etc) that instantiated them.

Monitorable.java
package org.apache.kafka.common.metrics;

public interface Monitorable {

    /**
     * Provides a PluginMetrics instance from the component that instantiates the plugin.
     * PluginMetrics can be used by the plugin to register and unregister metrics
     * at any point in their lifecycle prior to their close method being called.
     * Any metrics registered will be automatically removed when the plugin is closed.
     */
    void withPluginMetrics(PluginMetrics metrics);

}

The PluginMetrics interface has methods to add and remove metrics and sensors. Plugins will only be able to remove metrics and sensors they created. Metrics created via this class will have their group set to "plugins" and include tags that uniquely identify the plugin.

public interface PluginMetrics extends Closeable {

    /**
     * Create a {@link MetricName} with the given name, description and tags. The group will be set to "plugins"
     * Tags to uniquely identify the plugins are automatically added to the provided tags
     *
     * @param name        The name of the metric
     * @param description A human-readable description to include in the metric
     * @param tags        additional key/value attributes of the metric
     */
    MetricName metricName(String name, String description, Map<String, String> tags);

    /**
     * Add a metric to monitor an object that implements {@link MetricValueProvider}. This metric won't be associated with any
     * sensor. This is a way to expose existing values as metrics.
     *
     * @param metricName The name of the metric
     * @param metricValueProvider The metric value provider associated with this metric
     * @throws IllegalArgumentException if a metric with same name already exists.
     */
    void addMetric(MetricName metricName, MetricValueProvider<?> metricValueProvider);

    /**
     * Remove a metric if it exists.
     *
     * @param metricName The name of the metric
     * @throws IllegalArgumentException if the plugin has not already created a metric with this name
     */
    void removeMetric(MetricName metricName);

    /**
     * Create a sensor with the given unique name. The name must only be unique for the plugin, so different plugins can use the same names.
     *
     * @param name The sensor name
     * @return The sensor
     * @throws IllegalArgumentException if a sensor with same name already exists for this plugin
     */
    Sensor sensor(String name);

    /**
     * Remove a sensor (if it exists) and its associated metrics.
     *
     * @param name The name of the sensor to be removed
     * @throws IllegalArgumentException if the plugin has not already created a sensor with this name
     */
    void removeSensor(String name);
}

The PluginMetrics interface implements Closeable, and the close() method deletes all the metrics the instance has created. It is automatically closed when the associated instance is closed by the component that owns it.

The Converter interface will extend Closeable (like HeaderConverter already does), the logic of the Converter class stays untouched. The Connect runtime will be updated to call close() when the converter instances can be released.

Converter.java
public interface Converter extends Closeable {

    @Override
    default void close() throws IOException {
        // no-op
    }
}

Proposed Changes

When instantiating a class, Kafka components will wrap the instance in an internal container class. If the instance implements Monitorable, an PluginMetrics object will be created with the right tags and added to the container class and the withPluginMetrics() method will be called with that object. If the class is also Configurable, withPluginMetrics() will be always called after configure(). Metrics registered by plugins will inherit the prefix/namespace from the current Metrics instance, these are: kafka.producer, kafka.consumer, kafka.connect and kafka.server. Tags will be added to metrics and sensors tags created via the PluginMetrics interface to unique identify each instance. When the component closes the plugin instance, the associated PluginMetrics will also be closed so all metrics registered by the plugin are deleted.

For all plugins apart from Connectors, Tasks, Converters, Transformations and Predicates, the configuration name (config) and the plugin class name (class) will be added as tags. For example, metrics registered by a custom Serializer named MySerializer configured via key.serializer will have the following name: kafka.producer:type=plugins,client-id=producer-1,config=key.serializer,class=MySerializer
For configurations that accept a list of classes, for example interceptor.classes, if the same class is provided multiple times, their metrics may collide. This is deemed highly unlikely to occur as there are no use cases where providing multiple times the same class is useful.

For Connectors, the name of the connector will be added as a tag (connector). For example: kafka.connect:type=plugins,connector=my-sink
For Tasks, the name of the connector (connector) and the task id (task) will be added as tags. For example: kafka.connect:type=plugins,connector=my-sink,task=0
For Transformations and Predicates, the name of the connector (connector), the task id (task) and their alias (transformation/predicate) will be added as tags. For example: kafka.connect:type=plugins,connector=my-sink,task=0,predicate=my-predicate
For Converters, the name of the connector (connector), the task id (task), and their type (iskey, either set to true or false) will be added as tags. For example: kafka.connect:type=plugins,connector=my-sink,task=0,iskey=true

KIP-608

This proposal supersedes KIP-608 which only aimed at providing this functionality to Authorizer plugins. KIP-608 was adopted in 2020 but was never implemented. This KIP proposes to alter the names of the metrics from KIP-608 so all plugins use the same naming format:

Full NameTypeDescription
kafka.server:type=plugins,config=authorizer.class.name,class=MyAuthorizer,name=acls-total-countGaugeTotal ACLs created in the broker
kafka.server:type=plugins,config=authorizer.class.name,class=MyAuthorizer,name=authorization-request-rate-per-minuteRateTotal number of authorization requests per minute
kafka.server:type=plugins,config=authorizer.class.name,class=MyAuthorizer,name=authorization-allowed-rate-per-minuteRateTotal number of authorization allowed per minute
kafka.server:type=plugins,config=authorizer.class.name,class=MyAuthorizer,name=authorization-denied-rate-per-minuteRateTotal number of authorization denied per minute 

Supported plugins

The goal is allow this feature to be used by all plugins that are Closeable, AutoCloseable or have a close() methods. Also all Connect plugins will be supported. The javadoc of all supported plugins will be updated to mention they are able to implement the Monitorable interface to define their own metrics.

Common

- ConfigProvider: config.providers
- AuthenticateCallbackHandler: sasl.client.callback.handler.class, sasl.login.callback.handler.class, sasl.server.callback.handler.class
- Login: sasl.login.class
- SslEngineFactory: ssl.engine.factory.class

Broker

- KafkaPrincipalBuilder: principal.builder.class
- ReplicaSelector: replica.selector.class
- AlterConfigPolicy: alter.config.policy.class.name
- Authorizer: authorizer.class.name
- ClientQuotaCallback: client.quota.callback.class
- CreateTopicPolicy: create.topic.policy.class.name
- RemoteLogMetadataManager: remote.log.metadata.manager.class.name
- RemoteStorageManager: remote.log.storage.manager.class.name

Producer

- Serializer: key.serializer, value.serializer
- Partitioner: partitioner.class
- ProducerInterceptor: interceptor.classes

Consumer

- Deserializer: key.deserializer, value.deserializer
- ConsumerInterceptor: interceptor.classes

Connect

- ConnectorClientConfigOverridePolicy: connector.client.config.override.policy
- Converter: key.converter, value.converter
- HeaderConverter: header.converter
- ConnectRestExtension: rest.extension.classes
- Connector
- Task
- Transformation
- Predicate

Streams

The only Closeable interface Streams supports today is Serde. Users can instantiate their own Serdes instance so Streams does not control the lifecycle of instances. For that reason this proposal ignores Streams plugins.

Unsupported Plugins

The following plugins will not support this feature:

ClassConfigReason
KafkaMetricsReporterkafka.metrics.reportersThis interface is technically not part of the public API. Since MetricsReporter will not support it, it makes sense to not add it to this one either.
ConsumerPartitionAssignorpartition.assignment.strategyThis interface does not have a close() method. This interface is being deprecated by KIP-848, hence I don't propose updating it.
SecurityProviderCreatorsecurity.providersThis interface does not have a close() method. The instance is passed to java.security.Security and we don't control its lifecycle.
ReplicationPolicyreplication.policy.classMirrorMaker currently uses its own mechanism to emit metrics. This interface does not have a close() method.
ConfigPropertyFilterconfig.property.filter.classMirrorMaker currently uses its own mechanism to emit metrics.
GroupFiltergroup.filter.classMirrorMaker currently uses its own mechanism to emit metrics.
TopicFiltertopic.filter.classMirrorMaker currently uses its own mechanism to emit metrics.
ForwardingAdminforwarding.admin.classMirrorMaker currently uses its own mechanism to emit metrics.

If we decide that some of these plugins would benefit from being able to emit metrics, we could make the necessary changes in the future to support them.

Example usage

For example we can create a custom ProducerInterceptor that track how many records it intercepts:

public class MyInterceptor<K, V> implements ProducerInterceptor<K, V>, Monitorable {

    private Sensor sensor;

    public void withPluginMetrics(PluginMetrics metrics) {
        sensor = metrics.sensor("onSend");
        MetricName rate = metrics.metricName("rate", "Average number of calls per second.", Collections.emptyMap());
        MetricName total = metrics.metricName("total", "Total number of calls.", Collections.emptyMap());
        sensor.add(rate, new Rate());
        sensor.add(total, new CumulativeCount());
    }

    @Override
    public ProducerRecord<K, V> onSend(ProducerRecord<K, V> record) {
        sensor.record();
        return record;
    }
}

If the producer using this plugin has its client-id set to producer-1, the metrics created by this plugin will have the following name: kafka.producer:type=plugins,client-id=producer-1,config=interceptor.classes,class=MyInterceptor, and these attributes: rate and total.

Compatibility, Deprecation, and Migration Plan

This is a new feature so it has no impact on compatibility. The only significant API change is for Converter that will now be Closeable. The default close() method should ensure that existing implementations still work.

Test Plan

This feature will be tested using unit and integration tests. For each supported plugin, we will verify they can implement the withPluginMetrics() and that metrics have the correct tags associated.

Rejected Alternatives

  • Create a dedicated Metrics instance for plugins: A dedicated instance could have its own prefix/namespace (for example kafka.consumer.plugins). This would allow grouping metrics from all plugins but it requires instantiating another Metrics instance and new metrics reporters.
  • Let plugins create their own Metrics instance: Instead of passing the Metrics instance to plugins we could pass all the values necessary (metrics reporters, configs, etc ...) to create and configure a Metrics instance. This is impractical as it requires passing a lot of values around and plugins still have to have logic to use them.
  • Provide the Metrics instance to Kafka Connect Connectors and Tasks via their context: We would have 2 different mechanisms, one for Connectors/Tasks and one of all other plugins, for the same feature. Also using the Connector and Task contexts has an impact on compatibility.
  • Update MirrorMaker to use this new mechanism for creating its metrics. This will cause metrics to have different names. If needed this should be tackled in a separate KIP.
  • No labels