Kafka runs on the JVM, but the Kafka ecosystem has no JVM-based exporter. I wrote one with Spring Boot for my own work and am happy to share it.
Status
Current state: Under Discussion
Discussion thread: here [Change the link from the KIP proposal email archive to your own email thread]
JIRA: here [Change the link from KAFKA-1 to your own ticket]
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Kafka is an excellent message queue and data pipeline that runs on the JVM, but it has no JVM-based exporter. For the future of the Kafka ecosystem,
Apache needs an official exporter like https://github.com/apache/rocketmq-exporter.
I wrote one for my own work and hope to contribute it to Apache. JMX exposes a large number of metrics, and the exporter's config controls which of them are collected.
Public Interfaces
How to configure the exporter
Common config
server:
  port: 5650
spring:
  application:
    name: kafka-exporter
  profiles:
    active: dev
  http:
    encoding:
      charset: UTF-8
      enabled: true
      force: true
logging:
  config: classpath:logback.xml
task:
  count: 8
  brokerTopicMetrics:
    cron: 1/15 * * * * ?
  lagMetrics:
    cron: 1/15 * * * * ?
  jvmMetrics:
    cron: 1/15 * * * * ?
  replicaMetrics:
    cron: 1/15 * * * 12 ?  ## month field is 12, so this task fires only in December
  networkMetrics:
    cron: 1/15 * * * * ?
  logFlushMetrics:
    cron: 1/15 * * * * ?
  kafkaControllerMetrics:
    cron: 1/15 * * * 12 ?  ## fires only in December
  kafkaClusterMetrics:
    cron: 1/15 * * * 12 ?  ## fires only in December
kafka-exporter:
  kafka-versions.0.10.2.0: 1  ## different Kafka versions use different API versions
  kafka-versions.0.10.1.1: 1  ## different Kafka versions use different API versions
  kafka-versions.1.0.0: 1
  canSendToPaladin: true
  ## allowCollectMetrics and forbidCollectMetricNames apply to the tasks configured above
  allowCollectMetrics.brokerTopicMetrics:
    - kafka.server:type=BrokerTopicMetrics,name=*
    - kafka.server:type=BrokerTopicMetrics,name=*,topic=*
  forbidCollectMetricNames.brokerTopicMetrics:
    - FetchMessageConversionsPerSec
  allowCollectMetrics.jvmMetrics:
    - java.lang:type=GarbageCollector,name=*
    - java.lang:type=Threading
  forbidCollectMetricNames.jvmMetrics:
    - Code Cache
  allowCollectMetrics.replicaMetrics:
    - kafka.server:type=ReplicaManager,name=*
  forbidCollectMetricNames.replicaMetrics:
    - aa
  allowCollectMetrics.networkMetrics:
    - kafka.network:type=RequestMetrics,name=*,request=*
    - kafka.network:type=RequestMetrics,name=*,request=*,version=*  # for 2.0.0
    - kafka.network:type=SocketServer,name=*
    - kafka.network:type=RequestChannel,name=*
    - kafka.server:type=KafkaRequestHandlerPool,name=*
  forbidCollectMetricNames.networkMetrics:
    - MessageConversionsTimeMs  # normally, use the metric name
    - TemporaryMemoryBytes
    - ThrottleTimeMs
    - TotalTimeMs
    - LocalTimeMs
    - RemoteTimeMs
    - RequestBytes
    - ResponseQueueTimeMs
    - ResponseSendTimeMs
  forbidCollectMetricNames.RequestMetrics:
    - AlterConfigs
    - AlterReplicaLogDirs
    - ApiVersions
    - ControlledShutdown
    - CreateAcls
    - CreateDelegationToken
    - DeleteAcls
    - DeleteRecords
    - DescribeAcls
    - DescribeConfigs
    - DescribeDelegationToken
    - DescribeLogDirs
    - EndTxn
    - ExpireDelegationToken
    - InitProducerId
    - OffsetForLeaderEpoch
    - RenewDelegationToken
    - SaslAuthenticate
    - SaslHandshake
    - StopReplica
    - TxnOffsetCommit
    - WriteTxnMarkers
    - AddOffsetsToTxn
  allowCollectMetrics.logFlushMetrics:
    - kafka.log:type=LogFlushStats,name=LogFlushRateAndTimeMs
    - kafka.log:type=LogCleanerManager,name=*
  forbidCollectMetricNames.logFlushMetrics:
    - aa
  allowCollectMetrics.kafkaControllerMetrics:
    - kafka.controller:type=KafkaController,name=*
  forbidCollectMetricNames.kafkaControllerMetrics:
    - aa
  allowCollectMetrics.kafkaClusterMetrics:
    - kafka.cluster:type=Partition,name=*,topic=*,partition=*
  forbidCollectMetricNames.kafkaClusterMetrics:
    - aa
  jmx-excludes-metrics.brokerTopicMetrics:
    - aa
  jmx-excludes-attrs.BrokerTopicMetrics:
    - aa
  jmx-excludes-attrs-global:
    - EventType
    - RateUnit
    - LatencyUnit
    - 50thPercentile
    - 75thPercentile
    - 98thPercentile
    - LastGcInfo
    - MemoryPoolNames
    - ObjectName
    - Valid
    - Name
    - ThreadAllocatedMemoryEnabled
    - ThreadAllocatedMemorySupported
    - ThreadContentionMonitoringEnabled
    - AllThreadIds
    - ThreadCpuTimeSupported
    - ThreadCpuTimeEnabled
    - ThreadContentionMonitoringSupported
    - CurrentThreadCpuTimeSupported
    - ObjectMonitorUsageSupported
    - SynchronizerUsageSupported
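The allow entries in the common config are JMX ObjectName patterns, so the standard `javax.management.ObjectName.apply` method can test whether an MBean is selected. The following is a minimal sketch, not the exporter's actual code: the class `MetricFilter`, the method `shouldCollect`, and the rule that a forbid entry is compared against the MBean's `name` key are all assumptions made for illustration.

```java
import java.util.List;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper: decides whether an MBean should be collected.
public class MetricFilter {

    // Returns true when the MBean matches at least one allow pattern and
    // its "name" key is not listed in forbidNames (assumed semantics).
    public static boolean shouldCollect(String mbean,
                                        List<String> allowPatterns,
                                        List<String> forbidNames) {
        try {
            ObjectName candidate = new ObjectName(mbean);
            String metricName = candidate.getKeyProperty("name");
            if (metricName != null && forbidNames.contains(metricName)) {
                return false;
            }
            for (String pattern : allowPatterns) {
                // apply() treats the receiver as a pattern and the argument
                // as a concrete name.
                if (new ObjectName(pattern).apply(candidate)) {
                    return true;
                }
            }
            return false;
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid ObjectName", e);
        }
    }
}
```

A pattern such as `kafka.server:type=BrokerTopicMetrics,name=*` matches only MBeans with exactly the keys `type` and `name`, which would explain why the config lists a second pattern with `topic=*` for the per-topic MBeans.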
Kafka clusters you want to monitor
kafka-exporter:
  zookeepers:
    - cluster-name: cluster-name-of-your-kafka-brokers  ## cluster name
      zk-ip-and-port: 127.0.0.1:2181,127.0.0.2:2181  ## ZooKeeper addresses
      zk-kafka-path: /kafka  ## ZooKeeper namespace
      excludes-topics.BrokerTopicMetrics:
        - aaa
        - bbb
        - beexiao(.*?)
      jmx-excludes-metrics.BrokerTopicMetrics:
        - aa
        - bb
      jmx-excludes-metrics.RequestMetrics:
        - AlterConfigs
        - AlterReplicaLogDirs
        - ApiVersions
        - ControlledShutdown
        - CreateAcls
        - CreateDelegationToken
        - DeleteAcls
        - DeleteRecords
        - DescribeAcls
        - DescribeConfigs
        - DescribeDelegationToken
        - DescribeLogDirs
        - EndTxn
        - ExpireDelegationToken
        - InitProducerId
        - OffsetForLeaderEpoch
        - RenewDelegationToken
        - SaslAuthenticate
        - SaslHandshake
        - StopReplica
        - TxnOffsetCommit
        - WriteTxnMarkers
        - AddOffsetsToTxn
      jmx-excludes-attrs.BrokerTopicMetrics:
        - EventType
        - RateUnit
      jmx-excludes-attrs.GarbageCollector:
        - LastGcInfo
        - MemoryPoolNames
        - ObjectName
        - Valid
        - Name
      jmx-excludes-attrs.ReplicaManager:
        - EventType
        - RateUnit
      jmx-excludes-attrs.RequestMetrics:
        - EventType
        - RateUnit
        - FifteenMinuteRate
        - FiveMinuteRate
        - 75thPercentile
        - 98thPercentile
      jmx-excludes-attrs.LogFlushRateAndTimeMs:
        - LatencyUnit
        - RateUnit
        - EventType
        - FifteenMinuteRate
        - 50thPercentile
        - 75thPercentile
        - 98thPercentile
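The `excludes-topics` entries mix literal names (`aaa`) with what looks like a regular expression (`beexiao(.*?)`). One plausible reading is that every entry is a regex that must match the whole topic name. The sketch below illustrates that reading only; the class `TopicExcluder`, the method `isExcluded`, and the full-match semantics are assumptions, not confirmed behavior of the exporter.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: checks a topic against the excludes-topics list.
public class TopicExcluder {

    // Returns true when any entry, read as a regex, matches the entire
    // topic name (assumed semantics).
    public static boolean isExcluded(String topic, List<String> excludes) {
        for (String entry : excludes) {
            if (Pattern.matches(entry, topic)) {
                return true;
            }
        }
        return false;
    }
}
```

Under this reading, `beexiao(.*?)` excludes every topic whose name starts with `beexiao`, while `aaa` excludes only the topic named exactly `aaa`.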
Metric names exported so far
kafka_BrokerTopicMetrics_BytesInPerSec_Count
kafka_BrokerTopicMetrics_BytesInPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_BytesInPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_BytesInPerSec_MeanRate
kafka_BrokerTopicMetrics_BytesInPerSec_OneMinuteRate
kafka_BrokerTopicMetrics_BytesOutPerSec_Count
kafka_BrokerTopicMetrics_BytesOutPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_BytesOutPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_BytesOutPerSec_MeanRate
kafka_BrokerTopicMetrics_BytesOutPerSec_OneMinuteRate
kafka_BrokerTopicMetrics_BytesRejectedPerSec_Count
kafka_BrokerTopicMetrics_BytesRejectedPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_BytesRejectedPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_BytesRejectedPerSec_MeanRate
kafka_BrokerTopicMetrics_BytesRejectedPerSec_OneMinuteRate
kafka_BrokerTopicMetrics_FailedFetchRequestsPerSec_Count
kafka_BrokerTopicMetrics_FailedFetchRequestsPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_FailedFetchRequestsPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_FailedFetchRequestsPerSec_MeanRate
kafka_BrokerTopicMetrics_FailedFetchRequestsPerSec_OneMinuteRate
kafka_BrokerTopicMetrics_FailedProduceRequestsPerSec_Count
kafka_BrokerTopicMetrics_FailedProduceRequestsPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_FailedProduceRequestsPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_FailedProduceRequestsPerSec_MeanRate
kafka_BrokerTopicMetrics_FailedProduceRequestsPerSec_OneMinuteRate
kafka_BrokerTopicMetrics_MessagesInPerSec_Count
kafka_BrokerTopicMetrics_MessagesInPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_MessagesInPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_MessagesInPerSec_MeanRate
kafka_BrokerTopicMetrics_MessagesInPerSec_OneMinuteRate
kafka_BrokerTopicMetrics_ProduceMessageConversionsPerSec_Count
kafka_BrokerTopicMetrics_ProduceMessageConversionsPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_ProduceMessageConversionsPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_ProduceMessageConversionsPerSec_MeanRate
kafka_BrokerTopicMetrics_ProduceMessageConversionsPerSec_OneMinuteRate
kafka_BrokerTopicMetrics_ReplicationBytesInPerSec_Count
kafka_BrokerTopicMetrics_ReplicationBytesInPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_ReplicationBytesInPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_ReplicationBytesInPerSec_MeanRate
kafka_BrokerTopicMetrics_ReplicationBytesInPerSec_OneMinuteRate
kafka_BrokerTopicMetrics_ReplicationBytesOutPerSec_Count
kafka_BrokerTopicMetrics_ReplicationBytesOutPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_ReplicationBytesOutPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_ReplicationBytesOutPerSec_MeanRate
kafka_BrokerTopicMetrics_ReplicationBytesOutPerSec_OneMinuteRate
kafka_BrokerTopicMetrics_TotalFetchRequestsPerSec_Count
kafka_BrokerTopicMetrics_TotalFetchRequestsPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_TotalFetchRequestsPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_TotalFetchRequestsPerSec_MeanRate
kafka_BrokerTopicMetrics_TotalFetchRequestsPerSec_OneMinuteRate
kafka_BrokerTopicMetrics_TotalProduceRequestsPerSec_Count
kafka_BrokerTopicMetrics_TotalProduceRequestsPerSec_FifteenMinuteRate
kafka_BrokerTopicMetrics_TotalProduceRequestsPerSec_FiveMinuteRate
kafka_BrokerTopicMetrics_TotalProduceRequestsPerSec_MeanRate
kafka_BrokerTopicMetrics_TotalProduceRequestsPerSec_OneMinuteRate
kafka_GarbageCollector_G1_Old_Generation_CollectionCount
kafka_GarbageCollector_G1_Old_Generation_CollectionTime
kafka_GarbageCollector_G1_Young_Generation_CollectionCount
kafka_GarbageCollector_G1_Young_Generation_CollectionTime
kafka_KafkaController_ActiveControllerCount_Value
kafka_KafkaController_ControllerState_Value
kafka_KafkaController_GlobalPartitionCount_Value
kafka_KafkaController_GlobalTopicCount_Value
kafka_KafkaController_OfflinePartitionsCount_Value
kafka_KafkaController_PreferredReplicaImbalanceCount_Value
kafka_KafkaRequestHandlerPool_RequestHandlerAvgIdlePercent_Count
kafka_KafkaRequestHandlerPool_RequestHandlerAvgIdlePercent_FifteenMinuteRate
kafka_KafkaRequestHandlerPool_RequestHandlerAvgIdlePercent_FiveMinuteRate
kafka_KafkaRequestHandlerPool_RequestHandlerAvgIdlePercent_MeanRate
kafka_KafkaRequestHandlerPool_RequestHandlerAvgIdlePercent_OneMinuteRate
kafka_LogCleanerManager_max_dirty_percent_Value
kafka_LogCleanerManager_time_since_last_run_ms_Value
kafka_LogFlushStats_LogFlushRateAndTimeMs_95thPercentile
kafka_LogFlushStats_LogFlushRateAndTimeMs_999thPercentile
kafka_LogFlushStats_LogFlushRateAndTimeMs_99thPercentile
kafka_LogFlushStats_LogFlushRateAndTimeMs_Count
kafka_LogFlushStats_LogFlushRateAndTimeMs_FifteenMinuteRate
kafka_LogFlushStats_LogFlushRateAndTimeMs_FiveMinuteRate
kafka_LogFlushStats_LogFlushRateAndTimeMs_Max
kafka_LogFlushStats_LogFlushRateAndTimeMs_Mean
kafka_LogFlushStats_LogFlushRateAndTimeMs_MeanRate
kafka_LogFlushStats_LogFlushRateAndTimeMs_Min
kafka_LogFlushStats_LogFlushRateAndTimeMs_OneMinuteRate
kafka_LogFlushStats_LogFlushRateAndTimeMs_StdDev
kafka_Partition_InSyncReplicasCount_Value
kafka_Partition_LastStableOffsetLag_Value
kafka_Partition_ReplicasCount_Value
kafka_Partition_UnderMinIsr_Value
kafka_Partition_UnderReplicated_Value
kafka_ReplicaManager_FailedIsrUpdatesPerSec_Count
kafka_ReplicaManager_FailedIsrUpdatesPerSec_FifteenMinuteRate
kafka_ReplicaManager_FailedIsrUpdatesPerSec_FiveMinuteRate
kafka_ReplicaManager_FailedIsrUpdatesPerSec_MeanRate
kafka_ReplicaManager_FailedIsrUpdatesPerSec_OneMinuteRate
kafka_ReplicaManager_IsrExpandsPerSec_Count
kafka_ReplicaManager_IsrExpandsPerSec_FifteenMinuteRate
kafka_ReplicaManager_IsrExpandsPerSec_FiveMinuteRate
kafka_ReplicaManager_IsrExpandsPerSec_MeanRate
kafka_ReplicaManager_IsrExpandsPerSec_OneMinuteRate
kafka_ReplicaManager_IsrShrinksPerSec_Count
kafka_ReplicaManager_IsrShrinksPerSec_FifteenMinuteRate
kafka_ReplicaManager_IsrShrinksPerSec_FiveMinuteRate
kafka_ReplicaManager_IsrShrinksPerSec_MeanRate
kafka_ReplicaManager_IsrShrinksPerSec_OneMinuteRate
kafka_ReplicaManager_LeaderCount_Value
kafka_ReplicaManager_OfflineReplicaCount_Value
kafka_ReplicaManager_PartitionCount_Value
kafka_ReplicaManager_UnderMinIsrPartitionCount_Value
kafka_ReplicaManager_UnderReplicatedPartitions_Value
kafka_RequestChannel_RequestQueueSize_Value
kafka_RequestChannel_ResponseQueueSize_Value
kafka_RequestMetrics_RequestQueueTimeMs_95thPercentile
kafka_RequestMetrics_RequestQueueTimeMs_999thPercentile
kafka_RequestMetrics_RequestQueueTimeMs_99thPercentile
kafka_RequestMetrics_RequestQueueTimeMs_Count
kafka_RequestMetrics_RequestQueueTimeMs_Max
kafka_RequestMetrics_RequestQueueTimeMs_Mean
kafka_RequestMetrics_RequestQueueTimeMs_Min
kafka_RequestMetrics_RequestQueueTimeMs_StdDev
kafka_RequestMetrics_RequestsPerSec_Count
kafka_RequestMetrics_RequestsPerSec_FifteenMinuteRate
kafka_RequestMetrics_RequestsPerSec_FiveMinuteRate
kafka_RequestMetrics_RequestsPerSec_MeanRate
kafka_RequestMetrics_RequestsPerSec_OneMinuteRate
kafka_SocketServer_MemoryPoolAvailable_Value
kafka_SocketServer_MemoryPoolUsed_Value
kafka_SocketServer_NetworkProcessorAvgIdlePercent_Value
kafka_Threading_CurrentThreadCpuTime
kafka_Threading_CurrentThreadUserTime
kafka_Threading_DaemonThreadCount
kafka_Threading_PeakThreadCount
kafka_Threading_ThreadCount
kafka_Threading_TotalStartedThreadCount
kafka_consumer_lag
kafka_topic_partitions
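Most names in the list above appear to follow the shape `kafka_<type>_<name>_<attribute>`, built from the MBean's `type` and `name` keys plus the JMX attribute, with characters such as spaces and hyphens replaced by underscores. The sketch below reproduces that shape. It is inferred from the listed names alone; the class `MetricNames` and the method `toPrometheusName` are hypothetical, and `kafka_consumer_lag` and `kafka_topic_partitions` clearly come from a different code path.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper: derives an exported metric name from an MBean.
public class MetricNames {

    // Joins "kafka", the MBean's type, its name (if present) and the
    // attribute with underscores, sanitizing other characters.
    public static String toPrometheusName(String mbean, String attribute) {
        ObjectName objectName;
        try {
            objectName = new ObjectName(mbean);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid ObjectName", e);
        }
        String[] parts = {
            objectName.getKeyProperty("type"),
            objectName.getKeyProperty("name"),  // null for e.g. Threading
            attribute
        };
        StringBuilder result = new StringBuilder("kafka");
        for (String part : parts) {
            if (part != null) {
                result.append('_').append(part.replaceAll("[^A-Za-z0-9_]", "_"));
            }
        }
        return result.toString();
    }
}
```

For example, the MBean `java.lang:type=GarbageCollector,name=G1 Old Generation` with attribute `CollectionCount` yields `kafka_GarbageCollector_G1_Old_Generation_CollectionCount`, which matches the corresponding entry in the list.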
Proposed Changes
Build an entirely new kafka-exporter for Kafka that runs on the JVM.
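To give a rough idea of what a JVM-based exporter does at its core, the sketch below reads every numeric attribute of the MBeans that match a pattern. It is only an illustration: it queries the local platform MBean server so that it runs without a broker, whereas a real exporter would presumably connect to each broker's remote JMX port. The class `JmxSnapshot` and the method `numericAttributes` are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper: snapshots numeric JMX attributes of this JVM.
public class JmxSnapshot {

    // Returns "<type>_<attribute>" -> value for every readable numeric
    // attribute of the MBeans matching the given ObjectName pattern.
    public static Map<String, Double> numericAttributes(String pattern) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Map<String, Double> values = new TreeMap<>();
        try {
            for (ObjectName name : server.queryNames(new ObjectName(pattern), null)) {
                for (MBeanAttributeInfo info : server.getMBeanInfo(name).getAttributes()) {
                    if (!info.isReadable()) {
                        continue;
                    }
                    try {
                        Object value = server.getAttribute(name, info.getName());
                        if (value instanceof Number) {
                            String key = name.getKeyProperty("type") + "_" + info.getName();
                            values.put(key, ((Number) value).doubleValue());
                        }
                    } catch (Exception e) {
                        // Some attributes are unsupported on a given JVM; skip them.
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("JMX query failed", e);
        }
        return values;
    }
}
```

Calling `JmxSnapshot.numericAttributes("java.lang:type=Threading")` returns entries such as `Threading_ThreadCount`, which correspond to the `kafka_Threading_*` metrics in the list of metric names.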
Compatibility, Deprecation, and Migration Plan
- What impact (if any) will there be on existing users? Answer: Users can monitor their Kafka clusters more easily with a Java-based Prometheus exporter and collect as many metrics as they want.
- If we are changing behavior how will we phase out the older behavior? Answer: Prometheus is a very good monitoring system for middleware such as Kafka, and your ops team may already use it.
- If we need special migration tools, describe them here. Answer: Some Prometheus servers and the Prometheus Alertmanager.
- When will we remove the existing behavior? Answer: Once all exporters run stably and all metrics can be viewed in a UI such as Grafana.

Rejected Alternatives
Let's do this!
