...
- Kafka Streams - the built-in stream processing library of the Apache Kafka project
- Kafka Streams Ecosystem:
  - Complex Event Processing (CEP): https://github.com/fhussonnois/kafkastreams-cep
  - Fluent Kafka Streams Test: https://github.com/bakdata/fluent-kafka-streams-tests (blog post: https://medium.com/bakdata/fluent-kafka-streams-tests-e641785171ec)
- Storm - A stream-processing framework.
- Samza - A YARN-based stream processing framework.
- Storm Spout - Consume messages from Kafka and emit as Storm tuples
- Kafka-Storm - Kafka 0.8, Storm 0.9, Avro integration
- Spark Streaming - Kafka receiver supports Kafka 0.8 and above
- Flink - Apache Flink has an integration with Kafka
- IBM Streams - A stream processing framework with Kafka source and sink to consume and produce Kafka messages
- Spring Cloud Stream - A framework for building event-driven microservices
- Spring Cloud Data Flow - A cloud-native orchestration service for Spring Cloud Stream applications
- Apache Apex - Stream processing framework with connectors for Kafka as source and sink.
- Logstash - Input and Output plugins to enrich events and optionally store in Elasticsearch
Hadoop Integration
- Confluent HDFS Connector - A sink connector for the Kafka Connect framework for writing data from Kafka to Hadoop HDFS
- Camus - LinkedIn's Kafka=>HDFS pipeline. This one is used for all data at LinkedIn, and works great.
- Kafka Hadoop Loader - A different take on Hadoop loading functionality from what is included in the main distribution.
- Flume - Contains Kafka source (consumer) and sink (producer)
- KaBoom - A high-performance HDFS data loader
...
- Confluent JDBC Connector - A source connector for the Kafka Connect framework for writing data from RDBMS (e.g. MySQL) to Kafka
- Oracle GoldenGate Connector - Source connector that collects CDC operations via GoldenGate and writes them to Kafka
Search and Query
- Elasticsearch - This project, Kafka Standalone Consumer, reads messages from Kafka, then processes and indexes them in Elasticsearch. There are also several Kafka Connect connectors for Elasticsearch.
- Presto - The Presto Kafka connector allows you to query Kafka in SQL using Presto.
- Hive - Hive SerDe that allows querying Kafka (Avro only for now) using Hive SQL
...
- syslog (1M)
- syslog producer - A producer that supports both raw data and protobuf with metadata for deep analytics usage.
- syslog-ng (https://syslog-ng.org/) is one of the most widely used open source log collection tools, capable of filtering, classifying, and parsing log data and forwarding it to a wide variety of destinations. Kafka is a first-class destination in syslog-ng; for details on the integration, see https://czanik.blogs.balabit.com/2015/11/kafka-and-syslog-ng/
- klogd - A Python syslog publisher
- klogd2 - A Java syslog publisher
- Tail2Kafka - A simple log tailing utility
- Fluentd plugin - Integration with Fluentd
- Remote log viewer
- Logstash integration - Integration with Logstash and Fluentd
- Syslog Collector written in Go
- Klogger - A simple proxy service for Kafka.
- fuse-kafka - A file system logging agent based on Kafka
- omkafka - Another syslog integration, written in C and using the librdkafka library
- logkafka - Collect logs and send lines to Apache Kafka
- Filebeat Kafka Module - Collect and ship Kafka logs to Elasticsearch (docs)
Flume - Kafka plugins
- Flume Kafka Plugin - Integration with Flume
- Kafka as a sink and source in Flume - Integration with Flume
...
- Mozilla Metrics Service - A Kafka and Protocol Buffers based metrics and logging system
- Ganglia Integration
- SPM for Kafka
- Coda Hale Metric Reporter to Kafka
- kafka-dropwizard-reporter - Register built-in Kafka client and stream metrics to Dropwizard Metrics
- Metricbeat Kafka Module - Capture and ship Kafka consumer group and partition metrics to Elasticsearch (docs)
Packaging and Deployment
- RPM packaging
- Debian packaging: https://github.com/tomdz/kafka-deb-packaging
- Puppet Integration
- Dropwizard packaging
...
- Kafka Websocket - A proxy that interoperates with websockets for delivering Kafka data to browsers.
- KafkaCat - A native, command-line producer and consumer.
- Kafka Mirror - An alternative to the built-in mirroring tool
- Ruby Demo App
- Apache Camel Integration
- Infobright integration
- Riemann Consumer of Metrics
- stormkafkamom - A curses-based tool that displays the state of Apache Storm-based Kafka consumers (Kafka 0.7 only).
- uReplicator - Provides the ability to replicate across Kafka clusters in other data centers
- Mirus - A tool for distributed, high-volume replication between Apache Kafka clusters based on Kafka Connect
- libbeat - All Elastic Beats (Metricbeat, Filebeat, etc.) have Kafka outputs