...

Kafka Streams - the built-in stream processing library of the Apache Kafka project
Kafka Streams Ecosystem:
- Complex Event Processing (CEP): https://github.com/fhussonnois/kafkastreams-cep
- Fluent Kafka Streams Test: https://github.com/bakdata/fluent-kafka-streams-tests (blog post: https://medium.com/bakdata/fluent-kafka-streams-tests-e641785171ec)
Storm - A stream-processing framework.
Samza - A YARN-based stream processing framework.
Storm Spout - Consume messages from Kafka and emit as Storm tuples
Kafka-Storm - Kafka 0.8, Storm 0.9, Avro integration
Spark Streaming - Kafka receiver supports Kafka 0.8 and above
Flink - Apache Flink has an integration with Kafka
IBM Streams - A stream processing framework with Kafka source and sink to consume and produce Kafka messages
Spring Cloud Stream - A framework for building event-driven microservices
Spring Cloud Data Flow - A cloud-native orchestration service for Spring Cloud Stream applications
Apache Apex - Stream processing framework with connectors for Kafka as source and sink.
Logstash - Input and Output plugins to enrich events and optionally store in Elasticsearch

Confluent HDFS Connector - A sink connector for the Kafka Connect framework for writing data from Kafka to Hadoop HDFS
Camus - LinkedIn's Kafka=>HDFS pipeline. This one is used for all data at LinkedIn, and works great.
Kafka Hadoop Loader - A different take on Hadoop loading functionality from what is included in the main distribution.
Flume - Contains Kafka source (consumer) and sink (producer)
KaBoom - A high-performance HDFS data loader

...

Confluent JDBC Connector - A source connector for the Kafka Connect framework for writing data from RDBMS (e.g. MySQL) to Kafka
Oracle Golden Gate Connector - Source connector that collects CDC operations via Golden Gate and writes them to Kafka

Elasticsearch - This project, Kafka Standalone Consumer, reads messages from Kafka, processes them, and indexes them in Elasticsearch. There are also several Kafka Connect connectors for Elasticsearch.
Presto - The Presto Kafka connector allows you to query Kafka in SQL using Presto.
Hive - Hive SerDe that allows querying Kafka (Avro only for now) using Hive SQL

...

syslog (1M)
- syslog producer: A producer that supports both raw data and protobuf with metadata for deep analytics usage.
- syslog-ng (https://syslog-ng.org/) is one of the most widely used open source log collection tools, capable of filtering, classifying, parsing log data and forwarding it to a wide variety of destinations. Kafka is a first-class destination in the syslog-ng tool; details on the integration can be found at https://czanik.blogs.balabit.com/2015/11/kafka-and-syslog-ng/ .
klogd - A Python syslog publisher
klogd2 - A Java syslog publisher
Tail2Kafka - A simple log tailing utility
Fluentd plugin - Integration with Fluentd
Remote log viewer
Logstash integration - Integration with Logstash and Fluentd
Syslog Collector written in Go
Klogger - A simple proxy service for Kafka.
fuse-kafka - A file system logging agent based on Kafka
omkafka - Another syslog integration, this one written in C and using the librdkafka library
logkafka - Collect logs and send lines to Apache Kafka
Filebeat Kafka Module - Collect and ship Kafka logs to Elasticsearch (docs)

...

Mozilla Metrics Service - A Kafka and Protocol Buffers based metrics and logging system
Ganglia Integration
SPM for Kafka
Coda Hale Metric Reporter to Kafka
kafka-dropwizard-reporter - Register built-in Kafka client and stream metrics to Dropwizard Metrics
Metricbeat Kafka Module - Capture and ship Kafka consumergroup and partition metrics to Elasticsearch (docs)

...

Kafka Websocket - A proxy that interoperates with websockets for delivering Kafka data to browsers.
KafkaCat - A native, command line producer and consumer.
Kafka Mirror - An alternative to the built-in mirroring tool
Ruby Demo App
Apache Camel Integration
Infobright integration
Riemann Consumer of Metrics
stormkafkamom - A curses-based tool that displays the state of Apache Storm-based Kafka consumers (Kafka 0.7 only).
uReplicator - Provides the ability to replicate across Kafka clusters in other data centers
Mirus - A tool for distributed, high-volume replication between Apache Kafka clusters based on Kafka Connect
libbeat - All Elastic Beats (Metricbeat, Filebeat, etc) have Kafka outputs
