Here is a list of tools we have been told about that integrate with Kafka outside the main distribution. We haven't tried them all, so they may not work!
Clients, of course, are listed separately here.
Distributions & Packaging
- Confluent Platform 1.0 - http://confluent.io/product/. Downloads - http://confluent.io/downloads/.
- Cloudera Kafka source https://github.com/cloudera-labs/kafka/tree/cdh5-0.8.2_1.1.0 and release http://www.cloudera.com/content/cloudera/en/developers/home/cloudera-labs/apache-kafka.html
- Hortonworks Kafka source ??? and release http://hortonworks.com/hadoop/kafka/
- Stratio Kafka source ??? for Ubuntu http://repository.stratio.com/sds/1.1/ubuntu/13.10/binary/ and for RHEL http://repository.stratio.com/sds/1.1/RHEL/
Stream Processing
- Storm - A stream-processing framework.
- Samza - A YARN-based stream processing framework.
- Storm Spout - Consume messages from Kafka and emit as Storm tuples
- Kafka-Storm - Kafka 0.8, Storm 0.9, Avro integration
- Spark Streaming - Kafka receiver supports Kafka 0.8 and above
- IBM Streams - A stream processing framework with Kafka source and sink to consume and produce Kafka messages
Hadoop Integration
- Camus - LinkedIn's Kafka=>HDFS pipeline. This one is used for all data at LinkedIn, and works great.
- Kafka Hadoop Loader - A different take on Hadoop loading functionality from what is included in the main distribution.
- Flume - Contains Kafka Source (consumer) and Sink (producer)
- KaBoom - A high-performance HDFS data loader
Search and Query
- ElasticSearch - This project, Kafka Standalone Consumer, reads messages from Kafka, processes them, and indexes them in ElasticSearch.
- Presto - The Presto Kafka connector allows you to query Kafka in SQL using Presto.
- Hive - Hive SerDe that allows querying Kafka (Avro only for now) using Hive SQL
Management Consoles
- Kafka Manager - A tool for managing Apache Kafka.
- kafkat - Simplified command-line administration for Kafka brokers.
- Kafka Web Console - Displays information about your Kafka cluster, including which nodes are up and what topics they host data for.
- Kafka Offset Monitor - Displays the state of all consumers and how far behind the head of the stream they are.
- Capillary - Displays the state and deltas of Kafka-based Apache Storm topologies. Supports Kafka >= 0.8. It also provides an API for fetching this information for monitoring purposes.
AWS Integration
- Automated AWS deployment
- Kafka -> S3 Mirroring tool from Pinterest.
- Alternative Kafka->S3 Mirroring tool
Logging
- Syslog producer - A syslog producer that supports both raw data and protobuf with metadata for deep analytics usage.
- syslog-ng (https://syslog-ng.org/) - One of the most widely used open source log collection tools, capable of filtering, classifying, and parsing log data and forwarding it to a wide variety of destinations. As of release 3.7.1 (https://github.com/balabit/syslog-ng/releases/tag/syslog-ng-3.7.1), it can deliver messages to Kafka using the official Java client libraries, and it provides a way to set the native Kafka producer configuration file.
- klogd - A Python syslog publisher
- klogd2 - A Java syslog publisher
- Tail2Kafka - A simple log tailing utility
- Fluentd plugin - Integration with Fluentd
- Remote log viewer
- LogStash integration - Integration with LogStash and Fluentd
- Syslog Collector written in Go
- Klogger - A simple proxy service for Kafka.
- fuse-kafka - A file system logging agent based on Kafka
- omkafka - Another syslog integration, this one written in C and using the librdkafka library
- logkafka - Collect logs and send lines to Apache Kafka
Flume - Kafka plugins
- Flume Kafka Plugin - Integration with Flume
- Kafka as a sink and source in Flume - Integration with Flume
Metrics
- Mozilla Metrics Service - A Kafka and Protocol Buffers based metrics and logging system
- Ganglia Integration
- SPM for Kafka
- Coda Hale Metric Reporter to Kafka
Packaging and Deployment
- RPM packaging
- Debian packaging - https://github.com/tomdz/kafka-deb-packaging
- Puppet Integration
- Dropwizard packaging
Kafka Camel Integration
Miscellaneous
- Kafka Websocket - A proxy that interoperates with websockets for delivering Kafka data to browsers.
- KafkaCat - A native command-line producer and consumer.
- Kafka Mirror - An alternative to the built-in mirroring tool
- Ruby Demo App
- Apache Camel Integration
- Infobright integration
- Riemann Consumer of Metrics
- stormkafkamom - A curses-based tool that displays the state of Apache Storm-based Kafka consumers (Kafka 0.7 only).