Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple, flexible architecture based on streaming data flows, and it is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. Its simple, extensible data model allows for online analytic applications. See the user guide for details.

Flume is written primarily in Java and has been tested on the following Unix-like systems:

  • Ubuntu 9.04+ (DEB compatible)
  • CentOS 5.3+ (RPM compatible)
  • RHEL 5.5+
  • SLES 11
  • Mac OS X
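
To make the streaming data flow architecture described above more concrete, here is a minimal sketch of a single-agent configuration that moves events from a source, through a channel, into an HDFS sink. It assumes the Flume 1.x properties-file format; the agent name (a1), the component names (r1, c1, k1), the port, and the HDFS path are hypothetical placeholders chosen for illustration and do not come from this page.

```properties
# Hypothetical agent "a1" with one source, one channel, and one sink.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: reads newline-terminated text from a TCP port (assumed values).
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: buffers events in memory between the source and the sink.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# Sink: writes events to HDFS (placeholder NameNode host and path).
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode.example.com/flume/events

# Wire the source and sink to the channel.
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

A memory channel is used here only to keep the sketch short. It trades durability for speed, so a durable channel type is the more likely choice where the tunable reliability mentioned above matters.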






