Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple, flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple, extensible data model that allows for online analytic applications. See the user guide for details.
It is written primarily in Java and has been tested on the following Unix-like systems:
- Ubuntu 9.04+ (DEB compatible)
- CentOS 5.3+ (RPM compatible)
- RHEL 5.5+
- SLES 11
- Mac OS X