This page highlights Flume features and their maturity, based on stories gathered from the community.

The states are:

  • (tick) Recommended - feature is known to be in daily use in multiple places, and recommended for use. The API is stable.
  • (warning) Beta - feature is usable and known to be in use in one or two places, but may have some issues or limitations. The API is close to stable.
  • (question) Experimental/Development - feature is in development and will likely have issues. API may change significantly. Be prepared to go into code.
  • (info) Testing - for debugging and testing purposes only.

v0.9.4 (pre-Apache release)

Flume Master features

||Feature||Status||Notes||
||Master Mode|| || ||
|Single Master ZK-backed mode|(tick)|Default. Recommended mode of use.|
|Single Master Memory-backed mode|(info)|Useful for development and debugging.|
|Multi-Master ZK-backed mode|(question)|Used by a handful of users, but has known limitations (e.g. no support for automatic agent-collector mapping).|
||Automatic configurations|| || ||
|Automatic agent-collector mapping|(question)|Must use Single Master mode. All logical nodes must be up before this feature will work. May have problems with reconfiguration, or if the configuration is written before the logical nodes have reported.|
|Automatic flow isolation|(question)|Must use Single Master mode. Depends on automatic agent-collector mapping.|
||Metrics|| || ||
|Master JSON Metrics|(warning)|Looking for feedback and Nagios/Munin/etc. integration stories.|

Flume Node features

||Client||Status||Notes||
|Linux clients|(tick)|Core development and most users are here.|
|Windows client|(question)|Support is at a fairly early stage.|

||Feature||Status||Notes||
||Agents and Collectors|| || ||
|agentBESink|(tick)| |
|agentDFOSink|(tick)| |
|agentE2ESink / agentSink|(tick)| |
|collectorSource|(tick)| |
|collectorSink("hdfs://...",..)|(tick)|HDFS is the recommended target file system.|
|collectorSink("s3n://...",..)|(tick)*|In use, but has a known issue due to how S3 files are "closed".|
|collectorSink("file:///...",..)|(info)|Intended for testing, but could be used in production.|
|agentBEChain|(warning)|Needs known production use stories.|
|agentDFOChain|(warning)|Needs known production use stories.|
|agentE2EChain|(warning)|Needs known production use stories.|
|auto*Chain|(question)|Has a limitation at the master, to be addressed in future versions.|
||Sources|| || ||
|thriftSource|(tick)|Default RPC. Recommended mode of use.|
|avroSource|(warning)|Needs known production use stories.|
|syslogTCP source|(tick)| |
|tail/multiTail/tailDir|(tick)*|Known to have some duplication issues, but has been used in production settings. If duplication is encountered, known workarounds include using the exec source or having applications write to Flume via RPC.|
|exec source|(tick)| |
|log4j appender sources| | |
|text|(info)| |
||Sinks|| || ||
|thriftSink|(tick)|Default RPC. Recommended mode of use.|
|avroSink|(warning)|Needs known production use stories.|
|attr2hbase, hbase sink plugin|(warning)|Known to be in use in a few places.|
|seqfile sink|(info)|Used internally (E2E mode).|
|dfs sink|(info)|Used internally.|
||Metrics|| || ||
|Node JSON metrics|(question)|Looking for feedback and Nagios/Munin/etc. integration stories. This API is likely to change as we receive feedback.|
