An initial goal was a concise mechanism that lets users write simple configurations or macros that expand to a specification of low-level policy decorators and sinks. It needed to support recursive chaining and tree-like fanout (for failover and fanout handlers). A declarative language with a manipulable abstract syntax tree (AST) mapped well to these requirements, so the AST was chosen as the data structure on which transformations and macro expansion are performed. The generated ASTs are used to instantiate all of the Sources and potentially long chains of Sinks.
The parse is broken into a few phases: lexing, parsing to an AST, translating the AST, and then instantiating a Flume source or sink chain. While some prefer the concise syntax, a more explicit syntax could be created as long as the generated AST is the same.
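The four phases can be sketched with a toy language. This is a minimal illustration, not Flume's actual code: the "a => b => c" chain syntax, the class PipelineSketch, the right-nested node shape, and the "stdout" to "console" alias used as the translation step are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy four-phase pipeline for specs like "a => b => c"; not Flume's real classes. */
class PipelineSketch {
  /** Minimal AST node: a type tag, a name, and child nodes (shape invented here). */
  static final class Node {
    final String type;
    final String text;
    final List<Node> children = new ArrayList<>();

    Node(String type, String text, Node... kids) {
      this.type = type;
      this.text = text;
      for (Node k : kids) children.add(k);
    }

    /** LISP-style rendering, e.g. (DECO a (SINK b)). */
    @Override public String toString() {
      StringBuilder sb = new StringBuilder("(").append(type).append(' ').append(text);
      for (Node c : children) sb.append(' ').append(c);
      return sb.append(')').toString();
    }
  }

  /** Phase 1, lexing: split the spec on "=>" into trimmed names. */
  static List<String> lex(String spec) {
    List<String> tokens = new ArrayList<>();
    for (String part : spec.split("=>", -1)) {
      String name = part.trim();
      if (name.isEmpty()) throw new IllegalArgumentException("empty element in: " + spec);
      tokens.add(name);
    }
    return tokens;
  }

  /** Phase 2, parsing: the last token is the terminal sink; earlier ones wrap it. */
  static Node parse(List<String> tokens) {
    Node ast = new Node("SINK", tokens.get(tokens.size() - 1));
    for (int i = tokens.size() - 2; i >= 0; i--) {
      ast = new Node("DECO", tokens.get(i), ast);
    }
    return ast;
  }

  /** Phase 3, translation: rebuild the tree, expanding a hypothetical alias. */
  static Node translate(Node n) {
    boolean alias = "SINK".equals(n.type) && "stdout".equals(n.text);
    Node out = new Node(n.type, alias ? "console" : n.text);
    for (Node c : n.children) out.children.add(translate(c));
    return out;
  }

  /** Phase 4, instantiation: here we only describe the chain, outermost first. */
  static String instantiate(Node root) {
    StringBuilder sb = new StringBuilder(root.text);
    Node n = root;
    while (!n.children.isEmpty()) {
      n = n.children.get(0);
      sb.append(" -> ").append(n.text);
    }
    return sb.toString();
  }

  /** Runs all four phases in order. */
  static String build(String spec) {
    return instantiate(translate(parse(lex(spec))));
  }

  public static void main(String[] args) {
    System.out.println(parse(lex("decoA => decoB => stdout")));
    System.out.println(build("decoA => decoB => stdout"));
  }
}
```

Because each phase only consumes the output of the previous one, a different surface syntax would only need a new lex and parse step that produces the same tree.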
The core classes responsible for this include:
- FlumeBuilder (responsible for calling the ANTLR-generated code and for AST instantiation)
- FlumeSpecGen (responsible for converting ASTs back to the Flume config language)
- FlumePatterns (AST tree-pattern matching library useful for AST transformations)
- SinkFactory / SinkFactoryImpl (calls the parsing functions to instantiate sinks and decos from strings. Main extension point for new sinks and decos)
- SourceFactory / SourceFactoryImpl (calls the parsing functions to instantiate sources from strings. Main extension point for new sources)
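The "extension point" role of the factories can be pictured as a registry that maps a name used in a spec string to a builder. The sketch below is a hypothetical illustration only: SinkRegistrySketch, its Sink interface, and the register/create methods are invented names and do not reproduce the real SinkFactory API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical name-to-builder registry; the real SinkFactory API may differ. */
class SinkRegistrySketch {
  /** Stand-in for an event sink. */
  interface Sink {
    void append(String event);
  }

  private final Map<String, Function<String[], Sink>> builders = new HashMap<>();

  /** Extension point: register a builder under the name used in spec strings. */
  void register(String name, Function<String[], Sink> builder) {
    builders.put(name, builder);
  }

  /** Looks up a builder by name and instantiates a sink with the given arguments. */
  Sink create(String name, String... args) {
    Function<String[], Sink> builder = builders.get(name);
    if (builder == null) throw new IllegalArgumentException("unknown sink: " + name);
    return builder.apply(args);
  }

  /** Registers one made-up sink, sends it two events, and returns what it stored. */
  static List<String> demo() {
    List<String> stored = new ArrayList<>();
    SinkRegistrySketch factory = new SinkRegistrySketch();
    // "tagged" is an invented sink that prefixes each event with its first argument.
    factory.register("tagged", a -> event -> stored.add(a[0] + event));
    Sink sink = factory.create("tagged", "x:");
    sink.append("hello");
    sink.append("world");
    return stored;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

In a design like this, adding a new sink means one register call, and the parser never needs to know the concrete classes.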
Here is the core ANTLR 3-specific parser grammar for the v0.9.4 language. The main difference from standard grammars is the use of -> ^(TOKEN arg1 arg2) constructs. These are ANTLR-specific rewrite mechanisms that allow us to restructure concrete syntax trees into abstract syntax trees as the parse occurs. The whole file lives in
There are currently a few points of "hackery" in the BNF.
- All sinks are wrapped with a DECO token AST node, so we always get
(DECO (SINK foo)) instead of just
- The special collector construct and roll construct should probably be consolidated
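One consequence of the DECO wrapping is that code walking the AST has to look through the wrapper to reach the sink. The sketch below shows one way a tree-pattern check could do that. It is an assumption-laden illustration: the Node class, isTrivialDeco, and unwrap are invented here, and it assumes that the form a consumer wants is the bare SINK node, which the text above does not state.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy tree-pattern check for the DECO wrapper; not the FlumePatterns API. */
class DecoUnwrapSketch {
  /** Minimal labeled tree, printed LISP-style like an ANTLR tree dump. */
  static final class Node {
    final String label;
    final List<Node> children = new ArrayList<>();

    Node(String label) {
      this.label = label;
    }

    @Override public String toString() {
      if (children.isEmpty()) return label; // leaves print as their bare label
      StringBuilder sb = new StringBuilder("(").append(label);
      for (Node c : children) sb.append(' ').append(c);
      return sb.append(')').toString();
    }
  }

  /** Convenience constructor: node("SINK", node("foo")) prints as (SINK foo). */
  static Node node(String label, Node... kids) {
    Node n = new Node(label);
    for (Node k : kids) n.children.add(k);
    return n;
  }

  /** Pattern: a DECO node whose only child is a SINK node. */
  static boolean isTrivialDeco(Node n) {
    return "DECO".equals(n.label)
        && n.children.size() == 1
        && "SINK".equals(n.children.get(0).label);
  }

  /** Returns the wrapped SINK for a trivial DECO wrapper, otherwise the node itself. */
  static Node unwrap(Node n) {
    return isTrivialDeco(n) ? n.children.get(0) : n;
  }

  public static void main(String[] args) {
    Node wrapped = node("DECO", node("SINK", node("foo")));
    System.out.println(wrapped);
    System.out.println(unwrap(wrapped));
  }
}
```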