Overview
FEP-1 describes a method of implementing Flume NG (1.x.x) backward compatibility with Flume OG (0.9.x).
The goal of FEP-1 is to allow existing Flume installations running the 0.9.x branch (Flume OG) to interoperate with the 1.x.x branch (Flume NG). It describes a series of sources and sinks that act as compatibility bridges, mimicking the behavior of 0.9.x within 1.x.
Authors
E. Sammer - esammer@cloudera.com
Status
This is a Draft.
Status | Meaning |
---|---|
Draft | We're still fighting about it, refining the FEP. |
Proposed | We think this is what it should be. Let's vote. |
Accepted | Everyone agrees this is the right way to do this. Let's code. |
Rejected | Oops. People don't like this. Go back to drafting. |
Abandoned | Oops. The author(s) / champions disappeared or gave up. |
Description
With the release of Flume NG (1.x), users who currently run Flume OG (0.9.x) have started asking about backward compatibility. FEP-1 describes a method of integration between 0.9.x and 1.x.
First, this FEP makes a few assumptions.
- No code in 0.9.x can be changed. If it could, users could simply upgrade to 1.x rather than redeploying an updated 0.9.x.
- Integration can be constrained to the RPC mechanism. In other words, the way one connects 0.9.x agents to 1.x agents is via RPC.
- The contents of 0.9.x write-ahead logs (WALs) will either be drained to a 1.x agent or discarded.
What's not compatible between 0.9.x and 1.x?
- Events are represented differently and have a different interface
- Source and sink APIs are different
- The RPC mechanisms are different
The proposal is to create a 1.x source that mimics the 0.9.x Thrift RPC server. Existing 0.9.x clients could then connect to any 1.x agent configured with this compatibility source. Each event the compatibility source receives would have to be transformed into a 1.x format event, although this isn't very different from receiving an event from syslog or any other third-party system.
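The event transformation described above can be sketched as follows. This is a minimal illustration, not the proposed implementation: `LegacyEvent`, `NgEvent`, and `LegacyEventConverter` are hypothetical stand-ins, and the sketch assumes a 0.9.x event carries a body plus metadata (timestamp, priority, host, and free-form fields) while a 1.x event is a set of string headers plus an opaque body. The header names are illustrative only.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a 0.9.x event as received over the legacy RPC. */
final class LegacyEvent {
    final long timestampMillis;
    final String priority;
    final String host;
    final byte[] body;
    final Map<String, byte[]> fields;

    LegacyEvent(long timestampMillis, String priority, String host,
                byte[] body, Map<String, byte[]> fields) {
        this.timestampMillis = timestampMillis;
        this.priority = priority;
        this.host = host;
        this.body = body;
        this.fields = fields;
    }
}

/** Hypothetical stand-in for a 1.x event: string headers plus an opaque body. */
final class NgEvent {
    final Map<String, String> headers;
    final byte[] body;

    NgEvent(Map<String, String> headers, byte[] body) {
        this.headers = headers;
        this.body = body;
    }
}

final class LegacyEventConverter {
    /** Maps 0.9.x metadata onto 1.x headers and copies the body unchanged. */
    static NgEvent convert(LegacyEvent legacy) {
        Map<String, String> headers = new HashMap<>();
        // Free-form fields go in first, decoded as UTF-8 (an assumption).
        for (Map.Entry<String, byte[]> field : legacy.fields.entrySet()) {
            headers.put(field.getKey(),
                        new String(field.getValue(), StandardCharsets.UTF_8));
        }
        // Fixed metadata is written last, so it wins on a name collision.
        headers.put("timestamp", Long.toString(legacy.timestampMillis));
        headers.put("priority", legacy.priority);
        headers.put("host", legacy.host);
        return new NgEvent(headers, legacy.body.clone());
    }
}
```

Writing the fixed metadata after the free-form fields is a design choice of this sketch; an implementation could just as well prefix legacy fields to avoid collisions altogether.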
With respect to 0.9.x's reliability modes, there are some caveats. When 0.9.x is configured to use end-to-end reliability, events are normally retained until they reach their final destination. In the proposed scenario, 1.x wouldn't be able to provide the ACK at the point the event is written to its final destination. Instead, the event would be ACKed immediately. The expectation is that the 1.x agent would be configured with a durable channel implementation, which maintains the delivery guarantee. This allows the 0.9.x agent to believe the event has been delivered and discard the event from its WAL.
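One plausible reading of the immediate ACK is sketched below: the compatibility source acknowledges an event as soon as the channel has accepted it, and reports an error otherwise so that the 0.9.x agent keeps the event in its WAL. All names here (`SketchChannel`, `AckStatus`, `CompatibilityHandler`, `InMemoryChannel`) are hypothetical; a real 1.x channel has a richer transactional API, and the in-memory channel is a test stand-in that is not durable.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical channel abstraction: put() returns only once the event is stored. */
interface SketchChannel {
    void put(byte[] eventBody) throws Exception;
}

/** Hypothetical reply sent back to the 0.9.x sender. */
enum AckStatus { ACK, ERROR }

final class CompatibilityHandler {
    private final SketchChannel channel;

    CompatibilityHandler(SketchChannel channel) {
        this.channel = channel;
    }

    /** ACKs only after the channel accepted the event; never waits for the final destination. */
    AckStatus append(byte[] eventBody) {
        try {
            channel.put(eventBody);
            return AckStatus.ACK;
        } catch (Exception e) {
            // No ACK: the sender is expected to retain the event and retry.
            return AckStatus.ERROR;
        }
    }
}

/** Non-durable stand-in used only to exercise the handler. */
final class InMemoryChannel implements SketchChannel {
    final List<byte[]> stored = new ArrayList<>();

    @Override
    public void put(byte[] eventBody) {
        stored.add(eventBody);
    }
}
```

With a durable channel in place of `InMemoryChannel`, an ACK from `append` would mean the event survives an agent restart, which is the property the 0.9.x sender relies on when it discards the event.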
Clients that speak 0.9.x's native Thrift RPC protocol can, in theory, speak directly to the compatibility source. This should satisfy most, if not all, client cases.
Note that the inverse deployment, where 1.x sends events to a 0.9.x agent, loses some of the delivery guarantees and is purposefully not included in this proposal (because it's a bad idea). The delivery guarantee is lost because 1.x is permitted to release an event from its channel as soon as it receives a successful ACK from the next hop in a delivery flow. This means that, unlike with 0.9.x, an event may not make it to its final destination. All of that said, it would be trivial to implement a 1.x sink that speaks 0.9.x's RPC protocol.
References
TODO