Overview

FEP-1 describes a method of implementing Flume NG (1.x.x) backward compatibility with Flume OG (0.9.x).

The goal of FEP-1 is to allow existing Flume installations running the 0.9.x branch (Flume OG) to interoperate with the 1.x.x branch (Flume NG). It describes a series of sources and sinks that act as compatibility bridges, mimicking the behavior of 0.9.x within 1.x.

Authors

E. Sammer - esammer@cloudera.com

Status

This is a Draft

Draft

We're still fighting about it, refining the FEP.

Proposed

We think this is what it should be. Let's vote.

Accepted

Everyone agrees this is the right way to do this. Let's code.

Rejected

Oops. People don't like this. Go back to drafting.

Abandoned

Oops. The author(s) / champions disappeared or gave up.

Description

With the release of Flume NG (1.x), users who currently run Flume OG (0.9.x) have started asking about backward compatibility. FEP-1 describes a method of integration between 0.9.x and 1.x.

First, this FEP makes a few assumptions.

  • No code in 0.9.x can be changed. If it could, users could simply upgrade to 1.x rather than redeploying an updated 0.9.x.
  • Integration can be constrained to the RPC mechanism. In other words, the way one connects 0.9.x agents to 1.x agents is via RPC.
  • The contents of 0.9.x write-ahead logs (WALs) will either be drained to a 1.x agent or discarded.

What's not compatible between 0.9.x and 1.x?

  • Events are represented differently and have a different interface
  • Source and sink APIs are different
  • The RPC mechanisms are different

The proposal is to create a 1.x source that mimics the 0.9.x Thrift RPC server. This would allow existing 0.9.x clients to connect to 1.x agents that are configured with the compatibility source. Each event the compatibility source receives would have to be transformed into a 1.x format event, although this isn't very different from receiving an event from syslog or any other third-party system.
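The transformation step might look roughly like the sketch below. Everything in it is an assumption made for illustration: `LegacyEvent`, `NgEvent`, and `LegacyEventConverter` are hypothetical stand-ins rather than Flume classes, the 0.9.x event is assumed to carry a timestamp, priority, host, body, and a map of binary attributes, and the 1.x event is assumed to be a map of string headers plus a byte-array body. The header names are likewise invented.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a 0.9.x event as received over RPC (not a Flume class).
final class LegacyEvent {
    final long timestampMillis;
    final String priority;
    final String host;
    final byte[] body;
    final Map<String, byte[]> fields;

    LegacyEvent(long timestampMillis, String priority, String host,
                byte[] body, Map<String, byte[]> fields) {
        this.timestampMillis = timestampMillis;
        this.priority = priority;
        this.host = host;
        this.body = body;
        this.fields = fields;
    }
}

// Hypothetical stand-in for a 1.x event: string headers plus an opaque body.
final class NgEvent {
    final Map<String, String> headers;
    final byte[] body;

    NgEvent(Map<String, String> headers, byte[] body) {
        this.headers = Collections.unmodifiableMap(headers);
        this.body = body;
    }
}

final class LegacyEventConverter {
    // Header names below are assumptions chosen for this sketch.
    static NgEvent convert(LegacyEvent in) {
        Map<String, String> headers = new HashMap<>();
        // Copy free-form attributes first so the fixed metadata wins on a name clash.
        for (Map.Entry<String, byte[]> f : in.fields.entrySet()) {
            headers.put(f.getKey(), new String(f.getValue(), StandardCharsets.UTF_8));
        }
        headers.put("timestamp", Long.toString(in.timestampMillis));
        headers.put("priority", in.priority);
        headers.put("host", in.host);
        // Defensive copy so later changes to the legacy buffer do not leak through.
        return new NgEvent(headers, in.body.clone());
    }
}

final class LegacyEventConverterDemo {
    public static void main(String[] args) {
        Map<String, byte[]> fields = new HashMap<>();
        fields.put("service", "nginx".getBytes(StandardCharsets.UTF_8));
        LegacyEvent in = new LegacyEvent(System.currentTimeMillis(), "INFO", "web01",
                "GET /index.html".getBytes(StandardCharsets.UTF_8), fields);
        NgEvent out = LegacyEventConverter.convert(in);
        System.out.println(out.headers + " body=" + new String(out.body, StandardCharsets.UTF_8));
    }
}
```

Flattening the fixed 0.9.x metadata into headers keeps the 1.x side free of any 0.9.x types, which is the same treatment an event from syslog or another external system would get.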

With respect to 0.9.x's reliability modes, there are some caveats. When configured to use end-to-end reliability (0.9.x), events are normally retained until they reach the final destination. In the proposed scenario, 1.x wouldn't be able to provide the ACK at the point the event is written to its final destination. Instead, the event would be ACKed immediately. The expectation is that the 1.x agent would be configured with a durable channel implementation, which maintains the delivery guarantee from that point on. This allows the 0.9.x agent to believe the event has been delivered and discard it from its WAL.
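One way to read "immediately ACKed" is that the compatibility source acknowledges an event as soon as it has been committed to the local channel, without waiting for downstream delivery. The sketch below illustrates that reading only. `DurableChannel`, `AppendStatus`, `CompatibilitySourceHandler`, and `InMemoryChannel` are hypothetical names, not Flume APIs, and the in-memory channel exists only to make the example runnable (it is not durable).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical transactional channel contract (not the Flume Channel API).
interface DurableChannel {
    void begin();
    void put(byte[] event);
    void commit();
    void rollback();
}

// Hypothetical reply sent back to the 0.9.x agent.
enum AppendStatus { ACK, RETRY }

final class CompatibilitySourceHandler {
    private final DurableChannel channel;

    CompatibilitySourceHandler(DurableChannel channel) {
        this.channel = channel;
    }

    // ACK only after the channel commit succeeds; from then on the
    // channel, not the 0.9.x WAL, is responsible for the event.
    AppendStatus append(byte[] event) {
        channel.begin();
        try {
            channel.put(event);
            channel.commit();
            return AppendStatus.ACK;
        } catch (RuntimeException e) {
            channel.rollback();
            // Without an ACK the sender keeps the event in its WAL.
            return AppendStatus.RETRY;
        }
    }
}

// In-memory stand-in for the demo; a real deployment would need a durable channel.
final class InMemoryChannel implements DurableChannel {
    final List<byte[]> committed = new ArrayList<>();
    private final List<byte[]> pending = new ArrayList<>();
    boolean failOnCommit = false;

    public void begin() { pending.clear(); }
    public void put(byte[] event) { pending.add(event); }
    public void commit() {
        if (failOnCommit) throw new IllegalStateException("simulated commit failure");
        committed.addAll(pending);
        pending.clear();
    }
    public void rollback() { pending.clear(); }
}

final class CompatibilitySourceDemo {
    public static void main(String[] args) {
        InMemoryChannel channel = new InMemoryChannel();
        CompatibilitySourceHandler handler = new CompatibilitySourceHandler(channel);
        System.out.println(handler.append(new byte[] {1}));
        channel.failOnCommit = true;
        System.out.println(handler.append(new byte[] {2}));
    }
}
```

The point of the design is that responsibility for the event moves to the 1.x channel at the moment of the ACK, so the guarantee is only as strong as that channel's durability.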

Clients that speak 0.9.x's native Thrift RPC protocol can, in theory, speak directly to the compatibility source. This should satisfy most, if not all, client cases.

Note that the inverse deployment, in which a 1.x agent sends events to a 0.9.x agent, loses some of the delivery guarantees and is purposefully not included in this proposal (because it's a bad idea). The guarantee is lost because 1.x is permitted to release an event from its channel as soon as it receives a successful ACK from the next hop in a delivery flow. This means that, unlike in 0.9.x, an event may not make it to its final destination. All of that said, it would be trivial to implement a 1.x sink that speaks 0.9.x's RPC protocol.

References

TODO


1 Comment

  1. Simplicity is the main goal of Flume NG, so there is not much need to
    consider compatibility with Flume 0.9.x. If we do that, the simplification will be heavily discounted. This is just my personal opinion. I support it.