From Thrift

Avro's specific implementation can be very similar to Thrift's IDL / generation paradigm. For an example of a project that simultaneously supports both Avro and Thrift, see Flume, which has both Avro and Thrift Sources.

Source files

Here we see an Avro IDL file and a Thrift source file that represent the same RPC service.

Code Block

protocol FlumeReportAvroServer {
 enum FlumeNodeState { HELLO, IDLE, CONFIGURING, ERROR }

 record AvroFlumeConfigData {
    long timestamp;
    string sourceConfig;
    string sinkConfig;
    long sourceVersion;
    long sinkVersion;
    string flowID;
 }

 record FlumeReportAvro {
    map<string> stringMetrics;
    map<long> longMetrics;
    map<double> doubleMetrics;
 }

 boolean heartbeat(string logicalNode, string physicalNode, string host,
    FlumeNodeState s, long timestamp);
 union { AvroFlumeConfigData, null } getConfig(string physNode);
 array<string> getLogicalNodes(string physNode);
 void acknowledge(string ackid);
 boolean checkAck(string ackid);
 void putReports(map<FlumeReportAvro> reports);
}
Code Block
namespace java com.cloudera.flume.conf.thrift

typedef i64 Timestamp

 enum FlumeNodeState {
    HELLO = 0,
    IDLE = 1,
    ERROR = 4
 }

 struct ThriftFlumeConfigData {
    1: Timestamp timestamp,
    2: string sourceConfig,
    3: string sinkConfig,
    4: i64 sourceVersion,
    5: i64 sinkVersion,
    6: string flowID
 }

 struct FlumeReport {
    3: map<string, string> stringMetrics,
    4: map<string, i64> longMetrics,
    5: map<string, double> doubleMetrics
 }

service FlumeClientServer {
 bool heartbeat(1:string logicalNode, 4:string physicalNode, 5:string host, 2:FlumeNodeState s, 3:i64 timestamp),
 ThriftFlumeConfigData getConfig(1:string sourceId),
 list<string> getLogicalNodes(1: string physNode),
 void acknowledge(1:string ackid),
 bool checkAck(1:string ackid),
 void putReports(1:map<string, FlumeReport> reports)
}

Two key differences are that Avro fields are not numbered the way Thrift fields are, and that Avro maps may only have string keys, so they require just one type parameter when declared.

Building clients and servers

On the client side, Avro generates a client class against which you can make RPC calls. This example shows how to instantiate a client that makes requests over the HTTP transport.

Code Block
  URL url = new URL("http", SERVER_HOST, SERVER_PORT, "/");
  HttpTransceiver trans = new HttpTransceiver(url);
  FlumeReportAvroServer masterClient = (FlumeReportAvroServer)
    SpecificRequestor.getClient(FlumeReportAvroServer.class, trans);
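
Once the requestor has been created, RPC calls are made directly on the proxy using the method names from the IDL. A rough sketch (the argument values here are invented, and string parameters are Utf8 on Avro 1.3.x and earlier, as discussed below):

Code Block
  // Invoke an RPC defined in the IDL directly on the proxy.
  boolean alive = masterClient.heartbeat(new Utf8("node-1"),
      new Utf8("physical-1"), new Utf8("host-1"),
      FlumeNodeState.IDLE, System.currentTimeMillis());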

This is a similar, but not exactly equivalent, Thrift code segment:

Code Block
  TTransport masterTransport = new TSocket(SERVER_HOST, SERVER_PORT);
  TProtocol protocol = new TBinaryProtocol(masterTransport);
  FlumeClientServer.Iface masterClient = new FlumeClientServer.Client(protocol);
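
One practical difference is that the Thrift socket transport must be opened explicitly before any calls are made. A minimal sketch (the ack id is just a placeholder value):

Code Block
  masterTransport.open();                   // connect the TSocket before use
  masterClient.acknowledge("some-ack-id");  // example RPC call from the IDL
  masterTransport.close();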

On the server side, Avro creates an interface (similar to Thrift) that your server must implement. This will contain the method signatures from your IDL file.

Code Block
public class MasterClientServerAvro implements
  FlumeReportAvroServer {

In Thrift we have:

Code Block
public class MasterClientServerThrift extends ThriftServer implements
    FlumeClientServer.Iface {
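
In either case the implementing class supplies a body for each method declared in the IDL. For the Avro server, one method might look roughly like this (a sketch only; note that string arguments arrive as Utf8 on older Avro releases, as described in the next section):

Code Block
  @Override
  public boolean heartbeat(Utf8 logicalNode, Utf8 physicalNode, Utf8 host,
      FlumeNodeState s, long timestamp) throws AvroRemoteException {
    // Convert Utf8 arguments to String before handing them to existing code.
    String node = logicalNode.toString();
    // ... update heartbeat bookkeeping for this node ...
    return true;
  }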

Nonstandard data types

Avro has some quirky data types that will cause hiccups if you copy your Thrift code over directly.

Utf8 In older versions of Avro (1.3.3 and earlier), function signatures that involve strings use Utf8, not String. Your client and server implementations will expect to pass and receive Utf8 instances, so you will need to convert this type to and from String on your own.
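
A typical round-trip between the two types looks like this (a minimal sketch; Utf8 is org.apache.avro.util.Utf8):

Code Block
  Utf8 fromAvro = new Utf8("logicalNode-1"); // a value as received from Avro
  String plain = fromAvro.toString();        // Utf8 -> String
  Utf8 backToAvro = new Utf8(plain);         // String -> Utf8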

Arrays If your function accepts or returns an array type, you cannot simply pass a Java array. Instead, it expects an implementation of Avro's own GenericArray interface. To create an array of strings, for example, use

Code Block
// The first argument is the array's initial capacity; the second is its schema.
GenericArray<Utf8> out = new GenericData.Array<Utf8>(
        str.size(), Schema.createArray(Schema.create(Type.STRING)));
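
Elements can then be added before the array is passed to a generated method, for example (continuing the snippet above; the values are just placeholders):

Code Block
  out.add(new Utf8("collector-1"));
  out.add(new Utf8("collector-2"));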