
Introduction

Apache Geode is a data management platform that provides real-time, consistent access to data-intensive applications throughout widely distributed cloud architectures. While it currently has high-speed client interfaces for Java, C++, and .NET, there is both a need for lighter-weight clients and a demand to access Geode from other programming languages. Unfortunately, the existing client-server protocol is undocumented; it evolved over time and is too complex to meet either of these needs.

This proposal details the requirements, API, and structure for a new client/server protocol. It does not specify the exact serialization mechanism, but the intent is for the protocol described here to be complete in terms of interface and message ordering. The intent is to make serialization pluggable so that we can experiment with different formats based on varying performance and ease-of-use needs. In particular, we expect to serialize the protocol with a widely available IDL at first, making it accessible from many languages, and possibly implement a custom serialization later for clients needing very high performance. The choice of IDL is still open.

The intent is to allow client functionality to be implemented in phases, moving from a "basic client" to a more advanced "smart client". It endeavors to provide a protocol that is more amenable to modern APIs such as those using asynchronous or reactive patterns.

Serialization of application keys, values, callback arguments, function parameters, and so forth is a separate matter and is not necessarily tied to the serialization protocol used for client/server messaging. The initial protocol will support primitive types such as scalars, strings, and byte arrays. It will also support JSON documents as values and convert between these and Geode PDX-serialized objects in the servers.

Goals

The high-level goals for the protocol are defined here.


Protocol Requirements

In the evaluation or definition of any protocol, it is expected that the protocol/framework being evaluated meets the following requirements:

  • Versioning: The protocol must carry version information, in order to distinguish different protocol versions from one another.

  • Correlation Id: A unique identifier that allows the system to correlate requests and responses.

  • Object Type: The serialization type of the data objects stored inside the messages.

  • Response Type: Indicates whether a response is partial or complete.

  • Error Codes: Indicate the problem with an API invocation.

  • Chunked Response: The ability to send a large response in multiple smaller, more manageable chunks.

  • Continuous Response: A client can register for events (Observer pattern), and the server notifies the client when those events occur.

  • Request: The request message to be sent.

  • Response: The response message received in relation to a request message.

  • Request Format: The format of the API request, and its parameters, that the client wants to invoke.

  • Response Format: The format of the return value of the API the client invoked.

  • Message: The generic construct representing a message to be sent, consisting of a Message Header and a Request/Response.

  • Serialized Byte Order: Big-endian
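As a rough illustration of how several of these requirements might combine, the sketch below encodes a hypothetical fixed-size header in big-endian byte order. The field names, widths, and layout are invented for illustration and are not part of this proposal.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical fixed-size header: version, correlation id, object type,
// and a partial/complete response flag, serialized big-endian.
public class HeaderSketch {
    static ByteBuffer encodeHeader(short version, int correlationId,
                                   byte objectType, boolean partial) {
        ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.BIG_ENDIAN);
        buf.putShort(version);             // protocol version
        buf.putInt(correlationId);         // matches a response to its request
        buf.put(objectType);               // serialization type of the payload
        buf.put((byte) (partial ? 1 : 0)); // response type: partial or complete
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer h = encodeHeader((short) 1, 42, (byte) 3, false);
        System.out.println(h.remaining()); // prints 8
    }
}
```

A fixed-size header like this is trivial to read before the payload; the trade-off against variable-length (varint-based) headers is discussed in the comments below.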

RPC and Message Serialization Frameworks

During the investigation into frameworks to help "lower the barrier of entry," it became evident that there are two types of external frameworks:

  1. Message Serialization Frameworks - These frameworks allow for the definition of a message in a generic IDL, the generation of language-specific classes from the IDL, and the encoding/decoding of those messages to be sent over a transport.
  2. RPC Frameworks - These frameworks provide greater coverage of node-to-node communication: the transport layer (HTTP, TCP, UDP), the message definition in an IDL with a corresponding serialization mechanism, the definition of methods in the IDL, and the generation of corresponding service stubs.

The differences between the two approaches are:
  1. Message serialization frameworks define the encoding/decoding of defined messages but not the transport or connectivity.
  2. RPC frameworks concern themselves with connectivity and transport, remote method invocations and the encoding/decoding of defined messages.

Because this protocol needs to be tunable for very high performance, because RPC frameworks lack some required functionality, and because they hide their network and threading internals, it was decided that option 2 was not viable. See the comparison at RPC framework evaluation.

From a higher-level architectural perspective we can identify two layers:

  1. A Transport Layer (TCP, UDP, HTTP, etc.)
  2. A Message encoding/decoding Layer

This proposal will define the message structure and protocol to be agnostic of the transport used.

Message Structure Definition

All details relating to the Message structure definition can be found on the page Message Structure and Definition.

Proposed Implementation Phases

Introducing a new protocol into Geode has the potential to be highly disruptive. In order to minimize the disruption and maximize the feedback cycles, we suggest implementing the changes in a phased approach. To view the milestones for each phase, please see the page Phases and Milestones.

Example messages

To better visualize the protocol messages, a few sample messages have been provided on the page Protocol Message Examples.


26 Comments

  1. Look at protobuf primitive types. Using Variant to encode most of your integer types will be space efficient for most small values especially counters of internal storage in int32 or int64. It also removes arbitrary limits when used as length fields for String and Binary.

    String should be defined as UTF-8 encoded (UTF without qualifier is ambiguous). If we are not considering chunked encoding for byte arrays then at least the leading length field should be a Variant. A uint16 does not hold many Unicode characters after encoding as UTF-8 (16k - 64k depending on characters). Maybe it is reasonable to assume that no message attribute will be longer than this but why make an arbitrary limit.

    Same holds true for byte[] encoding length field.
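    For reference, a minimal sketch of the varint ("Variant") encoding being advocated here, following protobuf's unsigned varint scheme: 7 payload bits per byte, with the high bit set on every byte except the last. Small values take a single byte, and there is no fixed upper limit when used as a length field.

    ```java
    import java.io.ByteArrayOutputStream;

    // Protobuf-style unsigned varint encoder (sketch).
    public class Varint {
        static byte[] encode(long value) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            while ((value & ~0x7FL) != 0) {
                out.write((int) ((value & 0x7F) | 0x80)); // more bytes follow
                value >>>= 7;
            }
            out.write((int) value); // final byte, high bit clear
            return out.toByteArray();
        }
    }
    ```

    For example, 300 encodes to the two bytes 0xAC 0x02, while any value up to 127 fits in one byte.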

     

      1. For message headers, we are inclined towards a fixed-length header. That makes it easier to read the message payload.
      2. String should be encoded as modified UTF-8. Updated doc.
      3. We have made provision to chunk the value part.

            

      1. I think we should use regular UTF-8 instead of modified UTF-8. Java supports it, and it will make our life much easier everywhere else. Using modified UTF-8 is just asking for bugs, since no one but java supports it (because no one else uses UTF-16).
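    The difference at issue can be seen directly in Java, where DataOutputStream.writeUTF produces modified UTF-8 (U+0000 becomes the two bytes 0xC0 0x80) while String.getBytes produces standard UTF-8 (U+0000 is the single byte 0x00):

    ```java
    import java.io.ByteArrayOutputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.nio.charset.StandardCharsets;

    public class Utf8Demo {
        // Standard UTF-8: "a\u0000b" -> 3 bytes: 61 00 62
        static byte[] standardUtf8(String s) {
            return s.getBytes(StandardCharsets.UTF_8);
        }

        // Modified UTF-8 via writeUTF: 2-byte length prefix, then 61 C0 80 62
        static byte[] modifiedUtf8(String s) {
            try {
                ByteArrayOutputStream bos = new ByteArrayOutputStream();
                new DataOutputStream(bos).writeUTF(s);
                return bos.toByteArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e); // cannot happen in memory
            }
        }
    }
    ```

    For strings without U+0000 (or supplementary characters) the two encodings coincide, which is why the difference is easy to miss until a non-Java client hits it.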

  2. I don't understand the rationale for having JSON meta-data flags. Can you please explain why such a thing would need to be defined at the protocol level? This strikes me as very application level.

    1. Here we want to interpret a JSON string as PDX. That way a client can query region data and work with PdxInstance (i.e., JSONFormatter), while at the same time the client process can work with the JSON string. 

  3. FunctionRequestType 7

    Is this a mistake since it has the same ID as another request type?

  4. Protobuf takes the approach of decoupling the metadata for message from the message encoding in that a field 'a' defined as int64 may be encoded as either variant or fixed width 64-bit.

    Consider

    message M {
      int32 a = 1;
    }

    could be encoded as 

    [0 (variant), 1]

    or

    [1 (fixed-64), 0,0,0,0,0,0,0,1]

    or

    [5 (fixed-32), 0,0,0,1]
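    For comparison, in protobuf's actual wire format the wire type is not a standalone leading byte; it is packed together with the field number into a single varint tag, (field_number << 3) | wire_type. A minimal sketch of that tag computation:

    ```java
    // Protobuf wire-format tag: field number and wire type share one varint.
    public class WireTag {
        static final int VARINT = 0, FIXED64 = 1, FIXED32 = 5; // wire types

        static int tag(int fieldNumber, int wireType) {
            return (fieldNumber << 3) | wireType;
        }
    }
    ```

    So field 1 encoded as a varint is introduced by the tag byte 0x08, as fixed-64 by 0x09, and as fixed-32 by 0x0D.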
     
  5. The Protobuf approach is similar in some ways to how peer-to-peer messaging works, forming Message objects that are data-serialized to output streams.  Data-serialization inserts its own metadata concerning the encoding of data types.

  6. I really like the protocol page from Kafka (https://kafka.apache.org/protocol) and one thing that I believe could be added here from there is "retriability" for error codes.

    Also, are there any considerations of breaking down client/server as part of this effort ? (possibly geode-server.jar / geode-client.jar)  I can see how this could be more tied to the efforts of modularity but I believe some of that would be reflected on the protocol as well!

    Great job so far! (smile) 

  7. When we talk about different encodings of keys and values, can a key and a value have different encodings within the same message? Can messages that allow multiple key-value pairs have different encodings within them?

    1. Not sure I understood this, but Geode supports many datatypes, and only a few of those are supported as region keys. All supported types are described here. Basically a key needs to be a Java object that implements equals/hashCode properly.

  8. Even though Java uses a modified version of UTF-8, it would be better to use an unmodified version of UTF-8 for strings.

  9. Is there a reason the MessageHeader and Request/ResponseHeader is split? Both contain information which seems to be mandatory... So there is nothing that I can see that would warrant the splitting of the data.

    1. A message contains a request/response, but if the message is large it can be divided into sub-messages. Thus we need a message header for it.

      1. Is this not something the standard header could contain? Given that both the Request and Response could be large and potentially sub-divided?

         

  10. Would it make sense to split the 2 bytes for APIType and ErrorCode into 1 byte = FunctionalType and 1 byte = operation/error?

    FunctionalType: Region,List,Function,Query,Admin, etc...

    Operation for Region: put,putAll,get,getAll,etc...

    ErrorCode for Region: RegionNotExistExcep,RegionDestroyedExcep,etc....

    What this approach will give us would be the ability to add new modules that provide new functionality. Rather than having to maintain a single large list of operations/errorcodes, each module can now specify the operations/errorcodes specific to it.

    This means that new functionality can be added to an existing module or a new module without having to change the core system to reflect those changes. i.e if new collections like Lists, Sets, Queues are added, they specify their own operation codes and errorCodes. These codes definitions live within the module.
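    A sketch of the suggested split, packing a module byte and an operation/error byte into the existing 2 bytes. The module and operation ids below are made up purely for illustration.

    ```java
    // Hypothetical module/operation split of the 2-byte APIType field.
    public class ApiCode {
        static final byte MODULE_REGION = 1; // hypothetical module id
        static final byte REGION_PUT = 0;    // hypothetical operation id

        static short combine(byte module, byte op) {
            return (short) (((module & 0xFF) << 8) | (op & 0xFF));
        }

        static byte module(short code)    { return (byte) (code >> 8); }
        static byte operation(short code) { return (byte) code; }
    }
    ```

    Each module would then own its operation and error code namespace, so adding a new module (Lists, Sets, Queues, ...) would not require touching a central registry.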

  11. In Messages the RequestHeaders seem to be missing the field hasMetaData

     

  12. In ResponseHeader it says the response will have (ResponseTypeId | ErrorCode). It seems like this should be only ResponseTypeId, and that one of the ResponseTypeIds should be ErrorResponse, which you currently have named Error Format. I obviously find this section confusing (smile)

    Under what circumstances would a PutRequest result in an Error response instead of a PutResponse with a failure status?

  13. (edited - reduced new protocol overhead after discussion with Hitesh)

    I am looking at the difference in message size between a current Geode PutOpImpl message and what is currently proposed in this spec. With a put(byte-array, byte-array), the current Geode PutOpImpl has 51 bytes (more in some circumstances) of overhead while this spec has only 26, a reduction of roughly 50%. This is due primarily to the high overhead of Message Parts in the current protocol. Message Parts are used to store things as mundane as a boolean, and they have 5 bytes of overhead each.

  14. The link for "bytes" does not link back to anything concrete... It links to the Geode Data Types page, which does not contain a definition for "bytes" which makes it even more confusing.

  15. "You will notice that this document describes the protocol down to the level of byte-ordering and "bytes on the wire".  This is a description of the native serialization of the protocol that will be used for the initial implementation.  It is our intent to make this pluggable so that alternative serialization technologies may be used instead, such as Protobuf or Apache Thrift.  How that is done will be left as a goal for the architecture of the server component, with an additional goal of providing an IDL description of the protocol."

    I am concerned by the statement that the "initial implementation" will be the encoding representation described as an option in this document, without first comparing the alternatives and the language-generated solutions, and without giving any evidence that they don't meet the needs or scope of this work. The top priority for this new protocol is that it be easily implemented. We score a -1 for writing our own encoder and expecting others to do the same. 


    1. I believe the wording for this to be incorrect. Due diligence is being followed and different serialization frameworks, like Avro, Protobuf, MessagePack, etc are being tested to try and meet the requirement of "lowering the barrier of entry". What the testing needs to cover are things like garbage creation, latency, unnecessary buffer copies, etc... Some of the concerns have already been covered in Message Serialization and Transmission and as soon as the testing of the serialization frameworks has been completed, the results will be posted as part of this proposal.

      It is only in the extreme case where it is found that other serialization frameworks significantly impede performance or are detrimental to the overall system health by creating unnecessary garbage, that the option for writing one's own encoders/decoders would be considered. In addition to this, the vision is to have a pluggable message serialization (not data objects) layer, in order for users to have options when choosing the correct serialization framework for them. "....It is our intent to make this pluggable so that alternative serialization technologies may be used instead...."

       

  16. Not conforming to industry standards scores us another -1 with suggesting that those implementing our encoding must encode strings using the Java specific modified UTF-8 encoding. "Modified UTF-8" is not a standard and is only used in Java for internal serialization and JNI representation.

  17. Jacob Barrett, Is it very common to want to have a nul character in a UTF-8 string?  Having Java apps, such as the Geode server, scan and replace standard nul chars with Java's nul would require a scan of every string received or transmitted.  That would increase CPU cost on every message received or sent.

    I am trying to find any information on what packages like protobuf do with UTF strings - whether they perform this kind of modification or not when dealing with the Java language..

    1. Bruce Schuchardt, Java's in-memory representation of String is UTF-16 (standard). It is only in the internal serialization classes and JNI layer that they chose to use "modified" UTF-8 to deal with null-termination issues. Converting from standard UTF-8 to UTF-16 costs no more than converting from modified UTF-8 to UTF-16.