Portable object serialization was first released in Apache Ignite 1.4 as an experimental feature and is not available from the public API.
Some critical Ignite features, such as indexing and queries, require object introspection without deserialization. To achieve this, the portable protocol writes object fields as follows:
[X ..X+4] - Field 1 ID
[X+4..X+8] - Field 1 length
[X+8..Y ] - Field 1 content
[Y ..Y+4] - Field 2 ID
To read a field, we take its ID and search for it starting from the object head. This gives good performance and data locality for regular serialization/deserialization cycles, because fields are normally written in the same order as they are read. As a result, the field can usually be found during a read in O(1) time.
void writePortable(PortableWriter w) {
    w.write("A", a);
    w.write("B", b);
}

void readPortable(PortableReader r) {
    a = r.read("A");
    b = r.read("B");
}
In contrast, the indexing engine usually reads fields in random order, so each lookup has O(N) complexity on average, where N is the number of fields in the object.
In Ignite 1.5 we need to change the protocol so that a field's position can be found in ~O(1) time on average.
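The cost difference can be illustrated with a minimal sketch of a lookup over the 1.4 layout shown above. `LinearFieldScanner`, `findField`, and `writeField` are hypothetical names used only for illustration, not Ignite classes, and the sketch assumes 4-byte IDs and lengths stored in a `java.nio.ByteBuffer`.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper for the 1.4-style layout: [ID:4][length:4][content], repeated. */
final class LinearFieldScanner {
    /** Appends one field at the buffer's current position. */
    static void writeField(ByteBuffer buf, int fieldId, byte[] content) {
        buf.putInt(fieldId).putInt(content.length).put(content);
    }

    /**
     * Scans fields between start and end and returns the position of the
     * requested field's content, or -1 if the field is absent.
     * Each miss skips one whole field, so a random lookup is O(N).
     */
    static int findField(ByteBuffer buf, int start, int end, int fieldId) {
        int pos = start;

        while (pos + 8 <= end) {
            int id = buf.getInt(pos);      // field ID
            int len = buf.getInt(pos + 4); // field length

            if (id == fieldId)
                return pos + 8;            // content starts right after the 8-byte prefix

            pos += 8 + len;                // jump over prefix and content
        }

        return -1;
    }
}
```

When fields are read in the order they were written, a reader can pass the end of the previously read field as `start`, so the first comparison succeeds. A random read has to start from the object head and walk over every preceding field.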
The following changes are proposed:
Resulting object layout (unrelated header pieces are omitted):
[0 .. 4 ] - Footer offset; could be zero if the whole object is written in raw mode.
[4 .. 8 ] - Schema ID. Absent in case footer offset is zero.
[8 .. X ] - Field 1.
[X .. Y ] - Field 2.
[Y .. Y+4 ] - Footer start, field 1 ID.
[Y+4 .. Y+8 ] - Field 1 offset.
[Y+8 .. Y+12] - Field 2 ID, etc.
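A minimal sketch of how this layout might be written and read follows. It is an illustration under stated assumptions, not the Ignite implementation: `FooterLayoutSketch` and its methods are hypothetical names, field offsets are assumed to be counted from the object start, the total object length is assumed to be known from the omitted header pieces, and the idea that a schema tells the reader a field's order in the footer is an assumption about how ~O(1) lookup could be achieved.

```java
import java.nio.ByteBuffer;

/** Hypothetical writer/reader for the proposed layout with a footer of (ID, offset) pairs. */
final class FooterLayoutSketch {
    /** Footer offset (4 bytes) followed by schema ID (4 bytes). */
    static final int HDR_LEN = 8;

    /** Writes the fields, then the footer. Returns the total object length. */
    static int write(ByteBuffer buf, int schemaId, int[] fieldIds, byte[][] contents) {
        int pos = HDR_LEN;
        int[] offsets = new int[fieldIds.length];

        for (int i = 0; i < fieldIds.length; i++) {
            offsets[i] = pos;              // remember where each field starts

            for (byte b : contents[i])
                buf.put(pos++, b);
        }

        buf.putInt(0, pos);                // footer offset
        buf.putInt(4, schemaId);           // schema ID

        for (int i = 0; i < fieldIds.length; i++) {
            buf.putInt(pos, fieldIds[i]);
            buf.putInt(pos + 4, offsets[i]);
            pos += 8;
        }

        return pos;
    }

    /** Direct lookup, assuming the field's order in the footer is already known. */
    static int fieldOffsetByOrder(ByteBuffer buf, int order) {
        return buf.getInt(buf.getInt(0) + order * 8 + 4);
    }

    /** Fallback lookup: scans the fixed-size footer entries for the field ID. */
    static int fieldOffsetById(ByteBuffer buf, int objLen, int fieldId) {
        int footerOff = buf.getInt(0);

        if (footerOff == 0)
            return -1;                     // raw mode: no footer

        for (int pos = footerOff; pos + 8 <= objLen; pos += 8) {
            if (buf.getInt(pos) == fieldId)
                return buf.getInt(pos + 4);
        }

        return -1;
    }
}
```

Even the fallback scan touches only compact 8-byte footer entries instead of stepping over field contents, and the direct variant is a single read once the field order is known.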
We define each unique set of written fields as a schema. The following example demonstrates two schemas:
void writePortable(PortableWriter w) {
    w.write("A", a);
    w.write("B", b);

    if (b)
        w.write("C", c); // Schema 1: [A->B->C]
    else
        w.write("D", d); // Schema 2: [A->B->D]
}
Each schema consists of:
Known schemas are stored in a read-only structure. If a new schema is detected during a read or write, the structure is updated atomically. Normally an object type will have only 1 schema, 2-3 schemas in rare cases, and more than 3 schemas in very rare cases. For this reason we can store them in a volatile array or a similar structure.
Schemas are stored inside the existing type descriptor. This way we avoid additional hash map lookups.
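One possible way to hold schemas in a volatile array is sketched below. `TypeDescriptorSketch` and `Schema` are hypothetical names, and the fields of `Schema` (an ID plus the field IDs in write order) are an assumption made for illustration. The copy-on-write update under a lock is just one way to get an atomic update, not necessarily the one the implementation will use.

```java
import java.util.Arrays;

/** Hypothetical type descriptor holding its known schemas in a volatile array. */
final class TypeDescriptorSketch {
    /** Hypothetical schema: an ID plus field IDs in write order. */
    static final class Schema {
        final int id;
        final int[] fieldIds;

        Schema(int id, int[] fieldIds) {
            this.id = id;
            this.fieldIds = fieldIds;
        }
    }

    /** Published array is never modified in place, so readers need no locking. */
    private volatile Schema[] schemas = new Schema[0];

    /** Linear scan; cheap because a type normally has only a few schemas. */
    Schema schema(int schemaId) {
        for (Schema s : schemas) {
            if (s.id == schemaId)
                return s;
        }

        return null;
    }

    /** Copy-on-write update: build a new array, then publish it with one volatile write. */
    synchronized void addSchema(Schema s) {
        if (schema(s.id) != null)
            return; // already known

        Schema[] cur = schemas;
        Schema[] upd = Arrays.copyOf(cur, cur.length + 1);

        upd[cur.length] = s;
        schemas = upd;
    }

    int schemaCount() {
        return schemas.length;
    }
}
```

With one to three schemas per type, scanning the array is likely cheaper than a hash map lookup, and readers never observe a partially updated array.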
Little or no additional overhead is expected compared to Ignite 1.4 after warmup.