Even though Apache Ignite does not support the co-existence of nodes with different versions in a single cluster, it does support version upgrades for existing persistent storage instances. Strictly speaking, we should support the following scenarios:
Version downgrade is not supported. Failover is covered by backing up the persistent storage before upgrading to a newer version.
The PDS file layout is currently fixed, and the code does not support any changes to its structure. Data files from different caches (more precisely, cache groups) are stored in different directories. Within a single cache group, data from different partitions (including the index partition) is stored in different files. Overall, the folder structure looks as follows:
|-cacheName-cache
||-part1.bin
||-part2.bin
||...
||-partN.bin
||-index.bin
|-cacheName2-cache
||-part1.bin
||...
Given that there is currently no way to migrate the file structure online, an implementation of a feature such as tablespaces should also preserve the old file organization for compatibility reasons.
Currently, each page has the following generic data layout:
+------------------------+---------------------------+---------------------------------------+
| Page IO Type (2 bytes) | Page IO Version (2 bytes) | Other page data (page size - 4 bytes) |
+------------------------+---------------------------+---------------------------------------+
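As a minimal sketch of the header layout above, the following Java snippet writes and reads the two 2-byte fields at the start of a page buffer. The class name PageHeaderSketch, its method names, and the reliance on ByteBuffer's default byte order are illustrative assumptions, not Ignite's actual implementation.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper illustrating the generic page header; not Ignite code. */
final class PageHeaderSketch {
    /** Offset of the 2-byte page IO type. */
    static final int TYPE_OFF = 0;

    /** Offset of the 2-byte page IO version. */
    static final int VER_OFF = 2;

    /** Offset where the type-specific page data starts. */
    static final int DATA_OFF = 4;

    /** Writes the IO type and version into the first 4 bytes of the page. */
    static void writeHeader(ByteBuffer page, int type, int ver) {
        page.putShort(TYPE_OFF, (short) type);
        page.putShort(VER_OFF, (short) ver);
    }

    /** Reads the IO type as an unsigned 16-bit value. */
    static int ioType(ByteBuffer page) {
        return page.getShort(TYPE_OFF) & 0xFFFF;
    }

    /** Reads the IO version as an unsigned 16-bit value. */
    static int ioVersion(ByteBuffer page) {
        return page.getShort(VER_OFF) & 0xFFFF;
    }
}
```

Absolute-offset reads are used so that inspecting the header does not move the buffer position, which keeps the helper free of side effects on the page buffer.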
In the code, we have an abstract class PageIO which represents the data layout of any page. Each page IO type may have several versions, but all versions of the same page type should have a common interface. This way the code working with a page does not depend on any specifics of the page layout, and page conversion can be done when an instance of the old page version is obtained. A PageIO instance is stateless and works with a locked page buffer, so for each pair (page type, page version) there is a single instance of PageIO.
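The idea of one stateless IO instance per version behind a common interface can be sketched as follows. This is an illustration only: CounterPageIoSketch, PageIoRegistrySketch, and the made-up "counter" page contents are hypothetical and do not mirror the real PageIO class hierarchy. For brevity the registry covers a single page type and is keyed by version only.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Common interface shared by all versions of one hypothetical page type. */
interface CounterPageIoSketch {
    /** Reads the counter stored in the page, whatever the on-page encoding is. */
    long readCounter(ByteBuffer page);
}

/** Registry holding one shared, stateless IO instance per page version. */
final class PageIoRegistrySketch {
    /** Offsets of the 2-byte version field and of the data after the header. */
    static final int VER_OFF = 2, DATA_OFF = 4;

    private static final Map<Integer, CounterPageIoSketch> VERSIONS = new HashMap<>();

    static {
        // Version 1 stores the counter as a 4-byte int (assumed layout).
        VERSIONS.put(1, page -> page.getInt(DATA_OFF));
        // Version 2 widens the counter to an 8-byte long (assumed layout).
        VERSIONS.put(2, page -> page.getLong(DATA_OFF));
    }

    /** Picks the IO instance matching the version recorded in the page header. */
    static CounterPageIoSketch forPage(ByteBuffer page) {
        int ver = page.getShort(VER_OFF) & 0xFFFF;
        CounterPageIoSketch io = VERSIONS.get(ver);
        if (io == null)
            throw new IllegalStateException("Unknown page version: " + ver);
        return io;
    }
}
```

Calling code only ever uses readCounter, so it reads pages written by either version without knowing which layout is on disk.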
Since the WAL is an append-only structure, we do not need to maintain write compatibility with older versions. Ignite only needs to read the WAL and recover the node state up to the failure point. Once the state is recovered, we can write new WAL records (note that downgrade is not supported).
WAL record (de)serialization is fully covered by the RecordSerializer interface. Record serializers are also versioned, and the serializer version is written to the header of each WAL segment, which implies that each WAL segment is written using a single version of a serializer. When a WAL segment is read, the serializer version is read first and further deserialization happens using this particular serializer version. When the serializer version changes, we force a WAL segment rollover to make sure different versions are not mixed up.
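The per-segment versioning described above can be modeled roughly as below. This is a simplified sketch under stated assumptions: WalSegmentSketch, the 4-byte version header, the string records, and the extra checksum-like field in version 2 are all invented for illustration and do not reflect the real RecordSerializer contract or the actual segment header format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of versioned WAL segments; not Ignite's RecordSerializer. */
final class WalSegmentSketch {
    /** Writes one segment: a serializer version header followed by the records. */
    static byte[] writeSegment(int ver, List<String> records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(ver); // Segment header: serializer version.
            for (String rec : records) {
                if (ver >= 2)
                    out.writeInt(rec.hashCode()); // Made-up field added in version 2.
                out.writeUTF(rec);
            }
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the version from the header first, then decodes records with it. */
    static List<String> readSegment(byte[] segment) {
        List<String> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(segment))) {
            int ver = in.readInt();
            if (ver != 1 && ver != 2)
                throw new IllegalStateException("Unsupported serializer version: " + ver);
            while (in.available() > 0) {
                if (ver >= 2)
                    in.readInt(); // Skip the field that only version 2 writes.
                records.add(in.readUTF());
            }
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

Because the version is fixed for the whole segment, the reader never has to guess the format of an individual record. In this model, a version change would start a new byte array, which plays the role of the segment rollover.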
Adding a new record is usually a straightforward process and does not require a serializer version update. Because an older version never wrote this record and the WAL is append-only, we can simply add a new record type and the WAL will work out of the box. Changing a record's structure requires adding a new serializer version, because we need to read both the new and the old record structure and write the new record structure.
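To illustrate why adding a record type is backward compatible, here is a small sketch of a reader that dispatches on a record type code. The type codes, the fixed 5-byte record format, and the class WalRecordTypesSketch are hypothetical and chosen only to keep the example short.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical record reader; type codes and payloads are made up. */
final class WalRecordTypesSketch {
    static final byte PUT = 1;      // Written by both old and new versions.
    static final byte REMOVE = 2;   // Written by both old and new versions.
    static final byte SNAPSHOT = 3; // New type: old versions never wrote it.

    /** Decodes a log where each record is a 1-byte type plus a 4-byte payload. */
    static List<String> readAll(ByteBuffer log) {
        List<String> out = new ArrayList<>();
        while (log.remaining() >= 5) {
            byte type = log.get();
            int payload = log.getInt();
            switch (type) {
                case PUT:
                    out.add("PUT " + payload);
                    break;
                case REMOVE:
                    out.add("REMOVE " + payload);
                    break;
                case SNAPSHOT:
                    out.add("SNAPSHOT " + payload);
                    break;
                default:
                    throw new IllegalStateException("Unknown record type: " + type);
            }
        }
        return out;
    }
}
```

A log produced before SNAPSHOT existed contains only PUT and REMOVE records, so the new reader handles it without any version check. If the layout of an existing record changed instead, the same type code would have two meanings, which is the case that needs a new serializer version.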
The PDS compatibility test framework is implemented in IgnitePersistenceCompatibilityAbstractTest, which has several subclasses. The general approach for such tests is to produce some PDS state on an older version and then check that a newer Ignite version can be started using this state.