ID: IEP-132
Author:
Sponsor:
Created: 28.11.2024
Status: IN PROGRESS


Motivation

Currently, an Ignite cluster can be upgraded only by a full cluster restart.

The procedure is as follows:

  1. Stop ALL cluster nodes.
  2. Update files on each node.
  3. Start nodes one by one.

Each version update means cluster unavailability for the end user.
This is extremely inconvenient, especially for users who use Ignite as their primary data storage.
Other systems support rolling upgrade, in which the upgrade is performed node by node without an unavailability period.
Once rolling upgrade is implemented, Ignite must support the following upgrade procedure:

  1. Stop one node.
  2. Upgrade files on the node.
  3. Start the node.
  4. Repeat steps 1-3 for each node in the cluster.

Let's see how rolling upgrade is implemented in other systems:

Other systems

Many distributed open source systems already have a rolling upgrade feature.
We must study their approaches to gain insights from them. For each system, the following questions are considered:

  1. Is there a rolling upgrade feature?
  2. How is it implemented?
  3. How is it tested?
  4. When is it tested: on each PR, weekly, or only on release?
  5. Network message format.
  6. Serdes implementation.
  7. How many earlier releases can be upgraded?

Apache Cassandra
  • Supported: Yes (1), (2)
  • How is it implemented: Server-client compatibility works similarly to the Ignite Thin Client protocol.
  • How is it tested: Compatibility is checked with a special test framework (3). At first glance, there are no special source code checks to ensure compatibility in day-to-day coding.
  • When is it tested: On release or by request.
  • Network message format: POJO
  • Serdes implementation: Custom serdes implementation.

Apache Kafka
  • Supported: Yes (4)
  • How is it implemented: Message formats are checked on PR review. At first glance, there are no special source code checks to ensure compatibility in day-to-day coding, but all the machinery for writing compatible code is implemented in the code generation framework (5).
  • How is it tested: Compatibility is checked with the special test framework ducktest (6).
  • When is it tested: On release or by request. Kafka doesn't provide public resources to run ducktests. Runs are done by contributors or by Confluent employees on private hosts.
  • Network message format: POJO
  • Serdes implementation: Custom serdes implementation.
  • How many earlier releases can be upgraded: All

Yugabyte
  • Supported: Yes (7)
  • How is it implemented: Checked on review (see the commit message section "Upgrade/Rollback safety") (8).
  • Network message format: plain objects
  • Serdes implementation: Protobuf

YDB
  • Supported: Yes
  • How is it implemented: Message formats are checked on PR review; gRPC + Protobuf helps to maintain compatibility.
  • How is it tested: Compatibility is checked with a special test framework (9).
  • When is it tested: On request.
  • Network message format: plain objects
  • Serdes implementation: Protobuf

Cockroach DB
  • Supported: Yes
  • Network message format: plain objects
  • Serdes implementation: Protobuf

Hazelcast
  • Supported: Yes (12)

Description

Rolling upgrade assumes that a cluster can consist of Ignite nodes of different versions.
So Ignite must provide backward compatibility at the network level and the ability to work in a mixed topology.
Development-time checks and tests must be added to protect against incorrect patches.
The Ignite codebase must be reorganized to clearly distinguish the parts that require compatibility from those that don't.

Let's define clearly what backward compatibility means for network messages:

  • A new Ignite server MUST be able to read the previous version of a message.
  • An old Ignite server MUST be able to read the new version of a message:
    • Newly added fields MUST be ignored.
    • Removed fields MUST have default values.
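
The two rules above can be illustrated with a minimal sketch. It assumes a hypothetical tagged-field wire format in which every field is written as (id, length, payload); neither this format nor the names `TaggedMessageCodec` and `PingMessageV1` exist in Ignite, they are used here only for illustration. A reader built this way skips fields it does not know and falls back to a default for fields that are absent.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical codec (not an Ignite API): each field is stored as (id, length, payload). */
final class TaggedMessageCodec {
    /** Writes the field count followed by every (id, length, payload) triple. */
    static byte[] write(Map<Integer, byte[]> fields) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(fields.size());
        for (Map.Entry<Integer, byte[]> e : fields.entrySet()) {
            out.writeInt(e.getKey());
            out.writeInt(e.getValue().length);
            out.write(e.getValue());
        }
        return bytes.toByteArray();
    }

    /** Reads all fields; the length prefix lets a reader consume fields it does not understand. */
    static Map<Integer, byte[]> read(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        Map<Integer, byte[]> fields = new LinkedHashMap<>();
        int cnt = in.readInt();
        for (int i = 0; i < cnt; i++) {
            int id = in.readInt();
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            fields.put(id, payload);
        }
        return fields;
    }
}

/** Hypothetical "old" message view: it knows only field 1 (node name) and field 2 (timeout). */
final class PingMessageV1 {
    static final int NODE_NAME = 1;
    static final int TIMEOUT = 2;
    static final int DEFAULT_TIMEOUT = 5000;

    final String nodeName;
    final int timeout;

    PingMessageV1(String nodeName, int timeout) {
        this.nodeName = nodeName;
        this.timeout = timeout;
    }

    /** Unknown field ids are ignored; missing fields get default values. */
    static PingMessageV1 decode(byte[] data) throws IOException {
        Map<Integer, byte[]> fields = TaggedMessageCodec.read(data);
        byte[] name = fields.get(NODE_NAME);
        byte[] timeout = fields.get(TIMEOUT);
        return new PingMessageV1(
            name == null ? "" : new String(name, StandardCharsets.UTF_8),
            timeout == null ? DEFAULT_TIMEOUT : ByteBuffer.wrap(timeout).getInt());
    }
}
```

If a newer node sends field 1 together with an unknown field 3 and no field 2, `PingMessageV1.decode` returns the node name and `DEFAULT_TIMEOUT`, and field 3 is silently dropped.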

Note that there is already a guide for PDS compatibility (11).

Let's list the subsystems that must be reworked to provide compatibility:

  • communication: Communication messages consist of two parts:
    • Message format: the message format itself. The Communication API should be reworked to enforce backward compatible messages.
    • User data: A message can carry user data. The user data format must be backward compatible.
  • discovery:
    • Message format: the message format itself.
      The Discovery API must be reworked to enforce backward compatible messages.
  • binary marshaller:
    • Currently, the binary marshaller code is highly coupled with the rest of the Ignite code.
      We must modularize the Binary infrastructure (10) and provide compatibility guarantees for each part of it.
  • features:
    • A framework to enable/disable features in mixed-version clusters must be developed.
  • affinity:
    • The affinity function must be the same for each online node version.
  • management commands:
    • arguments
    • results
    • tasks
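
One possible shape for the feature framework mentioned in the list above is sketched below, under the assumption that every node advertises the set of features it supports and that a feature is enabled only when all online nodes support it. The names `ClusterFeature` and `FeatureGate` and the feature constants are hypothetical; they are not the existing IgniteFeatures API.

```java
import java.util.Collection;
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical feature identifiers; these are not actual IgniteFeatures constants. */
enum ClusterFeature { NEW_REBALANCE_PROTOCOL, COMPRESSED_MESSAGES }

/** Hypothetical gate: a feature is usable only if every node in the topology supports it. */
final class FeatureGate {
    /** Returns the intersection of the feature sets advertised by all nodes. */
    static Set<ClusterFeature> enabledFeatures(Collection<? extends Set<ClusterFeature>> perNode) {
        Set<ClusterFeature> res = EnumSet.allOf(ClusterFeature.class);
        for (Set<ClusterFeature> nodeFeatures : perNode)
            res.retainAll(nodeFeatures);
        return res;
    }

    /** An empty topology enables nothing; otherwise the feature must be in the intersection. */
    static boolean isEnabled(ClusterFeature feature, Collection<? extends Set<ClusterFeature>> perNode) {
        return !perNode.isEmpty() && enabledFeatures(perNode).contains(feature);
    }
}
```

With this rule, a feature introduced in the new version stays disabled while old nodes remain in the cluster and becomes available once the last node has been upgraded.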

All subsystems that must be compatible:

  1. PDS
    1. partition data
    2. WAL records
  2. Metadata
    1. binary meta
    2. marshaller
  3. Communication SPI
  4. Discovery SPI
  5. Binary Marshaller
  6. Affinity
  7. Management API
    1. arguments
    2. results
    3. tasks (class names)
  8. ThinClient Protocol

Compatibility Matrix

Component            Type       Number of releases
Ignite public API    backward   1
PDS                  backward   all
WAL                  backward   all
Metadata             backward   all
Thin Client          full       all
Communication SPI    full       1
Discovery SPI        full       1
Binary Marshaller    backward   all
Management API       backward   1
Affinity             full       all
JDBC                 backward   1
ODBC                 backward   1
REST                 backward   1
CDC                  full       1

Current Ignite codebase

  • Are there any primitives or building blocks to provide compatibility?
  • Difficulties in day-to-day coding:
    • Java serialization
      • anonymous class names are based on declaration order
  • Compatibility testing:
    • explicit tests
    • ability to run tests with random node versions
  • New feature implementation and enabling.
  • Patterns for implementing common cases: new versions of algorithms, testing against new versions.
  • Scope of compatibility:
    • Currently, any third-party module can register its own ports (messages?) and must be able to track compatibility.
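
The anonymous class issue from the list above can be reproduced with a small standalone example. The class name `AnonymousNames` is made up for the illustration. Java serialization writes class names into the stream, so if a patch inserts or reorders anonymous classes, a name sent by an old node may resolve to a different class on a new node.

```java
/** Demonstrates that javac numbers anonymous classes in their order of appearance. */
final class AnonymousNames {
    /** First anonymous class declared in this file. */
    static Runnable first() {
        return new Runnable() {
            @Override public void run() {
                // No-op.
            }
        };
    }

    /** Second anonymous class declared in this file. */
    static Runnable second() {
        return new Runnable() {
            @Override public void run() {
                // No-op.
            }
        };
    }

    public static void main(String[] args) {
        // With javac, the names end with "$1" and "$2" respectively.
        System.out.println(first().getClass().getName());
        System.out.println(second().getClass().getName());
    }
}
```

Swapping the two method bodies, or adding a new anonymous class above them, changes which class gets the `$1` suffix without any visible change to the public API.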

Implementation phases

Phase 0 - Code Cleanup

  • Remove DirectByteBufferStreamImpl V1-V3; keep only V4.
  • Remove all items from IgniteFeatures and the corresponding checks.
  • Remove all code and checks for GridContinuousProcessor#discoProtoVer.
  • TcpDiscoverySpi#setForceServerMode
  • All old versions of classes: keep only the latest of the V2, V3, etc. versions, e.g. StartRequestV2, which belongs to internal communication.
    Classes related to thin client, JDBC, and ODBC interaction must stay.
    • StartRequest must be deleted. Rename StartRequestV2 → StartRequest.
    • CacheMetricsSnapshot must be deleted. Rename CacheMetricsSnapshotV2 → CacheMetricsSnapshot.
  • MessageFactory
  • IgniteDataTransferObject
  • EXCHANGE_PROTOCOL_2_SINCE and related code
  • IgniteProductVersion.fromString usages

Phase 1 - BinaryMarshaller modularization

  • Modularize BinaryMarshaller.
  • Create a small jar for the Ignite thin client.
  • Provide a clear API for binary objects inside ignite-core and other modules.

Phase 2 - Communication SPI Compatibility

  • Communication MessageWriter and MessageReader are aware of the peer's version and, therefore, of the message schema.
  • Distinguish serdes generation from the POJO.
  • PR checks that flag changes to classes with a Message ancestor with a warning, labels, etc., drawing the reviewer's attention to possible compatibility issues.
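
A version-aware writer could derive the set of fields to send from the peer's version. The sketch below assumes a hypothetical schema description in which every field records the protocol version that introduced it; `FieldSpec`, `MessageSchema`, and the integer version numbers are illustrative and do not correspond to existing MessageWriter or MessageReader code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical field descriptor: a field name plus the protocol version that introduced it. */
final class FieldSpec {
    final String name;
    final int sinceVersion;

    FieldSpec(String name, int sinceVersion) {
        this.name = name;
        this.sinceVersion = sinceVersion;
    }
}

/** Hypothetical message schema from which a writer selects the fields a given peer understands. */
final class MessageSchema {
    private final List<FieldSpec> fields = new ArrayList<>();

    /** Registers a field; returns this schema to allow chained declarations. */
    MessageSchema field(String name, int sinceVersion) {
        fields.add(new FieldSpec(name, sinceVersion));
        return this;
    }

    /** Returns, in declaration order, the names of fields introduced at or before the peer's version. */
    List<String> fieldsFor(int peerVersion) {
        List<String> res = new ArrayList<>();
        for (FieldSpec f : fields) {
            if (f.sinceVersion <= peerVersion)
                res.add(f.name);
        }
        return res;
    }
}
```

For a schema declared as `new MessageSchema().field("topVer", 1).field("nodeId", 1).field("traceId", 2)`, a writer talking to a version 1 peer would serialize only `topVer` and `nodeId`.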

Phase 3 - Discovery SPI Compatibility

  • (question) Are serdes changes needed to provide compatibility? The current approach with JDK serialization provides some compatibility (13).
    Is it enough for long-term compatibility support?
  • (question) Serdes changes to restrict the classes that can be sent over the SPI.
  • PR checks that flag changes to classes with a Message ancestor with a warning, labels, etc., drawing the reviewer's attention to possible compatibility issues.
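
If JDK serialization is kept for discovery, one way to restrict the classes accepted from the stream is the standard `java.io.ObjectInputFilter` (available since Java 9). The sketch below is an assumption about how such a restriction could look, not a description of current TcpDiscoverySpi behavior; `AllowedMessage`, `OtherMessage`, and `RestrictedDeserialization` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical message class that is on the allow-list. */
final class AllowedMessage implements Serializable {
    private static final long serialVersionUID = 1L;
    int value = 7;
}

/** Hypothetical class that is not on the allow-list. */
final class OtherMessage implements Serializable {
    private static final long serialVersionUID = 1L;
    int value = 9;
}

final class RestrictedDeserialization {
    /** Serializes an object with plain JDK serialization. */
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    /** Deserializes with an allow-list filter; the trailing "!*" pattern rejects every other class. */
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        ObjectInputFilter filter =
            ObjectInputFilter.Config.createFilter(AllowedMessage.class.getName() + ";!*");
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(filter);
            return in.readObject();
        }
    }
}
```

Reading a serialized `AllowedMessage` succeeds, while reading a serialized `OtherMessage` fails with `java.io.InvalidClassException` because the filter rejects its class.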

Phase 4 - IgniteFeatures 

  • Write down clear rules for dealing with new features and incompatible enhancements.
  • Support these rules in the IgniteFeatures framework, if they are not supported already.

Phase 5 - Management API

  • Provide the ability to change argument and result classes in a compatible way.
    • One possible approach is to reuse the communication serdes framework.
    • Another possibility is to migrate to BinaryObject for arguments and results.
  • PR checks that flag changes to classes with a Message ancestor with a warning, labels, etc., drawing the reviewer's attention to possible compatibility issues.

Phase 6 - Testing

  • Ducktests to check the upgrade procedure.
  • (question) A unit test mode that starts some nodes of a previous version.

Phase 7 - Development process changes

  • (question) Any change in the public API MUST be in the form of an IEP.
  • (question) Any change in the public API MUST be voted on by two (three?) committers.
  • Documentation with a clear description of the development rules for all subsystems that are required to be compatible.

Alternative designs

  • Subsystem API versions (like in the REST API).
  • Runtime component (IgniteProcessor, IgniteManager) upgrade with dynamic class loading.

Reference Links

  1. https://www.datastax.com/learn/whats-new-for-cassandra-4/migrating-cassandra-4x
  2. https://docs.datastax.com/en/luna-cassandra/guides/upgrade/overview.html
  3. https://github.com/apache/cassandra-dtest/blob/trunk/upgrade_tests/README.md
  4. https://kafka.apache.org/documentation/#upgrade
  5. https://github.com/apache/kafka/blob/trunk/clients/src/main/resources/common/message/AlterPartitionResponse.json
  6. https://github.com/apache/kafka/blob/trunk/tests/kafkatest/tests/core/kraft_upgrade_test.py
  7. https://docs.yugabyte.com/preview/manage/upgrade-deployment/
  8. https://github.com/yugabyte/yugabyte-db/commit/e9ab17dea0d3b4f1673531c07404a290f9fbd8f2
  9. https://github.com/ydb-platform/ydb/blob/main/ydb/tests/functional/restarts/
  10. IEP-119 Binary infrastructure modularization
  11. PDS Compatibility Guide (WIP)
  12. https://hazelcast.com/products/rolling-upgrade/
  13. https://docs.oracle.com/en/java/javase/17/docs/specs/serialization/version.html#compatible-java-type-evolution
