IDIEP-135
Author:
Sponsor:
Created:
Status: DRAFT

Motivation

Users with Apache Ignite 2 clusters will require a reliable process to move their existing data into an Apache Ignite 3 cluster.

This tool will focus only on data stored in Apache Ignite 2 persistent caches. Migration of memory-only data is out of scope. The conversion from Apache Ignite 2 caches to Apache Ignite 3 table schemas can be done based on cache configurations.

Installing and configuring an Ignite 3 cluster with similar configurations/topologies to an existing Ignite 2 cluster is outside the scope of this document.

Description

This feature will be offered as a CLI tool that will roughly follow this process:

  1. Scan one work directory of an Apache Ignite 2 node.
  2. Discover which caches are available in the node.
  3. For each cache:
    1. (Optionally) Recreate the available caches in the Apache Ignite 3 cluster.
    2. Iterate over all the available partitions in the work directory.
    3. Convert/Map the Apache Ignite 2 CacheDataRows into Apache Ignite 3 Tuples.
    4. Use the DataStreamerTarget API to insert the tuples into a table.

Each step is discussed in more detail below.
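The overall loop can be sketched as follows. This is a structural sketch only: the `Row` record and the in-memory collections are hypothetical stand-ins for the real CacheDataRows and the DataStreamerTarget publishing step, which are covered later in this document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Structural sketch of the migration loop. All types here are hypothetical
// stand-ins: the real tool reads CacheDataRows through the persistence reader
// and publishes Tuples through the DataStreamerTarget API.
public class MigrationSketch {

    // Stand-in for one Ignite 2 cache entry: a key plus named value fields.
    record Row(Object key, Map<String, Object> fields) {}

    public static List<Map<String, Object>> migrateCache(Map<Integer, List<Row>> partitions) {
        List<Map<String, Object>> streamed = new ArrayList<>();
        // Iterate over all available partitions in the work directory.
        for (List<Row> partition : partitions.values()) {
            for (Row row : partition) {
                // Convert/map the CacheDataRow into an Ignite 3 tuple.
                Map<String, Object> tuple = new LinkedHashMap<>();
                tuple.put("ID", row.key());
                tuple.putAll(row.fields());
                // In the real tool the tuple would go to a DataStreamerTarget;
                // here we simply collect it.
                streamed.add(tuple);
            }
        }
        return streamed;
    }
}
```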

Components and Steps

Apache Ignite 2 Persistence Reader

This component must use a stripped-down version of Ignite 2 to read the persistence files without starting a full cluster; for instance, the network capabilities of Ignite are not required for this use case. The StandaloneGridKernalContext class provides an example of a stripped-down version of the AI2 GridKernalContext and can be used as a reference implementation.

The context should correctly load/restore the:

  • Marshaller context
  • Stored cache configurations
  • Partition Files

This component should not attempt to recover data from the Write Ahead Log (WAL) or execute any other maintenance process over the persistence files.

SQL DDL Generator

The target tables need to be explicitly created before streaming data to the Ignite 3 cluster. Unlike Ignite 2, Ignite 3 only allows creating tables explicitly through the SQL API. As a result, we need a component that translates the Ignite 2 cache configurations into SQL `CREATE TABLE` statements.

The Ignite 2 cache configuration can be found in the node’s persistent work directory or in the cluster configuration. In either case, the correct way to fetch these configurations is through the GridKernalContext API after the context has finished reading the metastore.

The Query Entities (QEs) in the cache configuration describe the key fields, which must be annotated with the primary key constraint. Other constraints, such as NOT NULL, must be respected as well, and field aliases should also be supported.
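The translation could look roughly like the sketch below, assuming the QE information (field types, key fields, NOT NULL fields, aliases) has already been extracted from the stored cache configuration into plain collections; the `generateDdl` method and its parameters are illustrative, not part of the actual tool.

```java
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

// Minimal sketch of the QueryEntity -> CREATE TABLE translation. The QE data
// is assumed to have been extracted into plain collections beforehand.
public class DdlSketch {

    public static String generateDdl(String table,
                                     Map<String, String> fieldTypes,   // field -> SQL type
                                     Set<String> keyFields,
                                     Set<String> notNullFields,
                                     Map<String, String> aliases) {    // field -> alias
        StringJoiner cols = new StringJoiner(", ");
        for (Map.Entry<String, String> e : fieldTypes.entrySet()) {
            String name = aliases.getOrDefault(e.getKey(), e.getKey());
            String col = name + " " + e.getValue();
            if (notNullFields.contains(e.getKey())) {
                col += " NOT NULL";
            }
            cols.add(col);
        }
        // Key fields become the PRIMARY KEY constraint (aliases applied here too).
        StringJoiner pk = new StringJoiner(", ");
        for (String k : keyFields) {
            pk.add(aliases.getOrDefault(k, k));
        }
        return "CREATE TABLE " + table + " (" + cols + ", PRIMARY KEY (" + pk + "))";
    }
}
```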

Predicted Cache Configurations Scenarios

We have identified the following cache configuration scenarios:

Scenario 0:

  • Cache has just one Query Entity (QE).
  • All the records (CacheDataRows/BinaryObjects) have exactly the same attributes (columns) as defined in the Query Entity (QE).
  • All the record attributes are valid native SQL types.

Scenario 1:

  • Cache with no Query Entity (QE) defined.

Scenario 2:

  • Cache with multiple Query Entities defined.

Scenario 3:

  • Cache with a single Query Entity (QE) defined.
  • Some records do not match the defined QE: they may carry additional attributes or miss attributes defined in the QE, and their attribute values may be incompatible with the datatype declared in the QE or may not be valid native SQL types (for example, the Organization cache in the Ignite 2 examples).

Given each scenario, the SQL generator must:

Scenario 0: 

  • Directly map every attribute (name/type/constraints) from the original QE to the table schema.
  • When the QE key or value is a simple native Java type (int, long, String, etc.), it will likely not have an explicit field name in the QE. In this case, the field name should be selected from a list of candidates, ensuring that the chosen candidate does not collide with an explicitly defined attribute name.
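The candidate-selection logic could look like this minimal sketch; the candidate names used in the test (such as "ID") are assumptions for illustration, not the tool's actual defaults.

```java
import java.util.List;
import java.util.Set;

// Sketch of selecting a field name for an unnamed native key/value: walk an
// ordered candidate list and return the first name that does not collide with
// an explicitly defined attribute.
public class FieldNameSketch {

    public static String pickName(List<String> candidates, Set<String> existingFields) {
        for (String candidate : candidates) {
            if (!existingFields.contains(candidate)) {
                return candidate; // first candidate that does not collide
            }
        }
        throw new IllegalStateException("All candidate names collide: " + candidates);
    }
}
```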

Scenario 1:

  • Define a table with two binary attributes (ID: byte[], VAL: byte[]). The attribute names should be the first candidates from the list, since there are no other fields to cause collisions.
  • Mapping complex objects to byte arrays requires more care. The easiest and most versatile approach is to serialize the objects stored in the CacheDataRow. A widely available serialization format such as JSON ensures that custom client applications can easily access the data; in step 4, the binary objects will be converted to JSON and the JSON string stored.
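The serialization step could be sketched as below, assuming the binary object's fields have already been read into a map. A hand-rolled JSON writer is used here only to keep the example dependency-free; a real implementation should use a proper JSON library.

```java
import java.util.Map;
import java.util.StringJoiner;

// Sketch of serializing a deserialized binary object (field -> value map) into
// a JSON string for the Scenario 1 VAL column. This minimal writer only
// handles string and numeric values; a real tool would use a JSON library.
public class JsonSketch {

    public static String toJson(Map<String, Object> fields) {
        StringJoiner json = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            Object v = e.getValue();
            String value = (v instanceof String s) ? "\"" + escape(s) + "\"" : String.valueOf(v);
            json.add("\"" + escape(e.getKey()) + "\": " + value);
        }
        return json.toString();
    }

    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```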

Scenario 2: 

  • Ignite 3 does not support this directly, and multiple Query Entities per cache are rare and discouraged even in AI2. This scenario can safely remain unsupported until there is user demand.

Scenario 3: 

  • Generate the table schema with the same algorithm as Scenario 0.
  • Unfortunately, there is no static check to ensure that all the records comply with the Query Entity (QE). We may implement a `dry-run` operation mode to assert this without moving data, although this would likely be an expensive operation as well.

The SQL Generator should also be exposed as a separate CLI command to enable the client to manually tune the table schema before migrating data.

CacheDataRow to AI3 Records Mapper

The mapping between CacheDataRows and Ignite 3 Tuples can be done in two steps. First, the CacheDataRow key and value (KeyCacheObject and CacheObject) must be adapted into a tuple, which requires analyzing the concrete CacheObject implementation. For BinaryObjectImpl instances, the row field names can be resolved by retrieving the BinaryObjectImpl ‘rawType’. Native Java types, on the other hand, might not have a defined field name in the CacheObject, as discussed previously. In this case, we can map the value to a placeholder and resolve the actual field name in the next step.

The second step is to map the original field names to the target table schema. The schema can be retrieved using the ‘ClientTable#getLatestSchema’ method. During this step, a mismatch between a tuple’s field names or types and the schema indicates that we are dealing with Scenario 3 or that the schema was altered incorrectly. As a result, we should allow the user to choose how to handle the mismatched columns/fields:

  1. Ignore the mismatched field/columns (possible data loss)
  2. Skip the mismatched records (possible data loss)
  3. Abort the migration procedure, which allows the user to change the table schema to fix the mismatched fields manually.
  4. Pack extra or mismatched fields in the CacheDataRow into an additional binary column. This option would prevent data losses from the migration by serializing mismatched fields into an additional column. Clients could use the data in this field to fine-tune their Ignite 3 tables after the migration.
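The four policies could be modeled roughly as in the sketch below. The enum constants mirror the `--mode` CLI values; everything else (the plain-map "tuple", the `__EXTRA__` column name, and the `toString()`-based packing) is a simplified stand-in for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Sketch of applying the --mode policy to a mapped record. `tuple` holds the
// mapped fields and `schemaColumns` the target table's columns; both are
// simplified stand-ins for the real Tuple and schema objects.
public class MismatchSketch {

    enum MigrationMode { ABORT, SKIP_RECORD, IGNORE_COLUMN, PACK_EXTRA }

    /** Returns the tuple to stream, or null when the record must be skipped. */
    public static Map<String, Object> apply(Map<String, Object> tuple,
                                            Set<String> schemaColumns,
                                            MigrationMode mode) {
        Map<String, Object> matched = new LinkedHashMap<>();
        Map<String, Object> extra = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : tuple.entrySet()) {
            (schemaColumns.contains(e.getKey()) ? matched : extra).put(e.getKey(), e.getValue());
        }
        if (extra.isEmpty()) {
            return matched; // record matches the schema
        }
        switch (mode) {
            case ABORT:
                throw new IllegalStateException("Mismatched fields: " + extra.keySet());
            case SKIP_RECORD:
                return null; // drop the whole record (possible data loss)
            case IGNORE_COLUMN:
                return matched; // drop only the mismatched fields (possible data loss)
            case PACK_EXTRA:
            default:
                // Serialize the extra fields into an additional binary column
                // (column name and serialization are placeholders here).
                matched.put("__EXTRA__", extra.toString().getBytes());
                return matched;
        }
    }
}
```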

In either case, a detailed error message should be presented to the user.

DataStreamer API

After mapping the tuples, the DataStreamerTarget interface will be used to publish them to the Ignite 3 cluster.

Command Line Interface

Common parameters

Parameter                       Description
--ignite2-directory             Work directory of one or many Ignite 2 nodes
--ignite2-node-consistent-id    Consistent ID of the Ignite 2 node
--ignite2-config-xml            Ignite 2 configuration XML

List Persistent Caches

Command spec:
  migration-tools cache list <--ignite2-directory> <--ignite2-node-consistent-id> <--ignite2-config-xml>

Description:
  List available caches on the node

Migrate Persistent Cache

Command spec:
  migration-tools cache migrate <--name> <--ignite2-directory> <--ignite2-node-consistent-id> <--ignite2-config-xml> <--cluster-endpoint-url> [--mode] [--rate-limiter] [--no-save-progress] [--resume-from] [--retry-limit] [--retry-backoff]

Description:
  Migrate a cache from an Ignite 2 work dir into an Ignite 3 cluster

Parameters

Parameter                 Description
--name                    The cache name to migrate
--cluster-endpoint-url    The URL of the Ignite 3 cluster; must have at least one argument
--mode                    Mapping error handling policy: ABORT (default), SKIP_RECORD, IGNORE_COLUMN, PACK_EXTRA
--rate-limiter            Limits the number of migrated records per second. Uses a very basic rate limiter implementation and may be prone to bursts
--no-save-progress        Disables saving a progress file at the end of the run
--resume-from             Resumes the migration based on the provided progress file
--retry-limit             Retries the migration up to N times on retryable errors. 0 (default) does not retry. Implies that saving progress is not disabled
--retry-backoff           Waits N seconds before the next attempt at migrating the cache. Default: 0 (retry immediately)

Generate SQL DDL Script

Command spec:
  migration-tools cache ddl <--ignite2-directory> <--ignite2-node-consistent-id> <--ignite2-config-xml> [--name]

Description:
  Generate the corresponding SQL DDL creation script for persisted cache configurations

Parameters

Parameter    Description
--name       Generates the DDL statement only for the provided cache. By default, statements for all caches are generated

Distribution

The migration tools CLI should be distributed in a Zip package similar to the one used for the main Ignite CLI. The distribution should provide two binaries: one for Windows and one for Unix-based systems.

Risks and Assumptions

The methods discussed in this document assume that all applications and workloads have been stopped and that the cluster was shut down gracefully, so that the latest checkpoints contain the latest data, before any attempt to read the Apache Ignite 2 data.

The cache migration process also requires a live Apache Ignite 3 cluster, the target cluster. Installing and configuring this cluster is out of the scope of this document. The migration tool should be able to recreate the caches/tables automatically for most use cases using the SQL DDL Generator. However, the client may optionally create or alter the tables before running the persistent data migration tool. Nonetheless, the success of the migration depends on the compatibility of the AI3 table schema with the mapping process.

We decided not to rely on Snapshots for this tool.

Discussion Links

// Links to discussions on the devlist, if applicable.

Reference Links

// Links to various reference documents, if applicable.

Tickets

