...

The following new APIs need to be added to Sidecar to facilitate data transfer.

List-files endpoint

An endpoint on the source that lists all instance files as URLs, so that a sidecar can use those URLs to download the files. Along with the list of files to download, sending the total number of bytes to download helps check whether the success threshold has been met before proceeding further. Cassandra uses several types of directories: data (data_file_directories), commit log (commitlog_directory), hints (hints_directory), CDC (cdc_raw_directory), and so on.

Each file in those directories is listed by the sidecar as a URL of roughly this form: "<api_version>/migration/files/<type_of_directory>/<optional_index>/" followed by the file's path relative to the directory.

For example:

If cassandra.yaml has a configuration like:

data_file_directories:/var/lib/cassandra/data

commitlog_directory:/var/lib/cassandra_commitlog

And if the source has the following files:

/var/lib/cassandra/data/ks1/table1/1_Data

...

.db
/var/lib/cassandra_commitlog/1.log

Then the file list exposed by the source sidecar will contain the following URLs:

/api/v1/migration/files/data_file_directories/0/ks1/table1/1_Data.db

/api/v1/migration/files/commitlog_directory/1.log

Notes:

  • data_file_directories can hold a list of directories, so an index is used to identify and distinguish each directory.
  • This endpoint can also accept a list of files/directories to exclude so that only desired files are listed.
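The URL scheme and the notes above can be illustrated with a minimal Python sketch. It is not the Sidecar implementation: the function names, the in-memory (path, size) input, the file sizes, and the rule that only data_file_directories carries an index are assumptions made for illustration. Only the "/api/v1/migration/files" prefix and the example paths come from the text above.

```python
from pathlib import PurePosixPath

API_PREFIX = "/api/v1/migration/files"  # taken from the example URLs above

# Assumption: only directory types that hold a list of directories get an index.
INDEXED_TYPES = {"data_file_directories"}


def is_under(path, directory):
    """True if path lies somewhere below directory."""
    return PurePosixPath(directory) in PurePosixPath(path).parents


def to_url(dir_type, index, base_dir, path):
    """Build the list-files URL for one file under a configured directory."""
    relative = PurePosixPath(path).relative_to(base_dir)
    parts = [API_PREFIX, dir_type]
    if dir_type in INDEXED_TYPES:
        parts.append(str(index))
    parts.append(relative.as_posix())
    return "/".join(parts)


def list_files(directories, files, excludes=()):
    """Return (urls, total_bytes).

    directories: {dir_type: [base_dir, ...]} as configured in cassandra.yaml
    files:       iterable of (absolute_path, size_in_bytes)
    excludes:    files or directories to leave out of the listing
    """
    urls, total_bytes = [], 0
    for path, size in files:
        if any(path == ex or is_under(path, ex) for ex in excludes):
            continue
        for dir_type, base_dirs in directories.items():
            match = next(
                ((i, d) for i, d in enumerate(base_dirs) if is_under(path, d)),
                None,
            )
            if match is not None:
                urls.append(to_url(dir_type, match[0], match[1], path))
                total_bytes += size
                break  # a file belongs to at most one configured directory
    return urls, total_bytes


directories = {
    "data_file_directories": ["/var/lib/cassandra/data"],
    "commitlog_directory": ["/var/lib/cassandra_commitlog"],
}
files = [  # sizes are made up for the example
    ("/var/lib/cassandra/data/ks1/table1/1_Data.db", 100),
    ("/var/lib/cassandra_commitlog/1.log", 50),
]
urls, total_bytes = list_files(directories, files)
```

Returning the byte total together with the URLs lets the caller compare it against what was actually downloaded.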

Data copy task endpoints

New endpoints will be introduced to create, get and cancel (if required) data copy tasks

...

from the Sidecar. Each data copy task uses a file downloader. The file downloader is responsible for getting the list of files from the source, removing unnecessary local files, validating the integrity of existing files, and downloading files from the source. It makes no assumptions about how the source lists the files; to the downloader, each file is just a URL to download. It also uses a pluggable location mapper to map source file URLs to local file paths. Since files need to be copied to the same relative location at the destination, local paths remain unchanged below the directories configured in cassandra.yaml. This endpoint can also accept a list of files/directories to exclude so that only the desired files are listed.
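The downloader's responsibilities can be sketched in Python as follows. This is a hedged illustration, not the proposed implementation: map_to_local and plan_download are hypothetical names, the digest strings are placeholders, and validating integrity by comparing digests is an assumption, since the proposal does not specify how existing files are validated.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

MARKER = "/migration/files/"  # from the example list-files URLs

# Assumption: only data_file_directories URLs carry a directory index.
INDEXED_TYPES = {"data_file_directories"}


def map_to_local(url, local_directories):
    """Location mapper: translate a source file URL into a local file path."""
    tail = urlparse(url).path.partition(MARKER)[2]
    dir_type, _, rest = tail.partition("/")
    index = 0
    if dir_type in INDEXED_TYPES:
        index_text, _, rest = rest.partition("/")
        index = int(index_text)
    return str(PurePosixPath(local_directories[dir_type][index]) / rest)


def plan_download(source_files, local_files, mapper):
    """Decide what to fetch and what to remove.

    source_files: {url: digest} as reported by the source
    local_files:  {local_path: digest} for files already on disk
    mapper:       callable mapping a source URL to a local path
    Returns (urls_to_download, local_paths_to_delete).
    """
    wanted = {mapper(url): (url, digest) for url, digest in source_files.items()}
    # Local files the source does not list are unnecessary.
    to_delete = sorted(path for path in local_files if path not in wanted)
    # Missing files and files whose digest differs must be downloaded.
    to_download = sorted(
        url for path, (url, digest) in wanted.items()
        if local_files.get(path) != digest
    )
    return to_download, to_delete


local_directories = {
    "data_file_directories": ["/var/lib/cassandra/data"],
    "commitlog_directory": ["/var/lib/cassandra_commitlog"],
}


def mapper(url):
    return map_to_local(url, local_directories)


data_url = "/api/v1/migration/files/data_file_directories/0/ks1/table1/1_Data.db"
log_url = "/api/v1/migration/files/commitlog_directory/1.log"
source_files = {data_url: "aaa", log_url: "bbb"}  # placeholder digests
local_files = {
    "/var/lib/cassandra/data/ks1/table1/1_Data.db": "aaa",   # intact, kept
    "/var/lib/cassandra_commitlog/1.log": "stale",           # digest differs
    "/var/lib/cassandra_commitlog/old.log": "ccc",           # not on the source
}
to_download, to_delete = plan_download(source_files, local_files, mapper)
```

Because the mapper is passed in as a callable, the downloader stays independent of the URL layout. A real mapper would also need to reject URLs whose relative part escapes the configured directory.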

File download/transfer endpoint

...

Downloads a single file.

File digest endpoint

...

Returns the digest of the single file specified in the path.
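As a rough illustration of what such an endpoint might compute, the Python sketch below hashes a file in fixed-size chunks so that large files never have to be loaded into memory at once. The proposal does not name a digest algorithm here; SHA-256, the function name file_digest, and the chunk size are assumptions of this example.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of the file at path, reading it chunk by chunk.

    The algorithm default is an assumption; any name accepted by
    hashlib.new() can be passed instead.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A destination could compare this value with a digest computed over its local copy to decide whether the file needs to be downloaded again.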

Security

The proposed endpoints are required only while a migration is being performed, so they are not enabled by default and should be enabled on demand. There is no point in making them available when no migration is in progress, or in enabling them for instances for which no migration is planned. To facilitate this, a request filter will be introduced for these endpoints to allow or deny requests. How the filter is configured is an implementation detail.
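One possible shape for such a filter is sketched below in Python. Since configuration is left to the implementation, everything here is an assumption: the class name, the per-instance enable/disable calls, and the "/api/v1/migration/" prefix (inferred from the list-files example URLs) are illustrative only.

```python
MIGRATION_PREFIX = "/api/v1/migration/"  # assumed common prefix of the new endpoints


class MigrationRequestFilter:
    """Deny migration requests unless migration is enabled for the instance."""

    def __init__(self):
        # Empty by default, so migration endpoints start out disabled.
        self._enabled_instances = set()

    def enable(self, instance_id):
        """Allow migration requests for one instance (on demand)."""
        self._enabled_instances.add(instance_id)

    def disable(self, instance_id):
        """Stop allowing migration requests once the migration is done."""
        self._enabled_instances.discard(instance_id)

    def allows(self, instance_id, path):
        """Return True if the request may proceed."""
        if not path.startswith(MIGRATION_PREFIX):
            return True  # not a migration endpoint, so this filter does not apply
        return instance_id in self._enabled_instances


request_filter = MigrationRequestFilter()
request_filter.enable(1)  # a migration is planned for instance 1 only
```

Keeping the allow list empty by default matches the requirement that the endpoints are unavailable unless a migration has been explicitly enabled.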

...