Status
Current state: "Finished"
Discussion thread: here
JIRA: SOLR-15086
Released: no
Motivation
Solr's current backup/restore functionality has several frustrating limitations.
Current index backups are based on full snapshots. Snapshot-based backups are slow and expensive because they copy all the files of the index regardless of how little the index may have changed since the last backup. Some, much, or all of the backup may be spent transferring data already present in the backup repository - a needless inefficiency.
Solr backups are also unreliable because a successful backup does not protect against corrupted indices. A backup of a corrupted index will succeed as long as all the index files exist on disk. This means that keeping rolling backups over many days may not protect against a corrupted index unless custom scripts run CheckIndex on each backup (which can be prohibitively expensive).
Additionally, the two backup repositories which Solr supports (LocalFileSystemRepository and the deprecated HdfsBackupRepository) are disconnected from the cloud ecosystems many deployments now live in. There are many cheap and virtually infinite storage options such as S3, GCS and Azure Blob Store that are used widely by ops personnel, but which Solr has no support for. This often results in ops teams writing scripts to copy backups to and from these more trusted blob stores.
Lastly, restoring the index to an existing SolrCloud collection is not supported by Solr. Instead, the restore command must create a completely new collection. This means that to avoid having clients change the endpoint for indexing or querying, users are forced to create an alias that points to the source collection and invoke a collection API to point that alias to the newly restored collection.
This SIP aims to address each of these pain points individually. It proposes that backups be changed to be:
- Incremental i.e. only the changed data is saved instead of a full copy. This allows users to save on storage costs as well as significantly speed up the backup process if only small changes have happened since the last backup.
- Cloud friendly i.e. supports one or more blob storage systems available in major public clouds such as Amazon S3, Google Cloud Storage, Azure Blob Storage etc.
- Safe against index corruption i.e. backups should succeed only if the backed up index is not corrupt.
- Restorable to existing collections i.e. it should be possible to restore to the source collection (or any existing collection, assuming it is compatible with the source).
Observant readers might recognize the first and third goals (incremental backups and safety against index corruption) from code that Cao Manh Dat proposed in SOLR-13608. In that sense this SIP is a superset of that ticket, created to cover a broader swath of functionality and generate more discussion/review of the design. Both Dat's ticket and this SIP are informed by code written by Dat, Shalin Mangar, and others, which is available in rough form here.
Public Interfaces
The proposed changes involve changes to several different levels of public interfaces.
- At the HTTP API level it proposes slight changes to the Backup and Restore APIs (at both the Collection and Core Admin layers). It also proposes the introduction of two wholly new backup APIs: a "list backups" API and a "delete backup" API.
- At the Java API level it proposes changes to the interfaces used to define backup repositories. Notably: `BackupRepository`.
- At the file-format level this SIP proposes a new format for storing backups on disk, which is a public interface in a very limited sense.
See the "Proposed Changes" section below for details on how these interfaces are likely to change.
Proposed Changes: Overall
This SIP proposes a handful of related changes to Solr's backup/restore functionality. These changes will be discussed as they relate to the four motivations mentioned in the "Motivation" section above.
Restore to existing collections
Solr can support restoring to existing collections by making use of the "read only" mode that was introduced in SOLR-13271. The restore API can put the target collection in read-only mode, restore a backup for each shard, and then toggle off "read only" mode.
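The read-only restore sequence described above can be sketched as follows. This is only an illustration: `ClusterOps`, `InPlaceRestore`, and `RecordingOps` are hypothetical names, not Solr classes, and lifting read-only mode after a failed shard restore is an assumption that the SIP does not settle.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical cluster operations; these names are illustrative, not Solr APIs. */
interface ClusterOps {
    void setReadOnly(String collection, boolean readOnly);
    List<String> shardsOf(String collection);
    void restoreShard(String collection, String shard, int backupId);
}

final class InPlaceRestore {
    private InPlaceRestore() {}

    /** Restores every shard while the collection is read-only, then re-enables writes. */
    static void restore(ClusterOps ops, String collection, int backupId) {
        ops.setReadOnly(collection, true);
        try {
            for (String shard : ops.shardsOf(collection)) {
                ops.restoreShard(collection, shard, backupId);
            }
        } finally {
            // Assumption: read-only mode is lifted even if a shard restore throws.
            ops.setReadOnly(collection, false);
        }
    }
}

/** In-memory stand-in that records the calls it receives, for demonstration only. */
final class RecordingOps implements ClusterOps {
    final List<String> calls = new ArrayList<>();

    public void setReadOnly(String collection, boolean readOnly) {
        calls.add("readOnly=" + readOnly);
    }

    public List<String> shardsOf(String collection) {
        return List.of("shard1", "shard2");
    }

    public void restoreShard(String collection, String shard, int backupId) {
        calls.add("restore " + shard + "@" + backupId);
    }
}
```

Calling `InPlaceRestore.restore(new RecordingOps(), "myCollectionName", 10)` records the read-only toggle, one restore per shard, and the final toggle back, in that order.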
Safety against corruption
Corruption checks can be done by computing a checksum for each index file as it is read and prepared for uploading to the backup repository. This checksum can then be compared to the checksum stored at the end of each Lucene index file. If the checksums don't match, the backup can be aborted.
This ensures that files are uncorrupted when they are initially backed up. Because the checksum is computed while the file is already being read for backup, it also avoids the expensive, separate full-file read that a standalone check would require. This method, however, does not protect against the case where an existing file is corrupted _after_ backup.
Note that unused files in the backup repository may be left over e.g. if some files from this backup or from other shards had already been written. Those can be cleaned up using a purge operation.
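A minimal sketch of the checksum-while-copying idea follows. It uses a deliberately simplified file layout (a body followed by an 8-byte big-endian CRC32 of that body). The real Lucene footer has more structure than this, and `VerifyingCopy` and its methods are hypothetical names, not part of Solr or Lucene.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

final class VerifyingCopy {
    private VerifyingCopy() {}

    /**
     * Copies length bytes from in to out. The CRC32 of the body (everything except
     * the trailing 8-byte checksum) is computed as the bytes stream past, so no
     * second read of the file is needed. Throws IOException on a mismatch.
     */
    static void copyAndVerify(InputStream in, OutputStream out, long length) throws IOException {
        if (length < 8) throw new IOException("file too short to hold a checksum");
        CRC32 crc = new CRC32();
        byte[] buf = new byte[8192];
        long remainingBody = length - 8;
        while (remainingBody > 0) {
            int n = in.read(buf, 0, (int) Math.min(buf.length, remainingBody));
            if (n < 0) throw new IOException("unexpected end of file");
            crc.update(buf, 0, n);
            out.write(buf, 0, n);
            remainingBody -= n;
        }
        byte[] footer = in.readNBytes(8);
        if (footer.length != 8) throw new IOException("truncated footer");
        out.write(footer);
        long stored = ByteBuffer.wrap(footer).getLong();
        if (stored != crc.getValue()) {
            throw new IOException("checksum mismatch: stored=" + stored + " computed=" + crc.getValue());
        }
    }

    /** Builds a toy file: the body followed by the big-endian CRC32 of the body. */
    static byte[] withFooter(byte[] body) {
        CRC32 crc = new CRC32();
        crc.update(body);
        return ByteBuffer.allocate(body.length + 8).put(body).putLong(crc.getValue()).array();
    }

    /** Convenience wrapper: true if the toy file copies cleanly, false if verification fails. */
    static boolean copiesCleanly(byte[] file) {
        try {
            copyAndVerify(new ByteArrayInputStream(file), new ByteArrayOutputStream(), file.length);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

In this sketch the mismatch is only detected after the bytes have already been written to the destination, which is one way the leftover files mentioned above can arise.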
Cloud friendly
This SIP proposes Solr add backup repositories for various common cloud providers (including Amazon, Google, and Microsoft). These will be targeted initially as contribs, though it's possible that testing concerns (uncertain access to free storage and other resources on each cloud platform) may require these repositories be maintained as 3rd-party plugins instead. This SIP proposes starting this support with repositories for S3, GCS (Google Cloud Storage), and ABS (Azure Blob Storage).
Cloud Friendly: Eventual Consistency Concerns
The blob stores provided by public cloud providers such as AWS, Azure and GCP have similar APIs. They all provide get, put and list operations. However, the guarantees behind those APIs differ. GCS is strongly consistent, and all three operations behave just as they would on a local file system. S3 and Azure Blob Store, however, provide eventual consistency with some caveats. For example, S3 has read-your-writes consistency for a get after a put, but a get-put-get sequence can return stale data. Similarly, a put-put-get sequence is also eventually consistent. The list operation is eventually consistent on both S3 and Azure Blob Store.
Therefore, a backup and restore implementation designed to work with blob stores must assume the least common denominator of consistency guarantees. In particular, it must assume that:
- List Directory operation can return stale or incomplete data
- Files cannot be overwritten
Listing files in a directory is a common operation in backup and restore. However, the list of files is usually well known at write time. Therefore, we write a manifest file per backup, per directory (if needed), once all files in the directory have been written. This manifest lists the files that are part of the backup (or directory). The list operation of the backup repository for blob stores can use the manifest file to return the list of files consistently. This is similar to how Lucene writes segment files at the end.
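The manifest approach can be illustrated with a toy, in-memory, write-once store. This is a sketch under stated assumptions: `WriteOnceStore`, `ManifestListing`, the manifest file name, and the newline-separated manifest body are all hypothetical and are not the format this SIP specifies.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy write-once blob store; a stand-in for S3/GCS/ABS, not a real client. */
final class WriteOnceStore {
    private final Map<String, byte[]> blobs = new TreeMap<>();

    /** Stores a blob and refuses overwrites, mirroring the "files cannot be overwritten" assumption. */
    void put(String key, byte[] data) {
        if (blobs.putIfAbsent(key, data) != null) {
            throw new IllegalStateException("overwrite not allowed: " + key);
        }
    }

    byte[] get(String key) {
        byte[] data = blobs.get(key);
        if (data == null) throw new IllegalArgumentException("no such blob: " + key);
        return data;
    }
}

final class ManifestListing {
    static final String MANIFEST = "manifest"; // hypothetical file name

    private ManifestListing() {}

    /** Writes all files first, then writes the manifest naming them as the last step. */
    static void writeDirectory(WriteOnceStore store, String dir, Map<String, byte[]> files) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, byte[]> e : files.entrySet()) {
            store.put(dir + "/" + e.getKey(), e.getValue());
            names.add(e.getKey());
        }
        store.put(dir + "/" + MANIFEST, String.join("\n", names).getBytes(StandardCharsets.UTF_8));
    }

    /** Lists a directory by reading its manifest, never by asking the store to list. */
    static List<String> list(WriteOnceStore store, String dir) {
        String body = new String(store.get(dir + "/" + MANIFEST), StandardCharsets.UTF_8);
        return body.isEmpty() ? List.of() : List.of(body.split("\n"));
    }
}
```

Because the listing is served by a single get of a key written exactly once, it does not depend on the consistency of the store's own list operation.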
Cloud Friendly: Licensing
Each public cloud provider has Java clients for their respective blob stores. GCS and S3 clients are licensed under ASL 2.0 while the Azure SDK for Java is MIT licensed. Therefore, there are no licensing issues in using them inside Apache Lucene/Solr.
Incremental
Regardless of the BackupRepository in use, this SIP proposes that backups be taken in an incremental manner, so that only those index files not stored by previous backups will be stored for the given backup. This will result in changes to the format of each backup. The general thrust of these changes is that a given backup "location" can (and should) be used to store multiple backups, and that each backup includes a metadata file used to indicate which Lucene index files are a part of the backup and the path to each of these within the umbrella backup "location".
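As a rough sketch of the incremental idea, the files to upload can be derived by comparing the current index files against the metadata of the previous backup. The example below assumes a file is identified by its name plus a checksum; that identity rule and the `IncrementalPlan` class are illustrative assumptions, not the comparison this SIP specifies.

```java
import java.util.Map;
import java.util.TreeMap;

final class IncrementalPlan {
    private IncrementalPlan() {}

    /**
     * Returns the entries of current (file name -> checksum) that the previous
     * backup does not already hold under the same name with the same checksum.
     */
    static Map<String, Long> filesToUpload(Map<String, Long> previous, Map<String, Long> current) {
        Map<String, Long> upload = new TreeMap<>();
        for (Map.Entry<String, Long> e : current.entrySet()) {
            Long known = previous.get(e.getKey());
            if (known == null || !known.equals(e.getValue())) {
                upload.put(e.getKey(), e.getValue());
            }
        }
        return upload;
    }
}
```

With a previous backup holding `_0.cfs` and `segments_1`, and a current index holding `_0.cfs` (unchanged), `_1.cfs`, and `segments_2`, only `_1.cfs` and `segments_2` would be uploaded. The new backup's metadata file would still reference all three files.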
Proposed Changes: Backup File Format
For details on the specific backup file format being proposed and how it enables incremental backups to be done accurately, see the SIP sub-page dedicated to this topic here.
Proposed Changes: HTTP API
As mentioned above this SIP proposes small tweaks to the existing backup and restore APIs. These are described in more detail below.
Collections API
Backup API
The backup API would largely remain the same except for the addition of an optional new parameter: `maxNumBackup`. This parameter would be an integer that limits the maximum number of backups to keep. For example, with maxNumBackup=5, if the provided location holds more than 5 backups, the oldest are deleted until only 5 remain.
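The retention rule for maxNumBackup can be sketched as a small function. This is an illustration only: `BackupRetention` is a hypothetical name, and the sketch assumes that a higher backup ID always means a newer backup and that maxNumBackup must be at least 1.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class BackupRetention {
    private BackupRetention() {}

    /** Returns the backup IDs to delete, oldest first, so that at most maxNumBackup remain. */
    static List<Integer> idsToDelete(List<Integer> existingIds, int maxNumBackup) {
        if (maxNumBackup < 1) {
            throw new IllegalArgumentException("maxNumBackup must be >= 1"); // assumption
        }
        List<Integer> sorted = new ArrayList<>(existingIds);
        Collections.sort(sorted); // assumption: a lower ID is an older backup
        int excess = Math.max(0, sorted.size() - maxNumBackup);
        return new ArrayList<>(sorted.subList(0, excess));
    }
}
```

For a location holding backups 0 through 6 and maxNumBackup=5, the sketch selects backups 0 and 1 for deletion.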
Additionally the semantics of the location parameter would change slightly with the change to incremental backups. Currently "location" must be a new (empty) directory. But with the proposed incremental backup changes this requirement would fall away, and it would in fact be recommended that the same location be used for all backups of a given collection.
Example backup request:
/admin/collections?action=BACKUP&name=myBackupName&collection=myCollectionName&location=/path/to/my/shared/drive&maxNumBackup=5

POST /v2/collections
{
  "backup-collection": {
    "collection": "myCollectionName",
    "name": "myBackupName",
    "location": "/path/to/my/shared/drive",
    "maxNumBackup": 5
  }
}
Example backup response:
{
  "responseHeader": {"status": 0, "QTime": 61},
  "success": {
    "127.0.0.1:60324_solr": {"responseHeader": {"status": 0, "QTime": 31}},
    "127.0.0.1:60323_solr": {"responseHeader": {"status": 0, "QTime": 31}}
  },
  "collection": "myCollectionName",
  "numShards": 2,
  "backupId": 0,
  "indexVersion": "8_4",
  "startTime": "2019-08-28T16:03:19.127Z",
  "indexSizeMB": 0.004,
  "shards": {
    "shard2": {
      "startTime": "2019-08-28T16:03:19.127Z",
      "indexFileCount": 17,
      "uploadedIndexFileCount": 17,
      "indexSizeMB": 0.003,
      "uploadedIndexFileMB": 0.003,
      "endTime": "2019-08-28T16:03:19.155Z",
      "shardBackupId": "md_shard2_id_0",
      "node": "127.0.0.1:60324_solr"
    },
    "shard1": {
      "startTime": "2019-08-28T16:03:19.127Z",
      "indexFileCount": 17,
      "uploadedIndexFileCount": 17,
      "indexSizeMB": 0.003,
      "uploadedIndexFileMB": 0.003,
      "endTime": "2019-08-28T16:03:19.155Z",
      "shardBackupId": "md_shard1_id_0",
      "node": "127.0.0.1:60323_solr"
    }
  }
}
The response shows important information about the backup, such as the total file count and index size, as well as the number of files and the amount of index data that were actually uploaded in this incremental backup. It also returns the backup ID, which can later be used to restore this backup.
Restore API
The restore API becomes simpler, as it no longer needs to create a new collection. One new parameter is added:
- backupId: An integer ID of the backup to be restored. This is optional. If no backupId is provided then the last backup is restored.
The old style restore API that creates a new collection can still be used for compatibility.
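The defaulting rule for backupId can be sketched as below. This is a minimal illustration: `RestoreTarget` is a hypothetical name, and treating the highest backup ID as "the last backup" is an assumption.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.OptionalInt;

final class RestoreTarget {
    private RestoreTarget() {}

    /** Returns the requested backup ID, or the most recent one when none was requested. */
    static int resolve(OptionalInt requested, Collection<Integer> available) {
        if (available.isEmpty()) {
            throw new IllegalStateException("no backups at this location");
        }
        if (!requested.isPresent()) {
            return Collections.max(available); // assumption: the highest ID is the last backup
        }
        int id = requested.getAsInt();
        if (!available.contains(id)) {
            throw new IllegalArgumentException("unknown backupId: " + id);
        }
        return id;
    }
}
```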
Example request:
/admin/collections?action=RESTORE&name=myBackupName&location=/path/to/my/shared/drive&collection=myRestoredCollectionName&backupId=10

POST /v2/collections
{
  "restore-collection": {
    "collection": "myRestoredCollectionName",
    "name": "myBackupName",
    "location": "/path/to/my/shared/drive",
    "backupId": 10
  }
}
Delete Backup API
This is a completely new Collection API to delete a backup, rotate backups and/or purge unused files left behind by a failed backup operation. It supports the following parameters:
- name - A string backup name. This backup name is resolved against the given location. This is a required parameter.
- location - A string location of the backup. This is a required parameter.
- repository - A string name of the repository to be used. This is optional. If not present, the default repository configured in the solr.xml is used.
- Exactly one of the following parameters, which identifies the backup data to be deleted:
  - backupId - An integer backup ID whose files are to be deleted.
  - maxNumBackup - An integer that limits the maximum number of backups to keep. For example, with maxNumBackup=5, if the provided location holds more than 5 backups, the oldest are deleted until only 5 remain.
  - purge - A flag that turns on 'purging' of any potentially "orphaned" files, i.e. files that are not part of any backup and therefore should be deleted.
At a given time, only one of backupId, maxNumBackup and purge parameters should be specified.
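The "exactly one of" rule above could be enforced with a check along these lines. This is a sketch only: `DeleteBackupParams` and its `Mode` enum are hypothetical names, not part of the proposed API.

```java
final class DeleteBackupParams {
    /** The three mutually exclusive ways a delete request can select data. */
    enum Mode { BY_ID, KEEP_LAST_N, PURGE }

    private DeleteBackupParams() {}

    /** Returns the selected mode, or throws unless exactly one selector was supplied. */
    static Mode mode(Integer backupId, Integer maxNumBackup, boolean purge) {
        int given = (backupId != null ? 1 : 0) + (maxNumBackup != null ? 1 : 0) + (purge ? 1 : 0);
        if (given != 1) {
            throw new IllegalArgumentException(
                "exactly one of backupId, maxNumBackup, purge must be specified");
        }
        if (backupId != null) return Mode.BY_ID;
        return maxNumBackup != null ? Mode.KEEP_LAST_N : Mode.PURGE;
    }
}
```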
/admin/collections?action=DELETE_BACKUP&name=myBackupName&location=/path/to/my/shared/drive&backupId=<number of backupId>

POST /v2/collections/backups
{
  "delete-backup": {
    "name": "myBackupName",
    "location": "/path/to/my/shared/drive",
    "backupId": 5
  }
}
Example response when deleting a particular backupId:
{
  "responseHeader": { .. },
  "collection": "collection1",
  "deleted": [
    {
      "backupId": 2,
      "startTime": "2019-08-27T09:11:17.230673Z",
      "size": 9581,
      "numFiles": 52
    }
  ]
}
Example response for purge:
{
  "responseHeader": { .. },
  "collection": "collection1",
  "purged": {
    "numIndexFiles": 2
  }
}
List Backup API
This is also a new Collection API to list the existing backups for a collection. Unlike the other Collection APIs described above, this API does not need to go through the overseer and can be answered by any node in the cluster. The following parameters are supported:
- name - A string name of the backup (usually the collection name)
- location - A string location of the backup. This is resolved against the repository.
- repository - An optional string to identify the repository. If none is provided, then the default repository configured in solr.xml is used.
/admin/collections?action=LISTBACKUP&name=myBackupName&location=/path/to/my/shared/drive

POST /v2/collections/backups
{
  "list-backups": {
    "name": "myBackupName",
    "location": "/path/to/my/shared/drive"
  }
}
Example response:
{
  "responseHeader": {"status": 0, "QTime": 1},
  "collection": "backuprestore_testbackupinc",
  "backups": [
    {
      "indexFileCount": 26,
      "indexSizeMB": 0.004,
      "shardBackupIds": {"shard2": "md_shard2_id_2", "shard1": "md_shard1_id_2"},
      "collection.configName": "conf1",
      "backupId": 2,
      "collectionAlias": "backuprestore_testbackupinc",
      "startTime": "2019-08-28T16:02:11.485Z",
      "indexVersion": "8.2.1"
    },
    {
      "indexFileCount": 2,
      "indexSizeMB": 0.0,
      "shardBackupIds": {"shard2": "md_shard2_id_3", "shard1": "md_shard1_id_3"},
      "collection.configName": "conf1",
      "backupId": 3,
      "collectionAlias": "backuprestore_testbackupinc",
      "startTime": "2019-08-28T16:02:14.375Z",
      "indexVersion": "8.2.1"
    },
    {
      "indexFileCount": 2,
      "indexSizeMB": 0.0,
      "shardBackupIds": {"shard2": "md_shard2_id_4", "shard1": "md_shard1_id_4"},
      "collection.configName": "conf1",
      "backupId": 4,
      "collectionAlias": "backuprestore_testbackupinc",
      "startTime": "2019-08-28T16:02:14.406Z",
      "indexVersion": "8.2.1"
    }
  ]
}
CoreAdmin APIs
Backup Core API
This is intended to be an internal API, called by the Backup Collection API. It supports two new parameters:
- shardBackupId - (Required) The shard backup ID assigned by the Backup Collection API for the current backup.
- prevShardBackupId - The previous shard backup ID against which the incremental backup is to be made. The previous shard backup is used as the base to find changed data.
/admin/cores?action=BACKUPCORE&core=core-node1&location=/path/to/my/shared/drive/myBackupName&prevShardBackupId=md_shard1_id_0&shardBackupId=md_shard1_id_1

POST /v2/cores/someCoreName
{
  "backup-core": {
    "location": "/path/to/my/shared/drive/with/backupName",
    "shardBackupId": "md_shard1_id_1",
    "prevShardBackupId": "md_shard1_id_0"
  }
}
Restore Core API
This is also an internal API to be called by the Restore Collection API. It supports two new parameters:
- incremental - An optional boolean that signals whether the data being restored is in the "incremental" format or not. Defaults to false.
- shardBackupId - The shard backup ID to be restored. This is a required parameter if incremental=true is specified.
/admin/cores?action=RESTORECORE&core=core-node1&incremental=true&location=/path/to/my/shared/drive/myBackupName&shardBackupId=md_shard1_id_1

POST /v2/cores/someRestoreCoreName
{
  "restore-core": {
    "incremental": true,
    "location": "/path/to/shared/drive/with/backupName",
    "shardBackupId": "md_shard1_id_1"
  }
}
Compatibility, Deprecation, and Migration Plan
There are 3 main areas where compatibility and deprecation must be considered: HTTP APIs, file format on disk, and Java APIs.
HTTP API Compatibility
As written, this SIP has few API changes that merit compatibility concerns. Request and response changes are primarily additive - new parameters or response sections are added but existing response sections are left as-is.
Java API Compatibility
This SIP may involve changes to Java interfaces that Solr exposes for plugin development - particularly BackupRepository and others used to define the physical storage mechanism that backups are written to and read from. These changes will be made in accordance with the community's standard back-compatibility policy. (New behavior goes in new methods; existing methods are sunset with a deprecation warning and eventually removed at the next major release.)
File Format Compatibility
In adding a new format for backups, we need to consider how long to support reading and writing the old format. Incremental backups are always preferable to their full-snapshot counterparts, so I propose that Solr drop support for creating snapshot-based backups as soon as incremental backup support is available, but that the ability to read the snapshot-based format be retained for the entire duration of the subsequent major release line. (i.e. if incremental-backup support was added in 8.8, Solr would maintain the ability to restore snapshot-based backups through 9.x)
(If the community disagrees and wants Solr releases to support creation of both types of backups simultaneously, the existing "repository" API parameter can be used to disambiguate the type of backup to be created.)
Example (assumes SIP is completed by Solr 8.9.0): UserA runs a single-node 8.7.0 cluster and creates regular backups for their collections. When 8.9.0 is released they perform one final snapshot-based backup and upgrade their cluster. Shortly after upgrading, their hard disk fails. They are able to restore the old snapshot-based backup by using Solr's restore API. As time goes on they can take backups with the same API call they've used previously, though the files on disk for each of these are now in the incremental-backup format.
Test Plan
Much of this SIP can be tested as any other Solr functionality. The API changes, the framework for backups and restoration, the new restoration to existing collections functionality, and our current "BackupRepository" implementations (HDFS and local file system) can all be tested in the usual way as JUnit tests. These will likely be built off of a modified AbstractCloudBackupRestoreTestCase. Tests for the API can use LocalFileSystemRepository without any burdensome setup.
Testing the public cloud-provider BackupRepository implementations (Azure, AWS, etc.) will be more difficult. Automated testing for these will require a test library for stubbing out each cloud provider (a la moto or parkplace for S3). Alternatively, we could try to obtain a grant of resources from each cloud provider to help us test this functionality.
Rejected Alternatives
N/A. For a discussion of why the SIP takes the form it does - particularly with respect to the proposed file format - see the "Proposed Changes" section above, esp. the "Eventual Consistency" subsection.