• Introduction
  • Purpose
  • References
  • Branch
  • Document History
  • Glossary
  • Use case
  • Feature Specification
  • Error Handling
  • Design Description
  • API Changes
  • DB Changes
  • UI Flow 
  • Hypervisors supported
  • Parallel operations that can happen during backing up of snapshot

Introduction

Currently, volume snapshots in CloudStack take a considerable amount of time to complete, blocking other operations on the VM in the meantime. This is because the volume snapshot operation involves two steps: creating a snapshot of the volume on primary storage, and then backing it up to secondary storage. This feature separates the creation of the snapshot on primary storage from its copy to secondary storage.

 

Purpose

This is a functional specification of the feature "Separate creation and backup operations for a volume snapshot".

References

JIRA Ticket

Branch

master

Document History

Author          Description       Date
Harika Punna    First Revision    May 2nd, 2017


Glossary

  • Snapshot - Refers to a volume snapshot unless otherwise specified.

Use case

  • The user wants to separate the creation and backup operations of a volume snapshot.

Feature Specification

  • As part of this feature, the user is notified that the snapshot has been successfully created as soon as it is created on primary storage, without waiting for the backup to secondary storage.

Error Handling

  • All errors at various levels of operations will be logged in management-server.log.
  • If the creation of the snapshot fails before it completes on primary storage, error handling is the same as it is today, i.e. the user is notified of the error that occurred.
  • If backing up a snapshot to secondary storage fails, the snapshot moves to the Error state and is later cleaned up by the storage GC thread. Its related entries in the snapshot_store_ref table are deleted whenever a retry fails.
  • If the management server is stopping or stopped while a snapshot is being backed up, the snapshot is moved to the Error state when the server restarts and is then cleaned up by the GC thread.


Design Description

This feature will only improve the experience with volume snapshots.

  • Once the snapshot is successfully created on primary storage, i.e. when its status in the snapshots table is "CreatedOnPrimary", the backup process starts and the result of the snapshot creation is returned to the caller.
  • No operation will be performed on a snapshot that is on primary storage and is in the backing-up state.
  • If the backup fails, it is attempted up to "backup.max.attempts" times, with an interval of "backup.retry.interval" between attempts. If the backup does not succeed within that number of attempts, the snapshot is also deleted from primary storage.
  • A separate thread pool will be maintained for backup task.

    Configuration parameters:

    Param Name               Default Value
    backup.max.attempts      3
    backup.retry.interval    300 seconds
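
The retry behaviour described above could be sketched roughly as follows. This is an illustrative sketch, not CloudStack code: the function name, the `backup_once` and `delete_from_primary` callbacks, and the "BackedUp" result label are assumptions made for the example. Only the two parameter names, their defaults, and the Error state come from this specification.

```python
import time
from typing import Callable

# Defaults from the configuration table above.
BACKUP_MAX_ATTEMPTS = 3              # backup.max.attempts
BACKUP_RETRY_INTERVAL_SECONDS = 300  # backup.retry.interval


def backup_with_retry(
    backup_once: Callable[[], bool],          # hypothetical: True if the copy to secondary succeeded
    delete_from_primary: Callable[[], None],  # hypothetical: removes the snapshot from primary
    max_attempts: int = BACKUP_MAX_ATTEMPTS,
    retry_interval: float = BACKUP_RETRY_INTERVAL_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try the backup up to max_attempts times and return the final state label."""
    for attempt in range(1, max_attempts + 1):
        if backup_once():
            return "BackedUp"  # assumed label for a successful backup
        if attempt < max_attempts:
            # Wait before the next attempt; no wait after the last one.
            sleep(retry_interval)
    # All attempts failed: the snapshot is removed from primary as well.
    delete_from_primary()
    return "Error"
```

Passing `sleep` as a parameter keeps the sketch testable without waiting for the real interval. In the actual design this loop would run on the separate backup thread pool rather than on the API request thread.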



     

API Changes

An additional parameter will be added to CreateSnapshotCmd. Its value decides whether the snapshot and copy operations are separated or the existing flow is used.

API               Parameter
createSnapshot    asyncBackup (optional)
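
As an illustration of how a client might pass the new parameter, the sketch below builds the query string for a createSnapshot call. The helper name and the volume ID value are hypothetical, and authentication parameters (API key and signature) are deliberately left out. The parameter is spelled as in the table above, and it is omitted entirely when the caller does not set it, on the assumption that the existing flow then applies.

```python
from typing import Optional
from urllib.parse import urlencode


def build_create_snapshot_query(volume_id: str, async_backup: Optional[bool] = None) -> str:
    """Build an unsigned query string for createSnapshot (illustrative helper)."""
    params = {
        "command": "createSnapshot",
        "volumeid": volume_id,
        "response": "json",
    }
    # asyncBackup is optional; leaving it out keeps the existing behaviour.
    if async_backup is not None:
        params["asyncBackup"] = "true" if async_backup else "false"
    return urlencode(params)


# Example with a placeholder volume ID.
print(build_create_snapshot_query("VOLUME-UUID", async_backup=True))
```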

 

 

DB Changes

NA

 

UI Flow

A checkbox will be added to the "Create Volume Snapshot" dialog box. When it is checked, the snapshot and copy operations are separated; when it is left unchecked, the existing flow is used.

 

Hypervisors supported

 XenServer, KVM


Parallel operations that can happen during backing up of snapshot

 

  1. Volume operations
    1. Snapshot on same volume 
    2. Snapshot on different volume

  2. VM Operations
    1. Start VM
    2. Stop VM
    3. Destroy VM (only when data disks are backing up)
    4. Reboot VM
    5. Reinstall VM (only when data disks are backing up)
    6. Attach ISO
    7. Attach disk
    8. Detach disk (other than the one backing up)
    9. Change service offering

 

VM/volume operations not listed above are not supported while a snapshot is being backed up.
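
The list above could be encoded as a simple lookup, sketched below. The operation identifiers, the function name, and its arguments are all hypothetical. The sketch also assumes that "only when data disks are backing up" means the volume currently being backed up is a data disk rather than the root disk.

```python
# Operations from the list above that carry no extra condition.
UNCONDITIONAL = frozenset({
    "snapshot_volume",  # on the same or a different volume
    "start_vm", "stop_vm", "reboot_vm",
    "attach_iso", "attach_disk", "change_service_offering",
})

# Operations allowed only when the volume being backed up is a data disk.
DATA_DISK_ONLY = frozenset({"destroy_vm", "reinstall_vm"})


def is_allowed_during_backup(
    operation: str,
    backing_up_disk_type: str = "DATADISK",  # "ROOT" or "DATADISK" (labels assumed)
    targets_backing_up_disk: bool = False,   # True if the operation acts on the disk being backed up
) -> bool:
    """Return True if the operation may run while a snapshot backup is in progress."""
    if operation in UNCONDITIONAL:
        return True
    if operation in DATA_DISK_ONLY:
        return backing_up_disk_type == "DATADISK"
    if operation == "detach_disk":
        # Only disks other than the one being backed up may be detached.
        return not targets_backing_up_disk
    # Anything not listed is not supported during backup.
    return False
```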

