With the current snapshot code, CloudStack always creates a snapshot on the host where the VM is running (for VMs in the Running state) or on the host where the VM last ran (for VMs in the Stopped state). Because the commands are not synchronized on the agent side, multiple commands sent to the backend at the same time can cause performance issues on the hypervisor. As a result, there is a high probability that the createSnapshot command will time out on the Xen side.
The solution is to limit the number of concurrent snapshots on a per-host basis. The threshold should be configurable, as the customer usually knows how many snapshots at a time the backend can handle. While that many snapshots are being processed by the backend, all subsequent snapshot commands scheduled for execution on the same host should wait in a queue. The wait timeout should also be a configurable value.
For example, the admin knows that his backend host can process 10 snapshots at a time under the current VM load. So he sets the global configuration parameter concurrent.snapshots.threshold.perhost to 10 and configures the desired expiration time for a createSnapshot job waiting for execution.
The following database tables are involved:
- sync_queue table
- sync_queue_item table
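For illustration, the fields of these two tables that the feature relies on can be modeled as below. This is a hypothetical Python mirror, not the actual schema; the real tables live in the cloud database and carry additional columns.

```python
# Illustrative model of the columns this feature uses; field names are
# assumptions based on the behavior described in this document.
from dataclasses import dataclass


@dataclass
class SyncQueue:
    sync_objtype: str          # type of object jobs are synced on, e.g. "Host"
    sync_objid: int            # id of the host the jobs are synchronized on
    queue_size: int = 0        # number of jobs currently executing on the backend
    queue_size_limit: int = 1  # concurrent.snapshots.threshold.perhost


@dataclass
class SyncQueueItem:
    queue_id: int              # points at the owning SyncQueue row
    content_type: str          # kind of queued work, e.g. "AsyncJob"
    content_id: int            # id of the queued createSnapshot async job
```

A job waiting for execution is a SyncQueueItem row; queue_size on the owning SyncQueue row counts jobs that have already been dequeued and sent to the backend.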
Once the createSnapshot operation is requested, either by the user via the API or by the CloudStack snapshot scheduler, the code:
1) Determines the host information by a) retrieving the VM information from the volume, and b) checking the host_id for the VM. If host_id is not null, it is the target host. If it is null, check last_host_id; if last_host_id is not null, it is the target host. If both fields are null, no synchronization is done - in most cases CloudStack won't encounter this.
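The host lookup in step 1 can be sketched as follows. This is a minimal illustration, not actual CloudStack code; the Vm record and find_sync_host name are stand-ins for the real VO classes and DAOs.

```python
# Sketch of step 1: pick the host to synchronize the snapshot job on.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vm:
    host_id: Optional[int]       # set while the VM is Running
    last_host_id: Optional[int]  # set after the VM is Stopped


def find_sync_host(vm: Optional[Vm]) -> Optional[int]:
    """Return the host id to synchronize on, or None to skip synchronization."""
    if vm is None:
        return None                  # no VM for the volume: no synchronization
    if vm.host_id is not None:
        return vm.host_id            # Running VM: use its current host
    if vm.last_host_id is not None:
        return vm.last_host_id       # Stopped VM: use the host it last ran on
    return None                      # both null: skip synchronization


# Example: a stopped VM that last ran on host 42
print(find_sync_host(Vm(host_id=None, last_host_id=42)))  # 42
```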
2) Once the host information is available, the createSnapshot async job is scheduled and synchronized on the host object. If the number of createSnapshot jobs currently being processed against this host is less than concurrent.snapshots.threshold.perhost allows, the job is dequeued from the sync_queue_item table and sent to the backend. Once the job is dequeued, sync_queue.queue_size is incremented by one.
3) When the createSnapshot job is purged from the sync_queue_item table (meaning it has completed on the backend and its result has been updated in the async job table), the sync_queue.queue_size field is decremented by one, and the next createSnapshot job waiting in the queue for the same host starts its execution.
4) If a createSnapshot job waits more than job.expire.minutes for execution, it expires and a failure is returned to the API caller.
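Steps 2 through 4 amount to a bounded per-host work queue. The sketch below, under assumed names and an in-memory queue instead of the database tables, shows the intended behavior: at most `threshold` jobs run per host, a completed job frees a slot for the next waiting job, and a job waiting past its expiration is failed.

```python
# Illustrative sketch of the per-host queueing in steps 2-4; not actual
# CloudStack code. `threshold` plays the role of
# concurrent.snapshots.threshold.perhost, `expire_seconds` of
# job.expire.minutes, and `active` of sync_queue.queue_size.
import time
from collections import deque


class HostSnapshotQueue:
    def __init__(self, threshold: int, expire_seconds: float):
        self.threshold = threshold   # max concurrent snapshots on this host
        self.expire = expire_seconds # how long a queued job may wait
        self.active = 0              # jobs currently running on the backend
        self.waiting = deque()       # queued (job, enqueued_at) pairs

    def submit(self, job):
        """Queue a createSnapshot job; returns (started, expired) jobs."""
        self.waiting.append((job, time.monotonic()))
        return self._dispatch()

    def complete(self, job):
        """Job finished on the backend; free a slot and dispatch the next one."""
        self.active -= 1
        return self._dispatch()

    def _dispatch(self):
        started, expired = [], []
        while self.waiting:
            job, enqueued_at = self.waiting[0]
            if time.monotonic() - enqueued_at > self.expire:
                self.waiting.popleft()
                expired.append(job)  # waited too long: fail it to the caller
                continue
            if self.active >= self.threshold:
                break                # host is saturated; keep the rest waiting
            self.waiting.popleft()
            self.active += 1
            started.append(job)      # send createSnapshot to the backend
        return started, expired
```

With threshold 2, submitting three jobs starts the first two immediately, holds the third, and starts it only when one of the running jobs completes.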
No new APIs are added. The existing createSnapshot API is now synchronized on the host object, but no new parameters are exposed to the end user.
No UI changes for the feature.