Currently, CloudStack does not guarantee IOPS from primary storage to VMs, especially on shared storage. In some cases (such as deploying a large number of VMs), this can lead to slow performance and long VM boot times.
In CloudStack 4.3, every VM deployed on XenServer has a child VHD image that stays on the same storage repository as its parent image (the master or golden image). This page describes a new approach that increases VM performance by decreasing the total IOPS load on primary storage. The method simply moves the parent image to a separate high-IOPS primary storage, similar to the XenServer IntelliCache feature. This method can reduce storage cost by 30-40%, compared to 80% for IntelliCache.
- One high-IOPS primary storage (a pre-configured SSD or other SSD-backed shared storage) for storing the parent image.
- One normal primary storage for ROOT volumes.
- User uploads a template and marks it as a golden template.
- User creates a new primary storage pointing to a high-IOPS partition (shared or local) and marks it as golden primary storage.
- User deploys a batch of VMs from a golden template, so all new VMs share the same parent image stored in golden primary storage.
Architecture and design description
Changes are required in the Allocator, Manager and Orchestrator layers to plan and prepare volumes.
- Add a golden PS property to the StoragePool, VirtualMachineTemplate, PrimaryDataStoreTO and TemplateObjectTO classes.
- DeploymentPlanningManager class:
New method planGoldenDeployment returns a DeployDestination for the golden image deployment.
Change method findSuitablePoolsForVolumes to find a suitable pool based on three types: golden, normal, or all.
- DeployDestination class:
New property DeployDestination goldenDest links to the golden deployment of the parent image.
- VolumeOrchestrator class:
Change the prepare method to check whether the VM's DeployDestination has a golden deployment; if so, clone from the template on that golden deployment.
- VolumeService class:
New method CreateBaseImage
- XenServerStorageProcessor class:
New method copy_vhd_from_pool_to_pool to move a cloned ROOT volume from golden PS to ROOT PS.
- vmopsSnapshot plugin:
Two new plugin calls: setVhdParent and copy_vhd_from_pool_to_pool
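The three pool types mentioned for findSuitablePoolsForVolumes can be illustrated with a minimal sketch. This is not the CloudStack implementation (which is Java); the `StoragePool` dataclass, the `golden` flag and the `find_suitable_pools` function are hypothetical names used only to show the filtering idea.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class PoolType(Enum):
    GOLDEN = "golden"   # only pools marked as golden PS
    NORMAL = "normal"   # only pools not marked as golden PS
    ALL = "all"         # no filtering


@dataclass(frozen=True)
class StoragePool:
    name: str
    golden: bool  # hypothetical flag mirroring the proposed golden PS property


def find_suitable_pools(pools: List[StoragePool], pool_type: PoolType) -> List[StoragePool]:
    """Filter candidate pools by the requested type (illustrative only)."""
    if pool_type is PoolType.GOLDEN:
        return [p for p in pools if p.golden]
    if pool_type is PoolType.NORMAL:
        return [p for p in pools if not p.golden]
    return list(pools)
```

In this sketch the parent image would be planned against the `GOLDEN` result and the ROOT volume against the `NORMAL` result.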
- When migrating a volume, XenServer currently merges the VM volume's VHD chain and migrates it to the new primary storage. With this feature, we can simply move the child image to the new storage repository by calling the copy_vhd_from_pool_to_pool plugin. This plugin is modeled on the existing CloudStack XenServer plugin.
- Taking a volume snapshot still follows the normal CloudStack process.
- Migrate VM: a suitable host must be found for live-migrating the VM. If a host cannot access the golden PS (e.g. because the golden PS is local storage), it is added to the avoid list.
- Taking a VM snapshot still follows the normal CloudStack process. Because a VM snapshot automatically exploits the capabilities of the underlying storage repository in which the VM's disk images are stored, it will reside on the ROOT PS.
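The avoid-list rule for VM migration can be sketched as follows. This is a simplified illustration, not CloudStack code: the mapping from host name to reachable pool names and the `build_avoid_list` function are assumptions made for the example.

```python
from typing import Dict, Set


def build_avoid_list(host_pools: Dict[str, Set[str]], golden_pool: str) -> Set[str]:
    """Return the hosts that cannot reach the golden PS.

    host_pools maps a host name to the names of the pools it can access
    (a hypothetical data shape chosen for this sketch).
    """
    return {host for host, pools in host_pools.items() if golden_pool not in pools}


def migration_candidates(host_pools: Dict[str, Set[str]], golden_pool: str) -> Set[str]:
    """Hosts that remain eligible after the avoid list is applied."""
    return set(host_pools) - build_avoid_list(host_pools, golden_pool)
```

If the golden PS is local to one host, every other host ends up in the avoid list, which means the VM cannot be live-migrated away from that host.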
Web API changes
- No new APIs are added
- Add new parameter enableGoldenPs to the createStoragePool, updateStoragePool, registerTemplate and updateTemplate APIs.
- New parameter enableGoldenPs is added to the listStoragePools and listTemplates responses.
- Add an "Enable Golden Storage" check box to the Add Primary Storage and Register Template dialogs.
- Add an "Enable Golden Storage" check box to the detail view of Primary Storage and Template.
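As an illustration of the API change, the sketch below builds an unsigned query string that carries the proposed enableGoldenPs parameter. The `build_query` helper and the example value `ssd-golden` are hypothetical, and the request signing and API key that a real CloudStack call requires are deliberately left out.

```python
from urllib.parse import urlencode


def build_query(command: str, enable_golden_ps: bool, **params: str) -> str:
    """Build an unsigned API query string (illustrative; signing is omitted)."""
    query = {
        "command": command,
        "response": "json",
        # Proposed parameter from this page; lower-case "true"/"false" is an assumption.
        "enableGoldenPs": str(enable_golden_ps).lower(),
    }
    query.update(params)
    # Sort the parameters so the output is deterministic.
    return urlencode(sorted(query.items()))


if __name__ == "__main__":
    print(build_query("createStoragePool", True, name="ssd-golden"))
```

The same helper could be used with updateStoragePool, registerTemplate or updateTemplate, since the page proposes the parameter for all four commands.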
Feature limitations for 4.5 release
- Feature is not supported for Managed data stores
- Feature is supported only for primary storage created in the cluster of type XenServer
- Q: Would it be good to include a bulk operation for this feature? In addition, does XenServer support parallel execution of these operations?
A: Other than going through a "for" loop and deploying VM after VM, I don't think CloudStack currently supports a bulk-VM-deploy operation.
It would be nice if CS did so at some point in the future; however, that is probably a separate proposal.
Q: Since you are having problems with slow VM boot times, will the VMs boot on the golden PS, i.e. while cloning? If so, spawning VMs will always be fast. But I see you start the VM after moving the cloned VHD to the ROOT PS and pointing the child VHD to its parent VHD on the golden PS; hence there will be network traffic between these two primary storages, which will obviously slow down the VM's performance forever?
A: Yes, the VM starts after the clone process and consumes network traffic while booting. With XenServer IntelliCache, while a VM is booting and running, there is likewise always network traffic between the shared storage (holding the base image) and the local storage (holding the base image cache). But instead of partially copying the base image from shared storage to local storage and putting all READ/WRITE IOPS on local storage, I *think* my approach is a little easier to customize and can perform better than IntelliCache (because the IOPS are divided between the golden PS and the normal PS).
Q: What if someone removes the golden primary storage containing the parent VHD (template) that all the child VHDs in the root primary storage point to? If so, all running VMs will crash immediately, since their child VHDs' parent is removed.
A: Yes, all VMs running on this PS will crash, so I think this should become a condition to check when someone wants to remove the golden primary storage. I will take note of that problem.
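The removal check suggested in this answer could look roughly like the sketch below. It is only an illustration: the child-to-parent mapping and the `can_remove_golden_pool` function are hypothetical, and how CloudStack would actually collect the VHD parent links is not specified on this page.

```python
from typing import Dict, Iterable, Set


def referenced_parents(golden_vhds: Iterable[str], child_parent: Dict[str, str]) -> Set[str]:
    """Parent VHDs on the golden PS that at least one child VHD still points to."""
    return set(golden_vhds) & set(child_parent.values())


def can_remove_golden_pool(golden_vhds: Iterable[str], child_parent: Dict[str, str]) -> bool:
    """Allow removal only when no child VHD references a parent on the golden PS.

    child_parent maps a child VHD UUID to its parent VHD UUID
    (a hypothetical data shape chosen for this sketch).
    """
    return not referenced_parents(golden_vhds, child_parent)
```

A removal request would be rejected while `can_remove_golden_pool` returns False, which avoids crashing VMs whose child VHDs depend on the golden PS.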
- Q: Tim Mackey challenge:
- XenServer hosts with multiple independent local storage are very rare. See this KB article covering how to create such storage: http://support.citrix.com/article/CTX121313
- By default local storage is LVM based, but to enable thin provisioning you'll want EXT3. See this blog for how to convert to EXT3: http://thinworldblog.blogspot.com/2011/08/enabling-thin-provisioning-on-existing.html
- It seems like you're planning on using Storage XenMotion to move the VHD from the golden primary storage to normal primary storage, but that's going to move the entire VHD chain and it will do so over the network. Here's a blog article describing a bit about how it works: http://blogs.citrix.com/2012/08/24/storage_xenmotion/. I'm reasonably certain the design parameters didn't include local->local without network.
A: Yes, I have been following this guide to attach a new SSD storage to the XenServer host. I have not yet tested with an LVM storage repository, so I will try this approach with an LVM-based repo. I did not use XenMotion to move the VHD from the golden PS to the normal PS; I used a simple Linux cp command to avoid moving the whole VHD chain. This idea is borrowed from the OpenStack XAPI plugins, which import a VHD from a staging area into the SR.
- Q: If someone wants to take a snapshot of the VM, will that snapshot then go to normal secondary storage or back to the golden master?
A: The snapshot will go to normal secondary storage. I have tested it and have a few running VMs from the same golden image in another SR. I will test live migration and migration between pools and report back soon.
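The copy step described in this answer (copy only the child VHD file instead of migrating the whole chain) can be sketched as a plain file copy. This is a standalone illustration, not the actual plugin: the function name, the assumption that both SRs are file-based and mounted as directories, and the `<uuid>.vhd` file naming are all assumptions. Re-pointing the copied child to its parent (the setVhdParent step) is not covered here.

```python
import shutil
from pathlib import Path


def copy_child_vhd(src_sr_dir: str, dst_sr_dir: str, vdi_uuid: str, new_uuid: str) -> Path:
    """Copy a single child VHD file from one SR directory to another.

    Assumes file-based SRs exposed as directories containing <uuid>.vhd files.
    Only the child file is copied; the parent stays where it is.
    """
    src = Path(src_sr_dir) / f"{vdi_uuid}.vhd"
    dst = Path(dst_sr_dir) / f"{new_uuid}.vhd"
    if not src.is_file():
        raise FileNotFoundError(str(src))
    if dst.exists():
        raise FileExistsError(str(dst))
    shutil.copyfile(src, dst)  # equivalent in spirit to a plain `cp`
    return dst
```

Because only the differencing file is copied, the amount of data moved is the size of the child VHD rather than the full chain.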
- Q: We need to make sure that CloudStack does not delete your golden template in the background. As it stands today with XenServer, if a template resides on a primary storage and no VDI is referencing it, the template will eventually get deleted. We would need to make sure that - even though another VDI on another SR is referencing your golden template - it does not get deleted (i.e. that CloudStack understands not to delete the template due to this new use case). Also, the reverse should still work: if no VDI on any SR is referencing this template, the template should get deleted in a similar fashion to how this works today.
A: I have tested it and can confirm that CloudStack did not delete the golden template in the background, and vice versa. I cloned a VDI from a VM, moved (copied) this VDI to another SR under a different UUID, and pointed it to the parent image in a different SR. After starting and stopping the VM, XenServer deleted neither the parent nor the child image.
- Q: Is it true that you are proposing that a given primary storage be dedicated to hosting only golden templates? In other words, it cannot also be used for traditional template/root disks?
A: Yes, because high-IOPS partitions such as SSDs will always have lower capacity compared to normal storage solutions. So I think golden primary storage should be dedicated to storing only golden templates.
- Q: We need to understand how this new model impacts storage tagging, if at all.
A: The storage tag field could still be employed and it would be in reference to the primary storage that houses the root disks (and VM snapshots)...not in reference to the golden primary storage that is used to house the golden templates.
When executing a Compute Offering with a storage tag of, say, XYZ, the orchestration logic would have to find a primary storage tagged as XYZ that is accessible to a given host AND that host would have to also be able to access a golden primary storage where the golden image could be placed (specifying a storage tag or tags for a golden primary storage probably would not be useful).
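The placement rule described above (a tagged primary storage for the root disk AND a reachable golden primary storage on the same host) can be sketched as follows. The `Pool` and `Host` dataclasses and the `place` function are hypothetical names for illustration, not CloudStack classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Pool:
    name: str
    golden: bool = False
    tags: Set[str] = field(default_factory=set)


@dataclass
class Host:
    name: str
    pools: List[Pool]  # pools this host can access


def place(hosts: List[Host], storage_tag: str) -> Optional[Tuple[str, str, str]]:
    """Return (host, root pool, golden pool) for the first host that satisfies both conditions."""
    for host in hosts:
        # The storage tag applies to the root-disk pool only, never to the golden PS.
        root = next((p for p in host.pools if not p.golden and storage_tag in p.tags), None)
        golden = next((p for p in host.pools if p.golden), None)
        if root is not None and golden is not None:
            return host.name, root.name, golden.name
    return None
```

A host that has a matching tagged pool but no access to a golden PS is skipped, which mirrors the AND condition in the answer.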
- Q: The copy_vhd_from_secondarystorage XenServer plug-in is not used when you're using XenServer + XS62ESP1 + XS62ESP1004. In that case, please refer to copyTemplateToPrimaryStorage(CopyCommand) method in the Xenserver625StorageProcessor class.
- Q: I assume you are using MCS for your golden image? What version of XD? Given you are using pooled desktops, have you thought about using a PVS BDM ISO and mounting it within your 1000 VMs? This way you can stagger reboots via the PVS console or Studio. This would require a change to your delivery group.
A: No, this approach does not require MCS, XD, PVS or anything related to XenDesktop.