A couple of weeks ago there was a community post about a possible issue with vCloud Director where the I/O load of clones all lands on a single ESX host. Duncan Epping and I did a little investigation and discovered that this is not a vCloud Director issue so much as a function of vSphere's cloning process. First we need to understand a few things.
- vApp templates in vCloud Director are not stored in vSphere as a "Template". They are simply powered off Virtual Machines.
- Powered off Virtual Machines stay registered to whichever host last ran them.
- DRS only moves powered on Virtual Machines as part of its load-based algorithm. The exception is Maintenance Mode, where you have the option to move powered off Virtual Machines as well.
- You can manually migrate powered off Virtual Machines.
- When a clone happens on a given host and the target location is a datastore seen by that host, it will be a block level copy. If the target is a host without access to the source files, this will usually result in a network copy over the management interface. (This will be investigated further in Part 2.)
- The host that owns the registered Virtual Machine will be the one that performs the I/O workload of the clone.
The last bullet is the key issue. Essentially, it is possible for multiple vApp templates to live on only a few hosts. That being said, we should also assume that each vApp template was powered up at some point, and that DRS placed them on different hosts when it did so. Once powered off, they remain on the last known host until they are manually moved or a Maintenance Mode operation triggers a move.
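To make the last bullet concrete, here is a minimal sketch in plain PowerShell (no PowerCLI or vCenter connection needed). It charges each clone to the host that owns the registered source Virtual Machine and tallies the result. The template names, host names, request list and the `Get-CloneLoadByHost` function are all hypothetical and exist only for illustration.

```powershell
# Hypothetical inventory: the host each powered off vApp template is registered on.
$templateHost = @{
    'web-template' = 'esx01'
    'db-template'  = 'esx01'
    'app-template' = 'esx02'
}

# Hypothetical deployment requests, each named by the template being cloned.
$requests = @('web-template', 'web-template', 'db-template', 'app-template', 'web-template')

function Get-CloneLoadByHost {
    param(
        [hashtable]$TemplateHost,
        [string[]]$Requests
    )
    # Each clone is charged to the host that owns the registered source VM.
    $Requests |
        Group-Object { $TemplateHost[$_] } |
        Sort-Object Name |
        Select-Object @{N='HostName';E={$_.Name}}, @{N='Clones';E={$_.Count}}
}

Get-CloneLoadByHost -TemplateHost $templateHost -Requests $requests
```

With this sample data, esx01 is charged four of the five clones simply because two of the three templates happen to be registered there.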
There is also the possibility that, over multiple Maintenance Mode operations, the powered off Virtual Machines end up on only a couple of hosts. When consumers deploy a vApp, the clone would then run on those hosts, dragging them down. The other issue is that if you are a provider and consumers deploy the SAME vApp template, the host where that vApp template is registered will handle most of the load for all those deployments. So how do we work around this for now?
Alan Renouf was kind enough to provide PowerCLI scripts that balance the powered off Virtual Machines across the hosts in a cluster. This can solve the issue where too many powered off Virtual Machines end up on the same host or group of hosts. The scripts can be found below.
Script 1 – lists the hosts in the clusters and the number of powered off VMs on them so you can see if this is relevant
    Get-Cluster | Sort Name | Foreach {
        Write-Host "Cluster: $_"
        $_ | Get-VMHost | Sort Name | Select Name, @{N="NumVMPoweredOff";E={@($_ | Get-VM | Where {$_.PowerState -eq "PoweredOff"}).Count}}
    }
Script 2 – moves the VMs equally among the hosts – note it does assume networks and datastores are the same on all hosts in a cluster.
    Get-Cluster | Foreach {
        Write "Balancing Cluster: $($_.Name)"
        # Sorted host list for this cluster; powered off VMs are dealt out across it in order
        $HostsinCluster = @($_ | Get-VMHost | Sort Name)
        $numberofhosts = $HostsinCluster.Count
        $hostnumber = 0
        $_ | Get-VMHost | Get-VM | Where { $_.PowerState -eq "PoweredOff" } | Foreach {
            $MoveHost = $HostsinCluster[$hostnumber]
            if ($_.VMHost -eq $MoveHost) {
                Write-Host "Leaving $($_) on $MoveHost"
            } Else {
                Write-Host "Moving $($_) to $MoveHost"
                Move-VM -VM $_ -Destination $MoveHost -Confirm:$false
            }
            # Advance to the next host, wrapping around at the end of the list
            If ($hostnumber -eq ($numberofhosts -1)) {
                $hostnumber = 0
            } Else {
                $hostnumber++
            }
        }
    }
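Before letting a script move anything, it can help to preview what a round-robin pass would do. The following is a minimal sketch of that idea in plain PowerShell. It is not part of Alan's scripts: the `Get-RoundRobinPlan` function, its parameters and the sample data are hypothetical, and it works on plain objects with `Name` and `VMHost` (host name as a string) properties rather than live PowerCLI objects.

```powershell
function Get-RoundRobinPlan {
    param(
        [string[]]$HostNames,      # host names in one cluster, already sorted
        [object[]]$PoweredOffVMs   # objects with Name and VMHost (string) properties
    )
    # Nothing to plan without at least one host (also avoids a modulo by zero).
    if ($HostNames.Count -eq 0) { return }

    $i = 0
    foreach ($vm in $PoweredOffVMs) {
        # Deal the VMs out across the hosts in order, wrapping around.
        $target = $HostNames[$i % $HostNames.Count]
        $action = if ($vm.VMHost -eq $target) { 'Leave' } else { 'Move' }
        [pscustomobject]@{
            VM     = $vm.Name
            From   = $vm.VMHost
            To     = $target
            Action = $action
        }
        $i++
    }
}

# Hypothetical sample: four powered off VMs all registered on esx01.
$vms = @(
    [pscustomobject]@{ Name = 'tmpl-a'; VMHost = 'esx01' }
    [pscustomobject]@{ Name = 'tmpl-b'; VMHost = 'esx01' }
    [pscustomobject]@{ Name = 'tmpl-c'; VMHost = 'esx01' }
    [pscustomobject]@{ Name = 'tmpl-d'; VMHost = 'esx01' }
)

Get-RoundRobinPlan -HostNames 'esx01', 'esx02' -PoweredOffVMs $vms | Format-Table
```

To feed it from a real cluster, one option would be to project the PowerCLI objects first, for example with `Select-Object Name, @{N='VMHost';E={$_.VMHost.Name}}`, so that the comparison is between plain host name strings.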
This takes care of the original issue of a single host or small group of hosts getting all the clone I/O workload. In Part 2 of this post, however, I will examine a slightly different and more interesting twist. I am in the process of reconfiguring my lab so we can see the deeper effects of the clone wars in vCloud Director. The question there really is: what happens, regardless of balance, if 100 people all deploy the SAME vApp template? Is there a way to mitigate that I/O in the design of vSphere under the covers so that it does not adversely affect the other Virtual Machines running on that same host?
Wow, nice find I will definitely have to show this to my team at work and get it rolling right away.
thanks for sharing this… I can see how it could be impacting.
Do you think VMware may add some kind of balancing mechanism to vCloud in a later release? Seems sensible.
We have gone back to engineering to ask if they can simply modify the clone process to pick either a random host, or better yet pick the host with the most resources to do the clone. Obviously that is a pretty big ask so it may take some time. In the interim we can do things like PowerCLI to work around it.
I’m a little surprised that VMware Engineers didn’t think about this. Maintenance mode and VM portability are extremely useful tools. As a result, polarization of workloads and resource utilization can occur. It’s not that much of a stretch.
Perhaps vApp templates should actually BE ‘Templates’ so that they DON’T move via DRS or maintenance mode. This is an operational annoyance I’ve complained to VMware about for a long time with no results which would, in a way, benefit this scenario.
Jas
I don’t think that would completely solve the I/O load issue. Even a VMTX template is still owned and registered by an ESX host. I don’t think it would be any different, as template files are still “Powered Off” and I believe are still moved as part of a MM operation if you elect to move powered off VMs. If not, they would be completely un-deployable while that host is in MM. The root issue is not the moves, but the traffic generated by the copy process all hitting the same host, based on which host owns the registered VM. Right now it is just something to be aware of as you are deploying vApps.
Any ill effects from moving around VM hosts underneath vCloud Director? i.e. Should we be using the vCloud API instead of vSphere?
Define “ill effects”? You can use the vCloud API to relocate Virtual Machines, and that is the preferred method. The main concern is that if you Storage vMotion something in vCenter and the datastore is not in use by vCD, bad things can happen. This is one reason why, for now, we disable Storage DRS in vSphere with vCD 1.5. Since the vCloud API gives you the commands to relocate, you should use it there. Of course, for a powered on Virtual Machine the host running it will handle the I/O load.
Ill effects meaning: is there anything that vCloud stores in its database that keeps track of or caches which host a VM is on? For instance: Console Proxy.
Not that I know of. If so then a VMotion would break the console connection more than a storage move.
What about VAAI? If the storage supports it, at least cloning inside the same storage system should not put a burden on the ESX host, correct?
That is an excellent point when dealing with a cluster or group of clusters where the storage array supports VAAI. However, if you read on to Parts 2 and 3 you will see that once you leave the cluster and there is no shared storage, the only way vCD/vSphere can move files between systems is via network copy, or import/export through vCD. Yes, VAAI would certainly help with the single vCenter scenario, EXCEPT across clusters, unless those clusters have the catalog items on a shared LUN that is VAAI enabled. That would be a good consideration I can add to Part 3.
However, again it will not help once vCD needs to deploy a vApp from one vCenter to another vCenter in the same cloud instance. Since users almost always deploy from a shared catalog those catalog items are assumed to be located in a single original location and then deployed from there. I agree that in the very simple single vCenter scenario, VAAI helps within a single cluster, but may still not help even with a single vCenter and separate compute clusters unless the catalog Datastore is shared between them, which is in many designs not the case. Really most designs at scale would have more than one provider vDC and I have seen none with a shared catalog Datastore. In that case which Provider the catalog is on and where it is going still needs to be looked at.
Yes, I read part 2 and 3, and it is clear VAAI usage is limited.
But a single vCenter scales to nearly 10k VMs, and a LUN dedicated to templates, visible to all clusters, is indeed a good thing to consider (assuming we are still below the config maximums).
VAAI will also not help with multiple storage systems, even on the same vCenter/Cluster.
Yes, those are good points too. However, I already know of one customer moving to an additional vCenter not for capacity, but because of the security requirements of the deployment. I also know of another customer with a single vCloud and two vCenters hosted in separate campus buildings. So it is happening for different reasons, and people will do it for multi-site designs and for reasons other than going over 10,000 Virtual Machines, since I have seen it already.
At the very least you have started thinking about this, which was the point of the experiment in the lab and of writing it all up. Time permitting, I may try to diagram a possible at-scale design showing a shared Catalog datastore hosted on a dedicated Catalog vDC. I need a bigger lab though 🙂
Thanks for the feedback. I added VAAI to Part 3 as a consideration at least in the same vCenter instance.