I wonder whether someone has already written about this extensively. Even better if Microsoft can dispel my concern, but from what I could read between the lines, the above-mentioned limitation may well turn out to be real.
To start the other way around: I was recently browsing several Microsoft sites full of marketing promises about System Center products. Concerning virtualization, SC Essentials 2010 claims to support up to 50 servers and 500 clients (enough for a private cloud, I would guess), and the genuine datacenter product, SCVMM 2012, can support more than 150 VM hosts. So we are hitting the high-speed road here: we purchase 100 hardware systems and go for an IaaS deployment, using Server 2008 R2 of course, because of its live migration and high availability “all you can eat” promises. Then again, let’s take a closer look at the specifications:
Live migration requires failover clustering to be enabled and configured on the hosts, plus the necessary SAN equipment providing the cluster shared volume(s). Here is the link to the notorious new feature, “Hyper-V: Using Live Migration with Cluster Shared Volumes in Windows Server 2008 R2”: http://technet.microsoft.com/en-us/library/dd446679(WS.10).aspx
So far no problems; I would say everything is under control. But what we really need is not manual migration (which happens quite rarely, judging from my experience) but rather high availability in case a virtualization host, or in Microsoft Failover Cluster parlance a “cluster node”, goes down. And there is already a TechNet article named “Hyper-V: Using Hyper-V and Failover Clustering”: http://technet.microsoft.com/en-us/library/cc732181(WS.10).aspx
What really disturbs the glorious voyage are two notes in that article. The first states: “Do not intentionally shut down a node while a virtual machine is running on the node.” To this I can only say, “Hey, but that is exactly what I need to protect my virtual machines from!” Funny, isn’t it? I’m just itching to try it out, but I don’t have enough resources for a test lab right now. The second note is what this post’s headline is actually about: “A maximum number of 16 nodes in the failover cluster are allowed.” Aha, I remember that from my MCSA 2003 exams. The question is what happens to our IaaS datacenter with 100+ cluster nodes in that case. Well, the obvious logical answer would be to create many host groups (or Services within the private cloud, as we call them in SCVMM 2012), thus providing small independent islands. But this still doesn’t answer the question: “What if I want to migrate a certain virtual machine from host-1 to host-17, the latter obviously being on another island?”
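To put rough numbers on the island problem, here is a small Python sketch. It is only back-of-the-envelope arithmetic: the 16-node limit is the one quoted from the TechNet note, while the sequential assignment of hosts to clusters and the function names (`partition_hosts`, `same_cluster`) are my own assumptions for illustration and say nothing about how SCVMM or Failover Clustering actually places hosts.

```python
MAX_NODES_PER_CLUSTER = 16  # limit quoted from the TechNet note


def partition_hosts(host_count, max_nodes=MAX_NODES_PER_CLUSTER):
    """Split hosts host-1..host-N into clusters of at most max_nodes.

    Hosts are assigned sequentially, which is an assumption made only
    to keep the example simple.
    """
    hosts = [f"host-{i}" for i in range(1, host_count + 1)]
    return [hosts[i:i + max_nodes] for i in range(0, host_count, max_nodes)]


def cluster_index(clusters, host):
    """Return the index of the cluster (island) that contains host."""
    for index, members in enumerate(clusters):
        if host in members:
            return index
    raise ValueError(f"unknown host: {host}")


def same_cluster(clusters, host_a, host_b):
    """True if both hosts are nodes of the same cluster."""
    return cluster_index(clusters, host_a) == cluster_index(clusters, host_b)


clusters = partition_hosts(100)
print(len(clusters))      # 7 islands for 100 hosts
print(len(clusters[-1]))  # the last island holds only 4 hosts
print(same_cluster(clusters, "host-1", "host-16"))  # True
print(same_cluster(clusters, "host-1", "host-17"))  # False
```

So with 100 hosts we end up with at least seven islands, and under this numbering host-1 and host-17 already sit in different ones. If live migration really depends on both hosts being nodes of the same failover cluster, as the requirement above suggests, that pair is exactly the case the question is about.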
I would be glad to hear a reasonable answer.