What is the likely reason virtual machines fail to restart?

A host in a 4-node HA/DRS cluster has failed. HA initiates restart tasks for the virtual machines on
the remaining hosts in the cluster, but half of them fail to restart.
What is the likely reason virtual machines fail to restart?


A.
HA admission control is blocking the restart of the virtual machines due to insufficient resources
to support the configured failover level of the cluster.

B.
The vMotion network has failed on the remaining hosts preventing the virtual machines from
migrating.

C.
Storage vMotion has failed on the datastore containing the virtual machines preventing them
from migrating.

D.
The failed host was the Master host in the cluster and no other hosts are able to perform the
necessary HA function to restart the virtual machines.


babar.munir

A is correct

vCenter Server uses admission control to ensure that sufficient resources are available in a cluster to provide failover protection and to ensure that virtual machine resource reservations are respected.

http://pubs.vmware.com/vsphere-50/index.jsp?topic=%2Fcom.vmware.vsphere.avail.doc_50%2FGUID-53F6938C-96E5-4F67-9A6E-479F5A894571.html

Bart

You can rule out B. vMotion DID work; only half (!) failed.
You can rule out C. Storage vMotion DID work; only half (!) failed.
You can rule out D. The other hosts ARE able; only half (!) failed.

So A. must be the answer.

Rich

vMotion is not used for HA and Storage vMotion is not necessary for HA since it is the VMs that are moved, not the storage.

The answer is A.

RDGR

Correction – The VMs are not moved. They are restarted.

Eduard

Isn’t it strange that you enable admission control to ensure that sufficient capacity exists to deal with host failures only to find out that the very same mechanism prevents you from starting virtual machines when it is necessary to use this reserved capacity?

So answer A cannot be correct in the sense that HA admission control is preventing a restart of the VMs. To back up that statement, read Doctor HA’s blog on the subject:

http://www.yellow-bricks.com/2012/03/06/ha-admission-control-does-not-disallow-ha-initiated-restarts/

However, if the answer is read differently, in the sense that VMware HA admission control may have been improperly configured, allowing the cluster’s failover capacity to drop below the capacity required to restart all VMs, then answer A is correct. So, in my opinion, answer A should have read:

‘The way HA admission control has been configured is blocking the restart of the virtual machines…’

Got

Totally agree. Your sentence is correct, but the original one is not.

andp75

Very good point, Eduard, very good indeed. As correctly stated in the article you’re referring to, HA Admission Control applies under normal, no-host-failure circumstances when new VMs are started. In the event of a host failure, the reserved capacity is used to restart VMs from the failed host(s), plus whatever additional capacity might be needed, obtained for example by reclaiming RAM from the running VMs on the surviving hosts through ballooning and swapping.
To recap, Admission Control does not apply to VMs that must be restarted due to a host failure.
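
To make the distinction above concrete, here is a toy model in Python. It is only a sketch under loud assumptions: it is not VMware code or any vSphere API, it looks at memory reservations only, and the names `ToyCluster`, `failover_pct`, and `can_power_on` are made up for illustration. It shows a user-initiated power-on being checked against the capacity held back for failover, while an HA-initiated restart is allowed to consume that reserve.

```python
from dataclasses import dataclass


@dataclass
class ToyCluster:
    """Hypothetical, memory-only model of a cluster after a host failure."""
    total_mb: int        # total memory on the surviving hosts
    reserved_mb: int     # memory already reserved by running VMs
    failover_pct: float  # fraction of capacity held back for failover

    def can_power_on(self, vm_reservation_mb: int, ha_restart: bool = False) -> bool:
        free_mb = self.total_mb - self.reserved_mb
        if ha_restart:
            # HA-initiated restart: the failover reserve may be used up,
            # so only the unreserved capacity limits the restart.
            return vm_reservation_mb <= free_mb
        # User-initiated power-on: the failover reserve must stay untouched.
        held_back_mb = self.total_mb * self.failover_pct
        return vm_reservation_mb <= free_mb - held_back_mb


cluster = ToyCluster(total_mb=1000, reserved_mb=700, failover_pct=0.25)
user_ok = cluster.can_power_on(100)                   # blocked by the reserve
ha_ok = cluster.can_power_on(100, ha_restart=True)    # reserve may be consumed
ha_too_big = cluster.can_power_on(400, ha_restart=True)
print(user_ok, ha_ok, ha_too_big)
```

In this toy model the last case still fails: even an HA restart cannot succeed when the VM's reservation exceeds what is actually unreserved, which is one way to read Eduard's point about a cluster whose failover capacity has dropped too low.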

BUN

http://pubs.vmware.com/vsphere-50/index.jsp?topic=%2Fcom.vmware.vsphere.avail.doc_50%2FGUID-53F6938C-96E5-4F67-9A6E-479F5A894571.html
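Admission control imposes constraints on resource usage, and any action that would violate these constraints is not permitted. Examples of actions that could be disallowed include the following:
■ Powering on a virtual machine.
■ Migrating a virtual machine onto a host or into a cluster or resource pool.
■ Increasing the CPU or memory reservation of a virtual machine.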