Compute node or hypervisor high availability

Each ThinkAgile CP compute node, storage controller, and network interconnect reports to the management portal every 60 seconds, sending metadata updates that include statistics such as CPU usage, memory usage, and power usage by node and by application instance.

The portal can also use these statistics to compute usage by virtual datacenter. The statistics are aggregated and averaged to provide data points for graphical display.

The information that a storage controller reports, on a per-compute-node basis, includes the iSCSI active session counts. The portal detects a node failure when all of the iSCSI active session counts for a particular node drop to zero. It can then restart the application instances from the failed node on one or more other nodes in the same migration zone that satisfy the compute constraints. This is called hypervisor high availability (HA).
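The detection and restart logic described above can be illustrated with a minimal Python sketch. This is not the portal's actual implementation: the `Node` and `Instance` types, the `node_failed` and `plan_restarts` functions, the shape of the session-count data, and the first-fit placement policy are all hypothetical and chosen only for illustration. The sketch also assumes that a node with no reported session counts is not treated as failed, and that the compute constraints are limited to free CPUs and memory.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    zone: str            # migration zone the node belongs to
    free_cpus: int
    free_mem_gb: int
    # iSCSI active session counts for this node, keyed by the storage
    # controller that reported them (hypothetical data shape).
    iscsi_sessions: Dict[str, int] = field(default_factory=dict)


@dataclass
class Instance:
    name: str
    node: str            # name of the node currently hosting the instance
    cpus: int
    mem_gb: int


def node_failed(node: Node) -> bool:
    """Treat a node as failed only when every reported count is zero."""
    counts = list(node.iscsi_sessions.values())
    # Assumption: no reports at all means "unknown", not "failed".
    return len(counts) > 0 and all(c == 0 for c in counts)


def plan_restarts(failed: Node, nodes: List[Node],
                  instances: List[Instance]) -> Dict[str, Optional[str]]:
    """Map each instance on the failed node to a target node, or None."""
    candidates = [
        n for n in nodes
        if n.zone == failed.zone and n.name != failed.name
        and not node_failed(n)
    ]
    plan: Dict[str, Optional[str]] = {}
    for inst in instances:
        if inst.node != failed.name:
            continue
        # First-fit placement (an assumed policy): take the first node
        # with enough free CPU and memory for this instance.
        target = next(
            (n for n in candidates
             if n.free_cpus >= inst.cpus and n.free_mem_gb >= inst.mem_gb),
            None,
        )
        if target is not None:
            # Reserve the capacity so later instances see what is left.
            target.free_cpus -= inst.cpus
            target.free_mem_gb -= inst.mem_gb
        plan[inst.name] = target.name if target is not None else None
    return plan


if __name__ == "__main__":
    nodes = [
        Node("node-1", "zone-a", 0, 0, {"sc-1": 0, "sc-2": 0}),
        Node("node-2", "zone-a", 8, 32, {"sc-1": 3, "sc-2": 3}),
    ]
    instances = [Instance("app-1", "node-1", 2, 8)]
    for node in nodes:
        if node_failed(node):
            print(node.name, plan_restarts(node, nodes, instances))
```

In this sketch, an instance that cannot be placed on any surviving node in the same migration zone is mapped to `None` rather than being moved to another zone, which mirrors the requirement that restarts stay within the migration zone.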
