Best practices for HA pairs
To keep your HA pair robust and operational, you must be familiar with the following configuration best practices.
You must not use the root aggregate for storing data.
Storing user data in the root aggregate adversely affects system stability and increases the storage failover time between nodes in an HA pair.
You must verify that each power supply unit in the storage system is on a different power grid so that a single power outage does not affect all power supply units.
You must use LIFs (logical interfaces) with defined failover policies to provide redundancy and improve availability of network communication.
Keep both nodes in the HA pair on the same version of ONTAP.
Follow the documented procedures when upgrading your HA pair.
Refer to the Upgrade, Revert or Downgrade Guide.
You must maintain a consistent configuration between the two nodes.
An inconsistent configuration is often the cause of failover problems.
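As a minimal sketch of such a consistency check, assume you have already exported each node's settings into a flat dictionary. The setting names and values below are hypothetical placeholders, not real ONTAP option names, and the function name is illustrative. The code compares the two nodes and reports every setting that differs or is missing on one side:

```python
def config_differences(node_a, node_b):
    """Return {setting: (value_on_a, value_on_b)} for every mismatch.

    A setting that is absent on one node is reported with None on that side.
    """
    diffs = {}
    for key in sorted(set(node_a) | set(node_b)):
        a_val = node_a.get(key)
        b_val = node_b.get(key)
        if a_val != b_val:
            diffs[key] = (a_val, b_val)
    return diffs


# Hypothetical exported settings; the keys are illustrative only.
node1 = {"ontap_version": "9.1", "timezone": "UTC", "ntp_server": "10.0.0.1"}
node2 = {"ontap_version": "9.1", "timezone": "UTC"}

for name, (a, b) in config_differences(node1, node2).items():
    print(f"{name}: node1={a!r} node2={b!r}")
```

An empty result means the two exported dictionaries match; it does not prove that every aspect of the nodes is identical, only the settings you chose to export.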
You must test the failover capability routinely (for example, during planned maintenance) to verify proper configuration.
You must verify that each node has sufficient resources to adequately support the workload of both nodes during takeover mode.
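A rough way to reason about takeover headroom is to add the peak utilization of both nodes and compare the sum with what a single node can deliver. The sketch below assumes that utilization is expressed as a percentage of one node's capacity, that the two nodes are identical, and that load adds linearly. These are simplifying assumptions, and the function name and the default 100 percent ceiling are illustrative rather than a documented sizing rule.

```python
def can_sustain_takeover(peak_a_pct, peak_b_pct, ceiling_pct=100.0):
    """Return True if one node could carry both peak workloads.

    peak_a_pct, peak_b_pct: peak utilization of each node, as a percentage
    of a single node's capacity (assumed identical hardware).
    ceiling_pct: highest combined utilization you are willing to accept.
    """
    for value in (peak_a_pct, peak_b_pct):
        if value < 0:
            raise ValueError("utilization must not be negative")
    return peak_a_pct + peak_b_pct <= ceiling_pct


# Hypothetical measurements for illustration.
print(can_sustain_takeover(40, 45))  # combined 85 percent fits on one node
print(can_sustain_takeover(60, 55))  # combined 115 percent does not
```

Choosing a ceiling below 100 leaves a safety margin for the extra work that takeover itself introduces.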
If your system supports remote management (through a Service Processor), you must configure it properly.
Refer to the System Administration Guide.
You must follow the recommended limits for FlexVol volumes, dense volumes, Snapshot copies, and LUNs to reduce takeover or giveback time.
When adding FlexVol volumes to an HA pair, you should consider testing the takeover and giveback times to verify that they fall within your requirements.
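One way to record such a test is sketched below, under the assumption that you note the start and end timestamps of each operation yourself. The timestamps, the 120-second requirement, and the function names are hypothetical; substitute your own measurements and limits.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def duration_seconds(start, end, fmt=TIME_FORMAT):
    """Return the elapsed seconds between two timestamp strings."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


def within_requirement(measured_seconds, limit_seconds):
    """Return True if the measured duration does not exceed the limit."""
    return measured_seconds <= limit_seconds


# Hypothetical timestamps noted during a planned takeover test.
takeover = duration_seconds("2024-01-01 02:00:00", "2024-01-01 02:01:30")
giveback = duration_seconds("2024-01-01 02:30:00", "2024-01-01 02:32:45")

LIMIT = 120  # hypothetical requirement, in seconds
print("takeover ok:", within_requirement(takeover, LIMIT))
print("giveback ok:", within_requirement(giveback, LIMIT))
```

Repeating the measurement after each batch of new volumes shows whether the times are drifting toward your limit.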
For systems using disks, check for failed disks regularly and remove them as soon as possible.
Failed disks can extend the duration of takeover operations or prevent giveback operations.
Refer to the Disk and Aggregate Management Guide.
A multipath HA connection is required on all HA pairs.
To receive prompt notification if the takeover capability becomes disabled, you should configure your system to enable automatic email notification for the "takeover impossible" EMS messages:
- ha.takeoverImpVersion
- ha.takeoverImpLowMem
- ha.takeoverImpDegraded
- ha.takeoverImpUnsync
- ha.takeoverImpIC
- ha.takeoverImpHotShelf
- ha.takeoverImpNotDef
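If you also process event names outside the storage system (for example, in a monitoring script), the EMS message names listed above can be kept in one set and used as a filter. This is a minimal sketch: the input is assumed to be a plain list of event-name strings, and the function name and the unrelated sample event are hypothetical. It does not configure email notification on the system itself.

```python
# The seven "takeover impossible" EMS message names from the list above.
TAKEOVER_IMPOSSIBLE_EVENTS = frozenset({
    "ha.takeoverImpVersion",
    "ha.takeoverImpLowMem",
    "ha.takeoverImpDegraded",
    "ha.takeoverImpUnsync",
    "ha.takeoverImpIC",
    "ha.takeoverImpHotShelf",
    "ha.takeoverImpNotDef",
})


def takeover_impossible(events):
    """Return the events that signal disabled takeover, in input order."""
    return [name for name in events if name in TAKEOVER_IMPOSSIBLE_EVENTS]


# Hypothetical stream of event names; the middle entry is a made-up
# placeholder for any unrelated event.
sample = ["ha.takeoverImpIC", "example.unrelated.event", "ha.takeoverImpLowMem"]

for name in takeover_impossible(sample):
    print("ALERT:", name)
```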
Avoid using the -only-cfo-aggregates parameter with the storage failover giveback command.