Verifying healing and manual switchback

You can test the healing and manual switchback operations to verify that data availability is not affected (except for SMB configurations) by switching back the cluster to the original data center after a negotiated switchover.

About this task

This test should take about 30 minutes.

The expected result of this procedure is that services should be switched back to their home nodes.

The healing steps are not required on systems running ONTAP 9.5 or later, on which healing is performed automatically after a negotiated switchover. On systems running ONTAP 9.6 and later, healing is also performed automatically after unscheduled switchover.

If the system is running ONTAP 9.4 or earlier, heal the data aggregate: metrocluster heal aggregates
Example
The following example shows the successful completion of the command:
cluster_A::> metrocluster heal aggregates [Job 936] Job succeeded: Heal Aggregates is successful.
If the system is running ONTAP 9.4 or earlier, heal the root aggregate: metrocluster heal root-aggregates
This step is required on the following configurations:
- MetroCluster FC configurations.
- MetroCluster IP configurations running ONTAP 9.4 or earlier.
Example
The following example shows the successful completion of the command:
cluster_A::> metrocluster heal root-aggregates [Job 937] Job succeeded: Heal Root Aggregates is successful.

Verify that healing is completed: metrocluster node show

Example

The following example shows the successful completion of the command:

cluster_A::> metrocluster node show
DR                               Configuration  DR
Group Cluster Node               State          Mirroring Mode
----- ------- ------------------ -------------- --------- --------------------
1     cluster_A
              node_A_1         configured     enabled   heal roots completed
      cluster_B
              node_B_2         unreachable    -         switched over
42 entries were displayed.metrocluster operation show

Example

If the automatic healing operation fails for any reason, you must issue the metrocluster heal commands manually as done in ONTAP versions prior to ONTAP 9.5. You can use the metrocluster operation show and metrocluster operation history show -instance commands to monitor the status of healing and determine the cause of a failure.

Verify that all aggregates are mirrored: storage aggregate show

Example

The following example shows that all aggregates have a RAID Status of mirrored:

cluster_A::> storage aggregate show
cluster Aggregates:
Aggregate Size     Available Used% State   #Vols  Nodes       RAID Status
--------- -------- --------- ----- ------- ------ ----------- ------------
data_cluster
            4.19TB    4.13TB    2% online       8 node_A_1    raid_dp,
                                                              mirrored,
                                                              normal
root_cluster
           715.5GB   212.7GB   70% online       1 node_A_1    raid4,
                                                              mirrored,
                                                              normal
cluster_B Switched Over Aggregates:
Aggregate Size     Available Used% State   #Vols  Nodes       RAID Status
--------- -------- --------- ----- ------- ------ ----------- ------------
data_cluster_B
            4.19TB    4.11TB    2% online       5 node_A_1    raid_dp,
                                                              mirrored,
                                                              normal
root_cluster_B    -         -     - unknown      - node_A_1   -

Boot nodes from the disaster site.

Check the status of switchback recovery: metrocluster node show

Example

cluster_A::> metrocluster node show
DR                               Configuration  DR
Group Cluster Node               State          Mirroring Mode
----- ------- ------------------ -------------- --------- --------------------
1     cluster_A
             node_A_1            configured     enabled   heal roots completed
      cluster_B
             node_B_2            configured     enabled   waiting for switchback
                                                          recovery
2 entries were displayed.

Perform the switchback: metrocluster switchback

Example

cluster_A::> metrocluster switchback 
[Job 938] Job succeeded: Switchback is successful.Verify switchback

Confirm status of the nodes: metrocluster node show

Example

cluster_A::> metrocluster node show
DR                               Configuration  DR
Group Cluster Node               State          Mirroring Mode
----- ------- ------------------ -------------- --------- --------------------
1     cluster_A
              node_A_1         configured     enabled   normal
      cluster_B
              node_B_2         configured     enabled   normal

2 entries were displayed.

Confirm status of the metrocluster operation: metrocluster operation show

Example

The output should show a successful state.

cluster_A::> metrocluster operation show
  Operation: switchback
      State: successful
 Start Time: 2/6/2016 13:54:25
   End Time: 2/6/2016 13:56:15
     Errors: -

Give documentation feedback