
Replacing a Cisco Nexus 9336C-FX2 cluster switch

Replacing a defective Nexus 9336C-FX2 switch in a cluster network is a nondisruptive procedure (NDU).

The following conditions must exist before performing the switch replacement in the current environment and on the replacement switch.

  • Existing cluster and network infrastructure:
    • The existing cluster must be verified as completely functional, with at least one fully connected cluster switch.
    • All cluster ports must be up.
    • All cluster logical interfaces (LIFs) must be up and on their home ports.
    • The ONTAP cluster ping-cluster -node node1 command must indicate that basic connectivity and larger than PMTU communication are successful on all paths.
  • Nexus 9336C-FX2 replacement switch:
    • Management network connectivity on the replacement switch must be functional.
    • Console access to the replacement switch must be in place.
    • The node connections are ports 1/1 through 1/34.
    • All Inter-Switch Link (ISL) ports must be disabled on ports 1/35 and 1/36 (a sketch of the shutdown commands follows this list).
    • The desired reference configuration file (RCF) and NX-OS operating system image must be loaded onto the switch.
    • Initial customization of the switch must be complete, as detailed in:

      Configuring a new Cisco Nexus 9336C-FX2 switch

      Any previous site customizations, such as STP, SNMP, and SSH, should be copied to the new switch.
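
      ISL ports on the replacement switch can be shut down with NX-OS commands like the following (a minimal sketch; compare step 3, which shuts down the node-facing ports):

      newcs2# configure
      newcs2(config)# interface e1/35-36
      newcs2(config-if-range)# shutdown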

You must execute the command for migrating a cluster LIF from the node where the cluster LIF is hosted.
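
For example, a migrate command of the following form is issued from the hosting node (a sketch using this procedure's example names; adjust the LIF, node, and port for your environment):

cluster1::*> network interface migrate -vserver Cluster -lif node1_clus1 -destination-node node1 -destination-port e0b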

The examples in this procedure use the following switch and node nomenclature:

  • The names of the existing Nexus 9336C-FX2 switches are cs1 and cs2.
  • The name of the new Nexus 9336C-FX2 switch is newcs2.
  • The node names are node1 and node2.
  • The cluster ports on each node are named e0a and e0b.
  • The cluster LIF names are node1_clus1 and node1_clus2 for node1, and node2_clus1 and node2_clus2 for node2.
  • The prompt for changes to all cluster nodes is cluster1::*>
Note
The following procedure is based on the following cluster network topology:
cluster1::*> network port show -ipspace Cluster

Node: node1
Ignore
Speed(Mbps) Health Health
Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status
--------- ------------ ---------------- ---- ---- ----------- -------- ------
e0a Cluster Cluster up 9000 auto/10000 healthy false
e0b Cluster Cluster up 9000 auto/10000 healthy false

Node: node2
Ignore
Speed(Mbps) Health Health
Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status
--------- ------------ ---------------- ---- ---- ----------- -------- ------
e0a Cluster Cluster up 9000 auto/10000 healthy false
e0b Cluster Cluster up 9000 auto/10000 healthy false
4 entries were displayed.



cluster1::*> network interface show -vserver Cluster
Logical Status Network Current Current Is
Vserver Interface Admin/Oper Address/Mask Node Port Home
----------- ---------- ---------- ------------------ ------------- ------- ----
Cluster
node1_clus1 up/up 169.254.209.69/16 node1 e0a true
node1_clus2 up/up 169.254.49.125/16 node1 e0b true
node2_clus1 up/up 169.254.47.194/16 node2 e0a true
node2_clus2 up/up 169.254.19.183/16 node2 e0b true
4 entries were displayed.



cluster1::*> network device-discovery show -protocol cdp
Node/ Local Discovered
Protocol Port Device (LLDP: ChassisID) Interface Platform
----------- ------ ------------------------- ---------------- ----------------
node2 /cdp
e0a cs1 Eth1/2 N9K-C9336C
e0b cs2 Eth1/2 N9K-C9336C
node1 /cdp
e0a cs1 Eth1/1 N9K-C9336C
e0b cs2 Eth1/1 N9K-C9336C
4 entries were displayed.



cs1# show cdp neighbors

Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
S - Switch, H - Host, I - IGMP, r - Repeater,
V - VoIP-Phone, D - Remotely-Managed-Device,
s - Supports-STP-Dispute

Device-ID Local Intrfce Hldtme Capability Platform Port ID
node1 Eth1/1 144 H dm5000h e0a
node2 Eth1/2 145 H dm5000h e0a
cs2 Eth1/35 176 R S I s N9K-C9336C Eth1/35
cs2(FDO220329V5) Eth1/36 176 R S I s N9K-C9336C Eth1/36

Total entries displayed: 4


cs2# show cdp neighbors

Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
S - Switch, H - Host, I - IGMP, r - Repeater,
V - VoIP-Phone, D - Remotely-Managed-Device,
s - Supports-STP-Dispute

Device-ID Local Intrfce Hldtme Capability Platform Port ID
node1 Eth1/1 139 H dm5000h e0b
node2 Eth1/2 124 H dm5000h e0b
cs1 Eth1/35 178 R S I s N9K-C9336C Eth1/35
cs1 Eth1/36 178 R S I s N9K-C9336C Eth1/36

Total entries displayed: 4

  1. If AutoSupport is enabled on this cluster, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=xh

    In MAINT=xh, x is the duration of the maintenance window in hours.

    Note
    The AutoSupport message notifies technical support of this maintenance task so that automatic case creation is suppressed during the maintenance window.
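
    For example, the following command suppresses automatic case creation for two hours (the two-hour window is an assumed value; substitute your own duration):

    cluster1::*> system node autosupport invoke -node * -type all -message MAINT=2h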
  2. Optional: Install the appropriate RCF and image on the switch, newcs2, and make any necessary site preparations.
    If necessary, verify, download, and install the appropriate versions of the RCF and NX-OS software for the new switch. If you have verified that the new switch is correctly set up and does not need updates to the RCF and NX-OS software, continue to step 3.
    1. Go to the DM Series Cluster Switch Reference Configuration File (RCF) Description Page.
    2. Download the correct RCF file for the version of ONTAP software you are installing.
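
    If the switch does need updating, the NX-OS image transfer and installation typically take a form like the following sketch (the server address and file names are placeholders for your environment; the RCF is then applied by copying it into the running configuration):

    newcs2# copy sftp://user@192.168.1.100/nxos.9.3.5.bin bootflash: vrf management
    newcs2# install all nxos bootflash:nxos.9.3.5.bin
    newcs2# copy bootflash:Nexus_9336C_RCF_v1.0.txt running-config echo-commands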
  3. On the new switch, log in as admin and shut down all of the ports that will be connected to the node cluster interfaces (ports 1/1 to 1/34).
    If the switch that you are replacing is not functional and is powered down, go to Step 4. The LIFs on the cluster nodes should have already failed over to the other cluster port for each node.
    newcs2# config
    Enter configuration commands, one per line. End with CNTL/Z.
    newcs2(config)# interface e1/1-34
    newcs2(config-if-range)# shutdown

  4. Verify that all cluster LIFs have auto-revert enabled: network interface show -vserver Cluster -fields auto-revert
    cluster1::> network interface show -vserver Cluster -fields auto-revert

    Logical
    Vserver Interface Auto-revert
    ------------ ------------- -------------
    Cluster node1_clus1 true
    Cluster node1_clus2 true
    Cluster node2_clus1 true
    Cluster node2_clus2 true

    4 entries were displayed.
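
    If any LIF shows false, auto-revert can be enabled with a command of the following form (illustrative, using this procedure's example LIF name):

    cluster1::> network interface modify -vserver Cluster -lif node1_clus1 -auto-revert true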

  5. Verify that all the cluster LIFs can communicate: cluster ping-cluster
    cluster1::*> cluster ping-cluster -node node2

    Host is node2
    Getting addresses from network interface table...
    Cluster node1_clus1 169.254.209.69 node1 e0a
    Cluster node1_clus2 169.254.49.125 node1 e0b
    Cluster node2_clus1 169.254.47.194 node2 e0a
    Cluster node2_clus2 169.254.19.183 node2 e0b
    Local = 169.254.47.194 169.254.19.183
    Remote = 169.254.209.69 169.254.49.125
    Cluster Vserver Id = 4294967293
    Ping status:
    ....
    Basic connectivity succeeds on 4 path(s)
    Basic connectivity fails on 0 path(s)
    ................
    Detected 9000 byte MTU on 4 path(s):
    Local 169.254.47.194 to Remote 169.254.209.69
    Local 169.254.47.194 to Remote 169.254.49.125
    Local 169.254.19.183 to Remote 169.254.209.69
    Local 169.254.19.183 to Remote 169.254.49.125
    Larger than PMTU communication succeeds on 4 path(s)
    RPC status:
    2 paths up, 0 paths down (tcp check)
    2 paths up, 0 paths down (udp check)

  6. Shut down the ISL ports 1/35 and 1/36 on the Nexus 9336C-FX2 switch cs1:
    cs1# configure 
    Enter configuration commands, one per line. End with CNTL/Z.
    cs1(config)# interface e1/35-36
    cs1(config-if-range)# shutdown
    cs1(config-if-range)#

  7. Remove all of the cables from the Nexus 9336C-FX2 cs2 switch, and then connect them to the same ports on the Nexus 9336C-FX2 newcs2 switch.
  8. Bring up ISL ports 1/35 and 1/36 between the cs1 and newcs2 switches, and then verify the port-channel operation status.
    Port-Channel should indicate Po1(SU) and Member Ports should indicate Eth1/35(P) and Eth1/36(P).

    This example enables ISL ports 1/35 and 1/36 and displays the port channel summary on switch cs1:

    cs1# configure
    Enter configuration commands, one per line. End with CNTL/Z.
    cs1(config)# int e1/35-36
    cs1(config-if-range)# no shutdown

    cs1(config-if-range)# show port-channel summary
    Flags: D - Down P - Up in port-channel (members)
    I - Individual H - Hot-standby (LACP only)
    s - Suspended r - Module-removed
    b - BFD Session Wait
    S - Switched R - Routed
    U - Up (port-channel)
    p - Up in delay-lacp mode (member)
    M - Not in use. Min-links not met
    --------------------------------------------------------------------------------
    Group Port- Type Protocol Member Ports
    Channel
    --------------------------------------------------------------------------------
    1 Po1(SU) Eth LACP Eth1/35(P) Eth1/36(P)

    cs1(config-if-range)#

  9. Verify that port e0b is up on all nodes: network port show -ipspace Cluster

    The output should be similar to the following:

    cluster1::*> network port show -ipspace Cluster

    Node: node1
    Ignore
    Speed(Mbps) Health Health
    Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status
    --------- ------------ ---------------- ---- ----- ----------- -------- -------
    e0a Cluster Cluster up 9000 auto/10000 healthy false
    e0b Cluster Cluster up 9000 auto/10000 healthy false

    Node: node2
    Ignore
    Speed(Mbps) Health Health
    Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status
    --------- ------------ ---------------- ---- ----- ----------- -------- -------
    e0a Cluster Cluster up 9000 auto/10000 healthy false
    e0b Cluster Cluster up 9000 auto/auto - false

    4 entries were displayed.
  10. On the same node you used in the previous step, revert the cluster LIF associated with the port in the previous step by using the network interface revert command.
    In this example, LIF node1_clus2 on node1 is successfully reverted if the Home value is true and the port is e0b.

    The following commands return LIF node1_clus2 on node1 to home port e0b and display information about the LIFs on both nodes. Bringing up the first node is successful if the Is Home column is true for both cluster interfaces and they show the correct port assignments, in this example e0a and e0b on node1.
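
    The revert itself takes a form like the following (a sketch using this procedure's example LIF name):

    cluster1::*> network interface revert -vserver Cluster -lif node1_clus2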

    cluster1::*> network interface show -vserver Cluster

    Logical Status Network Current Current Is
    Vserver Interface Admin/Oper Address/Mask Node Port Home
    ----------- ------------ ---------- ------------------ ---------- ------- -----
    Cluster
    node1_clus1 up/up 169.254.209.69/16 node1 e0a true
    node1_clus2 up/up 169.254.49.125/16 node1 e0b true
    node2_clus1 up/up 169.254.47.194/16 node2 e0a true
    node2_clus2 up/up 169.254.19.183/16 node2 e0a false

    4 entries were displayed.

  11. Display information about the nodes in a cluster: cluster show

    This example shows that the node health for node1 and node2 in this cluster is true:

    cluster1::*> cluster show

    Node Health Eligibility
    ------------- ------- ------------
    node1 true true
    node2 true true

  12. Verify that all physical cluster ports are up: network port show -ipspace Cluster
    cluster1::*> network port show -ipspace Cluster

    Node: node1
    Ignore
    Speed(Mbps) Health Health
    Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status
    --------- ----------- ----------------- ----- ----- ----------- -------- ------
    e0a Cluster Cluster up 9000 auto/10000 healthy false
    e0b Cluster Cluster up 9000 auto/10000 healthy false

    Node: node2
    Ignore
    Speed(Mbps) Health Health
    Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status
    --------- ------------ ---------------- ----- ----- ----------- -------- ------
    e0a Cluster Cluster up 9000 auto/10000 healthy false
    e0b Cluster Cluster up 9000 auto/10000 healthy false

    4 entries were displayed.

  13. Verify that all the cluster LIFs can communicate: cluster ping-cluster
    cluster1::*> cluster ping-cluster -node node2
    Host is node2
    Getting addresses from network interface table...
    Cluster node1_clus1 169.254.209.69 node1 e0a
    Cluster node1_clus2 169.254.49.125 node1 e0b
    Cluster node2_clus1 169.254.47.194 node2 e0a
    Cluster node2_clus2 169.254.19.183 node2 e0b
    Local = 169.254.47.194 169.254.19.183
    Remote = 169.254.209.69 169.254.49.125
    Cluster Vserver Id = 4294967293
    Ping status:
    ....
    Basic connectivity succeeds on 4 path(s)
    Basic connectivity fails on 0 path(s)
    ................
    Detected 9000 byte MTU on 4 path(s):
    Local 169.254.47.194 to Remote 169.254.209.69
    Local 169.254.47.194 to Remote 169.254.49.125
    Local 169.254.19.183 to Remote 169.254.209.69
    Local 169.254.19.183 to Remote 169.254.49.125
    Larger than PMTU communication succeeds on 4 path(s)
    RPC status:
    2 paths up, 0 paths down (tcp check)
    2 paths up, 0 paths down (udp check)
  14. Confirm the following cluster network configuration: network port show
    cluster1::*> network port show -ipspace Cluster
    Node: node1
    Ignore
    Speed(Mbps) Health Health
    Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status
    --------- ----------- ---------------- ---- ----- ----------- -------- ------
    e0a Cluster Cluster up 9000 auto/10000 healthy false
    e0b Cluster Cluster up 9000 auto/10000 healthy false

    Node: node2
    Ignore
    Speed(Mbps) Health Health
    Port IPspace Broadcast Domain Link MTU Admin/Oper Status Status
    --------- ------------ ---------------- ---- ---- ----------- -------- ------
    e0a Cluster Cluster up 9000 auto/10000 healthy false
    e0b Cluster Cluster up 9000 auto/10000 healthy false

    4 entries were displayed.


    cluster1::*> network interface show -vserver Cluster

    Logical Status Network Current Current Is
    Vserver Interface Admin/Oper Address/Mask Node Port Home
    ----------- ---------- ---------- ------------------ ------------- ------- ----
    Cluster
    node1_clus1 up/up 169.254.209.69/16 node1 e0a true
    node1_clus2 up/up 169.254.49.125/16 node1 e0b true
    node2_clus1 up/up 169.254.47.194/16 node2 e0a true
    node2_clus2 up/up 169.254.19.183/16 node2 e0b true

    4 entries were displayed.

    cluster1::> network device-discovery show -protocol cdp

    Node/ Local Discovered
    Protocol Port Device (LLDP: ChassisID) Interface Platform
    ----------- ------ ------------------------- ---------------- ----------------
    node2 /cdp
    e0a cs1 0/2 N9K-C9336C
    e0b newcs2 0/2 N9K-C9336C
    node1 /cdp
    e0a cs1 0/1 N9K-C9336C
    e0b newcs2 0/1 N9K-C9336C

    4 entries were displayed.


    cs1# show cdp neighbors

    Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
    S - Switch, H - Host, I - IGMP, r - Repeater,
    V - VoIP-Phone, D - Remotely-Managed-Device,
    s - Supports-STP-Dispute

    Device-ID Local Intrfce Hldtme Capability Platform Port ID
    node1 Eth1/1 144 H dm5000h e0a
    node2 Eth1/2 145 H dm5000h e0a
    newcs2 Eth1/35 176 R S I s N9K-C9336C Eth1/35
    newcs2 Eth1/36 176 R S I s N9K-C9336C Eth1/36

    Total entries displayed: 4


    newcs2# show cdp neighbors

    Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge
    S - Switch, H - Host, I - IGMP, r - Repeater,
    V - VoIP-Phone, D - Remotely-Managed-Device,
    s - Supports-STP-Dispute

    Device-ID Local Intrfce Hldtme Capability Platform Port ID
    node1 Eth1/1 139 H dm5000h e0b
    node2 Eth1/2 124 H dm5000h e0b
    cs1 Eth1/35 178 R S I s N9K-C9336C Eth1/35
    cs1 Eth1/36 178 R S I s N9K-C9336C Eth1/36

    Total entries displayed: 4

  15. For ONTAP 9.8 and later, enable the Ethernet switch health monitor log collection feature for collecting switch-related log files, using the commands system switch ethernet log setup-password and system switch ethernet log enable-collection
    cluster1::*> system switch ethernet log setup-password
    Enter the switch name: <return>
    The switch name entered is not recognized.
    Choose from the following list:
    cs1
    cs2

    cluster1::*> system switch ethernet log setup-password

    Enter the switch name: cs1
    RSA key fingerprint is e5:8b:c6:dc:e2:18:18:09:36:63:d9:63:dd:03:d9:cc
    Do you want to continue? {y|n}: [n] y

    Enter the password: <enter switch password>
    Enter the password again: <enter switch password>

    cluster1::*> system switch ethernet log setup-password

    Enter the switch name: cs2
    RSA key fingerprint is 57:49:86:a1:b9:80:6a:61:9a:86:8e:3c:e3:b7:1f:b1
    Do you want to continue? {y|n}: [n] y

    Enter the password: <enter switch password>
    Enter the password again: <enter switch password>

    cluster1::*> system switch ethernet log enable-collection

    Do you want to enable cluster log collection for all nodes in the cluster?
    {y|n}: [n] y

    Enabling cluster switch log collection.

    cluster1::*>

    Note
    If any of these commands return an error, contact Lenovo technical support.
  16. For ONTAP releases 9.5P16, 9.6P12, and 9.7P10 and later patch releases, enable the Ethernet switch health monitor log collection feature for collecting switch-related log files, using the commands system cluster-switch log setup-password and system cluster-switch log enable-collection
    cluster1::*> system cluster-switch log setup-password
    Enter the switch name: <return>
    The switch name entered is not recognized.
    Choose from the following list:
    cs1
    cs2

    cluster1::*> system cluster-switch log setup-password

    Enter the switch name: cs1
    RSA key fingerprint is e5:8b:c6:dc:e2:18:18:09:36:63:d9:63:dd:03:d9:cc
    Do you want to continue? {y|n}: [n] y

    Enter the password: <enter switch password>
    Enter the password again: <enter switch password>

    cluster1::*> system cluster-switch log setup-password

    Enter the switch name: cs2
    RSA key fingerprint is 57:49:86:a1:b9:80:6a:61:9a:86:8e:3c:e3:b7:1f:b1
    Do you want to continue? {y|n}: [n] y

    Enter the password: <enter switch password>
    Enter the password again: <enter switch password>

    cluster1::*> system cluster-switch log enable-collection

    Do you want to enable cluster log collection for all nodes in the cluster?
    {y|n}: [n] y

    Enabling cluster switch log collection.

    cluster1::*>

    Note
    If any of these commands return an error, contact Lenovo technical support.
  17. If you suppressed automatic case creation, re-enable it by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=END