How to use VAST vNetMap


Purpose

This tool helps you verify that every CNode and DNode is cabled to the correct switch, on both Ethernet and InfiniBand (IB) clusters.

Run it from the VMS CNode, which ensures SSH-key access to all nodes.

Latest Script Version

Latest Version Number: v2.0

Latest Version Release Date: Feb 28th, 2024

Pre-requisites

  1. Configure_network must have already been run on all nodes

  2. Switches are powered up

  3. Supported switch types: Mellanox, Aruba

  4. There is management (MGMT) connectivity between the CNodes and the switches

  5. An SSH key is available if clush is not configured yet (default: /home/vastdata/.ssh/id_rsa)

How to Run the Script

  • vnetmap is available on VAST OS nodes 

  • The latest version can be found attached to this article

  • vnetmap needs the VAST internal network management IPs. SSH to a VAST CNode and run:

cnodes_ips=$(clush -g cnodes echo | awk -F ':' '{print $1}' | paste -sd ',' -)
dnodes_ips=$(clush -g dnodes echo | awk -F ':' '{print $1}' | paste -sd ',' -)

NOTE: Running with the -discover flag makes vnetmap try to discover the nodes from the clush configuration file on the local node.

  • The IPs can also be found on each node by looking for the 69:m label on the mgmt bond

~:$ ip a | grep 69:m
    inet 10.10.128.32/18 brd 10.10.191.255 scope global bond0.69:m

Then export the lists manually, for example:

export cnodes_ips=$(echo 10.10.128.{1..20} | sed 's/ /,/g')
export dnodes_ips=$(echo 10.10.128.{100..109} | sed 's/ /,/g')
# Only needed for ETH; for IB the switches are auto-discovered
export SWITCH_IPS="10.255.255.253,10.255.255.252"
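
The two manual steps above (reading the 69:m address from `ip a` and joining a range of addresses into a comma-separated list) can also be sketched in Python. This is a minimal illustration and not part of vnetmap; the helper names `mgmt_ips` and `ip_range` are hypothetical, and the addresses are the sample values used in this article.

```python
def mgmt_ips(ip_a_output, label="69:m"):
    """Return the IPv4 addresses whose interface label ends with `label`.

    Assumes `ip a` prints lines such as:
        inet 10.10.128.32/18 brd 10.10.191.255 scope global bond0.69:m
    """
    found = []
    for line in ip_a_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "inet" and fields[-1].endswith(label):
            # fields[1] is "address/prefixlen"; keep only the address part.
            found.append(fields[1].split("/")[0])
    return found


def ip_range(prefix, first, last):
    """Join prefix.first .. prefix.last into one comma-separated string."""
    if not 0 <= first <= last <= 255:
        raise ValueError("host range must satisfy 0 <= first <= last <= 255")
    return ",".join(f"{prefix}.{host}" for host in range(first, last + 1))


sample = "    inet 10.10.128.32/18 brd 10.10.191.255 scope global bond0.69:m"
print(mgmt_ips(sample))                 # ['10.10.128.32']
print(ip_range("10.10.128", 100, 102))  # 10.10.128.100,10.10.128.101,10.10.128.102
```

The resulting strings have the same shape as the `cnodes_ips` and `dnodes_ips` variables and can be passed to the `-i` option.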
  • Run the script:

ETH:
python3 vnetmap.py -s $SWITCH_IPS \
   -i $cnodes_ips,$dnodes_ips \
   -u admin \
   -p admin \
   -k /home/vastdata/.ssh/id_rsa

IB:
python3 vnetmap.py -i $cnodes_ips,$dnodes_ips \
   -k /home/vastdata/.ssh/id_rsa -ib
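
When the script is launched from automation, the two invocations above can be assembled programmatically. The following is a sketch under stated assumptions: `build_vnetmap_cmd` is a hypothetical helper (not shipped with vnetmap), it uses only the options shown in the ETH and IB examples, and it only builds the argument list without running anything.

```python
import shlex

DEFAULT_KEY = "/home/vastdata/.ssh/id_rsa"  # default key path named in this article


def build_vnetmap_cmd(host_ips, switch_ips=None, user=None, password=None,
                      ssh_key=DEFAULT_KEY, infiniband=False):
    """Return the argv list for an ETH or IB vnetmap run."""
    hosts = ",".join(host_ips)
    cmd = ["python3", "vnetmap.py"]
    if infiniband:
        # IB example: switches are auto-discovered, so no -s/-u/-p.
        cmd += ["-i", hosts, "-k", ssh_key, "-ib"]
        return cmd
    # ETH example: switch IPs and switch user are supplied explicitly.
    if not switch_ips or not user:
        raise ValueError("ETH runs need switch_ips and user")
    cmd += ["-s", ",".join(switch_ips), "-i", hosts, "-u", user]
    if password is not None:
        cmd += ["-p", password]
    cmd += ["-k", ssh_key]
    return cmd


eth = build_vnetmap_cmd(
    ["10.10.128.1", "10.10.128.100"],
    switch_ips=["10.255.255.253", "10.255.255.252"],
    user="admin",
    password="admin",
)
# Print a shell-safe version of the command for review before running it.
print(" ".join(shlex.quote(arg) for arg in eth))
```

Returning a list rather than a single string lets the caller pass it straight to `subprocess.run` without shell quoting issues.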

 

All options

$ python3 vnetmap.py --help
usage: vnetmap.py [-h] -s SWITCH_IPS -i HOST_IPS -u USER [-p PASSWORD] [-k SSH_KEY] [-eth] [-ib] [-no-mtu] [-upload] [-subsystem] [-discover] [-no-diag] [--compact-output] [-d]
                  [--multiple-passwords] [--multiple-logins] [--version]

optional arguments:
  -h, --help            show this help message and exit
  -s SWITCH_IPS, --switch-ips SWITCH_IPS
                        switch ips
  -i HOST_IPS, --host-ips HOST_IPS
                        host ips
  -u USER, -user USER   user for switch
  -p PASSWORD, -password PASSWORD
                        password for switch
  -k SSH_KEY, --ssh-key SSH_KEY
                        ssh key to use
  -eth                  Force check eth network instead of discovering network type
  -ib                   Force check ib network instead of discovering network type
  -no-mtu, --no-mtu-check
                        check MTU for each internal IP\Interface
  -upload, --upload-s3  upload mapping file to vast
  -subsystem, --subsystem-breakdown
                        print output per subsystem - for large scale clusters
  -discover, --discover-nodes
                        auto discover nodes based on clush config
  -no-diag, --no-diag-network
                        creates a report to diagnos fabric issue using -no-diag disables this check
  --compact-output      Output just which nodes are connected to which switch
  -d, --debug           for the impatient..print every step.
  --multiple-passwords  For switches using different passwords
  --multiple-logins     For switches using different passwords and usernames (Will ask for both for each switch)
  --version, -v         show program's version number and exit

The output should look something like this:

Full topology
csmkfs-stable-cn3         10.27.70.80     Eth1/9/2     172.16.1.3      enp94s0f0    1c:34:da:57:6d:d4
csmkfs-stable-cn3         10.27.70.80     Eth1/9/1     172.16.2.3      enp94s0f1    1c:34:da:57:6d:d5
csmkfs-stable-dn101       10.27.70.80     Eth1/2       172.16.1.101    ens14f0      98:03:9b:97:15:1e
csmkfs-stable-dn100       10.27.70.80     Eth1/1       172.16.1.100    ens14f0      98:03:9b:97:15:5e
Connectivity issue detected, switch 10.27.70.80 has more then one internal network
Switch 10.27.70.80 has {'172.16.1', '172.16.2'}, should be only one internal subnet network below
csmkfs-stable-cn3         Eth1/9/2     172.16.1.3      enp94s0f0    1c:34:da:57:6d:d4
csmkfs-stable-cn3         Eth1/9/1     172.16.2.3      enp94s0f1    1c:34:da:57:6d:d5
csmkfs-stable-dn101       Eth1/2       172.16.1.101    ens14f0      98:03:9b:97:15:1e
csmkfs-stable-dn100       Eth1/1       172.16.1.100    ens14f0      98:03:9b:97:15:5e
2022-03-26 12:22:33,308 (P9561) {INFO} [vnetmap.py:139] Grabbing extanded info for switch: 10.27.70.82
2022-03-26 12:22:46,793 (P9561) {INFO} [vnetmap.py:139] Grabbing extanded info for switch: 10.27.70.80
2022-03-26 12:22:59,802 (P9561) {INFO} [vnetmap.py:139] Grabbing extanded info for switch: 10.27.70.81
2022-03-26 12:23:13,835 (P9561) {INFO} [vnetmap.py:168] Creating fabric report
2022-03-26 12:23:14,022 (P9561) {INFO} [vnetmap.py:311] Saving vnetmap output...
2022-03-26 12:23:14,037 (P9561) {INFO} [vnetmap.py:321] Output saved to /vast/log/vnetmap-csmkfs-stable-cn3-2022-03-26T12:23:14.txt
2022-03-26 12:23:14,038 (P9561) {INFO} [vnetmap.py:323] Getting info for upload...
2022-03-26 12:23:15,073 (P9561) {INFO} [vnetmap.py:325] Uploading vnetmap output to Vast support...
2022-03-26 12:23:22,374 (P9561) {INFO} [vnetmap.py:333] Uploaded vnetmap output to s3://vast-support/Customers/CS/KFS-STABLE/vnetmap/vnetmap-csmkfs-stable-cn3-2022-03-26T12:23:14.txt
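If the run reports no "Connectivity issue detected" message, the cabling is correct.
If something is wrong, the section after the full topology describes the conflict. In the sample above, switch 10.27.70.80 sees two internal subnets (172.16.1 and 172.16.2) where it should see only one.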
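
The conflict rule in the sample (one switch should see only one internal subnet) can be illustrated with a short Python sketch that re-checks saved "Full topology" rows. It is not vnetmap's own implementation. It assumes each row has six whitespace-separated columns (hostname, switch IP, switch port, node internal IP, interface, MAC) and that the internal subnet is the first three octets of the node IP, as in the sample message. The function names are hypothetical.

```python
import ipaddress
from collections import defaultdict


def _is_ip(text):
    """True if `text` parses as an IP address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def subnets_per_switch(topology_text):
    """Map each switch IP to the set of internal subnets seen on it."""
    subnets = defaultdict(set)
    for line in topology_text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip headers, log lines, and other non-row lines
        _host, switch_ip, _port, node_ip, _iface, _mac = fields
        if not (_is_ip(switch_ip) and _is_ip(node_ip)):
            continue
        # Assumption: the subnet is the first three octets, e.g. "172.16.1".
        subnets[switch_ip].add(node_ip.rsplit(".", 1)[0])
    return dict(subnets)


def conflicting_switches(topology_text):
    """Return only the switches that see more than one internal subnet."""
    return {sw: nets for sw, nets in subnets_per_switch(topology_text).items()
            if len(nets) > 1}


sample = """Full topology
csmkfs-stable-cn3         10.27.70.80     Eth1/9/2     172.16.1.3      enp94s0f0    1c:34:da:57:6d:d4
csmkfs-stable-cn3         10.27.70.80     Eth1/9/1     172.16.2.3      enp94s0f1    1c:34:da:57:6d:d5
csmkfs-stable-dn101       10.27.70.80     Eth1/2       172.16.1.101    ens14f0      98:03:9b:97:15:1e
"""
for switch, nets in conflicting_switches(sample).items():
    print(switch, sorted(nets))  # 10.27.70.80 ['172.16.1', '172.16.2']
```

An empty result from `conflicting_switches` corresponds to the healthy case, where every switch carries a single internal subnet.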

The vnetmap report includes the following by default:

  • Mapping of the internal network

  • Per-switch breakdown of connected CNodes, DNodes, and subsystems

  • Per-switch extended info breakdown, which includes:

    • Switch overview:

      • show version

      • show running-config

    • MLAG INFO:

      • show mlag

      • show mlag-vip

      • show mlag statistics

    • LLDP remote:

      • show lldp remote

    • MLAG-PORT-CHANNEL:

      • show int mlag-port-channel

      • show int mlag-port-channel summary

    • PORT-CHANNEL:

      • show int port-channel

      • show int port-channel summary

    • ETH Interfaces INFO:

      • show int ethernet status

      • show int ethernet description

      • show int ethernet link-diagnostics

      • show int ethernet capabilities

    • Transceivers INFO:

      • show int ethernet transceiver

      • show int ethernet transceiver brief

      • show int ethernet transceiver diagnostics

    • Rates and counters sample:

      • show int ethernet rates

      • show int ethernet counters

Download vnetmap.py
