Purpose
This tool verifies that the CNodes and DNodes are cabled to the switches correctly, for both Ethernet and InfiniBand (IB) clusters.
Run it from the VMS CNode so that all nodes are reachable using SSH keys.
Latest Script Version
Latest Version Number: v2.0
Latest Version Release Date: Feb 28th, 2024
Pre-requisites
Configure_network must have already been run on all nodes
Switches are powered up
Supported switch types: Mellanox, Aruba
MGMT connectivity exists between the CNodes and the switches
An SSH key is available if clush is not configured yet (default: /home/vastdata/.ssh/id_rsa)
How to Run the Script
vnetmap is available on VAST OS nodes
The latest version can be found attached to this article
SSH to a VAST CNode and provide the VAST internal network mgmt IPs. If clush is configured, they can be collected as follows:
cnodes_ips=$(clush -g cnodes echo | awk -F ':' '{print $1}' | paste -sd ',' -)
dnodes_ips=$(clush -g dnodes echo | awk -F ':' '{print $1}' | paste -sd ',' -)
NOTE: Running with the -discover flag will try to discover the nodes based on the local node's clush configuration file.
The IPs can be found on each node by looking for the 69:m label on the mgmt bond:
~:$ ip a | grep 69:m
inet 10.10.128.32/18 brd 10.10.191.255 scope global bond0.69:m
The lists can also be set manually:
export cnodes_ips=`echo 10.10.128.{1..20} | sed 's/ /,/g'`
export dnodes_ips=`echo 10.10.128.{100..109} | sed 's/ /,/g'`
export SWITCH_IPS="10.255.255.253,10.255.255.252"  # Only needed for ETH; for IB the switches are auto-discovered
Run the script:
ETH:
python3 vnetmap.py -s $SWITCH_IPS \
-i $cnodes_ips,$dnodes_ips \
-u admin \
-p admin \
-k /home/vastdata/.ssh/id_rsa
IB:
python3 vnetmap.py -i $cnodes_ips,$dnodes_ips \
-k /home/vastdata/.ssh/id_rsa -ib
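The ETH and IB invocations above differ only in the switch arguments and the -ib flag. As a minimal sketch (not part of vnetmap itself), the Python helper below expands an IP range the way the shell brace expansion does and assembles the matching argument list. The names expand_range and build_vnetmap_args are hypothetical, and only flags that appear in this article are used.

```python
import ipaddress


def expand_range(prefix, start, end):
    """Return prefix.start .. prefix.end as a list of validated IPv4 strings."""
    ips = [f"{prefix}.{last}" for last in range(start, end + 1)]
    for ip in ips:
        ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    return ips


def build_vnetmap_args(host_ips, ssh_key, switch_ips=None, user=None,
                       password=None, ib=False):
    """Assemble an argument list for vnetmap.py (hypothetical helper)."""
    args = ["python3", "vnetmap.py", "-i", ",".join(host_ips), "-k", ssh_key]
    if ib:
        args.append("-ib")  # IB: no switch arguments, as in the IB example
    else:
        if not switch_ips or not user:
            raise ValueError("ETH runs need switch IPs and a switch user")
        args += ["-s", ",".join(switch_ips), "-u", user]
        if password:
            args += ["-p", password]
    return args


# Same ranges and credentials as the manual export example in this article
cnodes = expand_range("10.10.128", 1, 20)
dnodes = expand_range("10.10.128", 100, 109)
eth_args = build_vnetmap_args(cnodes + dnodes, "/home/vastdata/.ssh/id_rsa",
                              switch_ips=["10.255.255.253", "10.255.255.252"],
                              user="admin", password="admin")
print(" ".join(eth_args))
```

The resulting list can be handed to subprocess.run(eth_args, check=True) on a node where vnetmap.py is present; passing ib=True produces the IB form instead.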
All options
$ python3 vnetmap.py --help
usage: vnetmap.py [-h] -s SWITCH_IPS -i HOST_IPS -u USER [-p PASSWORD] [-k SSH_KEY] [-eth] [-ib] [-no-mtu] [-upload] [-subsystem] [-discover] [-no-diag] [--compact-output] [-d]
[--multiple-passwords] [--multiple-logins] [--version]
optional arguments:
-h, --help show this help message and exit
-s SWITCH_IPS, --switch-ips SWITCH_IPS
switch ips
-i HOST_IPS, --host-ips HOST_IPS
host ips
-u USER, -user USER user for switch
-p PASSWORD, -password PASSWORD
password for switch
-k SSH_KEY, --ssh-key SSH_KEY
ssh key to use
-eth Force check eth network instead of discovering network type
-ib Force check ib network instead of discovering network type
-no-mtu, --no-mtu-check
check MTU for each internal IP\Interface
-upload, --upload-s3 upload mapping file to vast
-subsystem, --subsystem-breakdown
print output per subsystem - for large scale clusters
-discover, --discover-nodes
auto discover nodes based on clush config
-no-diag, --no-diag-network
creates a report to diagnos fabric issue using -no-diag disables this check
--compact-output Output just which nodes are connected to which switch
-d, --debug for the impatient..print every step.
--multiple-passwords For switches using different passwords
--multiple-logins For switches using different passwords and usernames (Will ask for both for each switch)
--version, -v show program's version number and exit
The output should look something like this:
Full topology
csmkfs-stable-cn3 10.27.70.80 Eth1/9/2 172.16.1.3 enp94s0f0 1c:34:da:57:6d:d4
csmkfs-stable-cn3 10.27.70.80 Eth1/9/1 172.16.2.3 enp94s0f1 1c:34:da:57:6d:d5
csmkfs-stable-dn101 10.27.70.80 Eth1/2 172.16.1.101 ens14f0 98:03:9b:97:15:1e
csmkfs-stable-dn100 10.27.70.80 Eth1/1 172.16.1.100 ens14f0 98:03:9b:97:15:5e
Connectivity issue detected, switch 10.27.70.80 has more then one internal network
Switch 10.27.70.80 has {'172.16.1', '172.16.2'}, should be only one internal subnet network below
csmkfs-stable-cn3 Eth1/9/2 172.16.1.3 enp94s0f0 1c:34:da:57:6d:d4
csmkfs-stable-cn3 Eth1/9/1 172.16.2.3 enp94s0f1 1c:34:da:57:6d:d5
csmkfs-stable-dn101 Eth1/2 172.16.1.101 ens14f0 98:03:9b:97:15:1e
csmkfs-stable-dn100 Eth1/1 172.16.1.100 ens14f0 98:03:9b:97:15:5e
2022-03-26 12:22:33,308 (P9561) {INFO} [vnetmap.py:139] Grabbing extanded info for switch: 10.27.70.82
2022-03-26 12:22:46,793 (P9561) {INFO} [vnetmap.py:139] Grabbing extanded info for switch: 10.27.70.80
2022-03-26 12:22:59,802 (P9561) {INFO} [vnetmap.py:139] Grabbing extanded info for switch: 10.27.70.81
2022-03-26 12:23:13,835 (P9561) {INFO} [vnetmap.py:168] Creating fabric report
2022-03-26 12:23:14,022 (P9561) {INFO} [vnetmap.py:311] Saving vnetmap output...
2022-03-26 12:23:14,037 (P9561) {INFO} [vnetmap.py:321] Output saved to /vast/log/vnetmap-csmkfs-stable-cn3-2022-03-26T12:23:14.txt
2022-03-26 12:23:14,038 (P9561) {INFO} [vnetmap.py:323] Getting info for upload...
2022-03-26 12:23:15,073 (P9561) {INFO} [vnetmap.py:325] Uploading vnetmap output to Vast support...
2022-03-26 12:23:22,374 (P9561) {INFO} [vnetmap.py:333] Uploaded vnetmap output to s3://vast-support/Customers/CS/KFS-STABLE/vnetmap/vnetmap-csmkfs-stable-cn3-2022-03-26T12:23:14.txt
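The subnet conflict reported in the sample above can be reproduced offline from the "Full topology" rows. The sketch below is not vnetmap's own code: it assumes each row has the six whitespace-separated columns shown in the sample (hostname, switch IP, switch port, internal IP, interface, MAC) and that the internal subnet is the first three octets of the internal IP. The function names are hypothetical.

```python
from collections import defaultdict

# "Full topology" rows copied from the sample output above
SAMPLE = """\
csmkfs-stable-cn3 10.27.70.80 Eth1/9/2 172.16.1.3 enp94s0f0 1c:34:da:57:6d:d4
csmkfs-stable-cn3 10.27.70.80 Eth1/9/1 172.16.2.3 enp94s0f1 1c:34:da:57:6d:d5
csmkfs-stable-dn101 10.27.70.80 Eth1/2 172.16.1.101 ens14f0 98:03:9b:97:15:1e
csmkfs-stable-dn100 10.27.70.80 Eth1/1 172.16.1.100 ens14f0 98:03:9b:97:15:5e
"""


def subnets_per_switch(topology_text):
    """Map each switch IP to the set of internal subnet prefixes seen on it."""
    seen = defaultdict(set)
    for line in topology_text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip anything that is not a 6-column topology row
        _host, switch_ip, _port, internal_ip, _iface, _mac = fields
        # Assumption: the subnet is the first three octets, e.g. 172.16.1
        seen[switch_ip].add(internal_ip.rsplit(".", 1)[0])
    return dict(seen)


def miscabled_switches(topology_text):
    """Return only the switches that see more than one internal subnet."""
    return {sw: nets for sw, nets in subnets_per_switch(topology_text).items()
            if len(nets) > 1}


for switch, nets in miscabled_switches(SAMPLE).items():
    print(f"switch {switch} sees {sorted(nets)}")
```

For the sample rows this prints `switch 10.27.70.80 sees ['172.16.1', '172.16.2']`, matching the conflict vnetmap reports: port enp94s0f1 of csmkfs-stable-cn3 is cabled to the switch that serves the 172.16.1 network.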
If the output reports no connectivity issue, all is well, and we can go to our happy place.
If it is NOT well, the bottom section of the output tells you there is a conflict, as in the sample output above, where switch 10.27.70.80 sees two internal subnets.
Vnetmap report includes the following by default
Mapping of the internal network
Per switch breakdown of connected CNodes\DNodes\subsystem
Per switch extended info breakdown, which includes:
Switch overview:
show version
show running-config
MLAG INFO:
show mlag
show mlag-vip
show mlag statistics
LLDP remote:
show lldp remote
MLAG-PORT-CHANNEL:
show int mlag-port-channel
show int mlag-port-channel summary
PORT-CHANNEL:
show int port-channel
show int port-channel summary
ETH Interfaces INFO:
show int ethernet status
show int ethernet description
show int ethernet link-diagnostics
show int ethernet capabilities
Transceivers INFO:
show int ethernet transceiver
show int ethernet transceiver brief
show int ethernet transceiver diagnostics
Rates and counters sample:
show int ethernet rates
show int ethernet counters
Download vnetmap.py