This document provides a clear set of instructions for setup and connectivity from a Linux client (Rocky-based) to the VAST Data cluster block subsystem.
VAST best practice requires that all networking layers between clients and storage be configured with MTU 9000 (jumbo frames).
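To verify and, if needed, apply MTU 9000 on the client, a minimal sketch (assuming the storage-facing interface is named eno1 and is managed by NetworkManager, as on a typical Rocky install; substitute your own interface/connection name):
ip link show eno1 | grep mtu
sudo ip link set dev eno1 mtu 9000                          # applies for the current session only
sudo nmcli connection modify eno1 802-3-ethernet.mtu 9000   # persists the setting
sudo nmcli connection up eno1
The jumbo-frame path can then be confirmed end to end with a non-fragmenting ping (8972 bytes of payload + 28 bytes of headers = 9000; <vip_ip> is a VIP of the VAST cluster, as used in section 3.1):
ping -M do -s 8972 <vip_ip>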
1. Install Required Tools
NVMe CLI Tool
To interact with NVMe devices, install the NVMe CLI tool:
sudo yum install nvme-cli
After installation, you can use commands such as:
• nvme discover: Discover NVMe targets.
• nvme connect: Connect to NVMe-over-Fabrics subsystems.
• nvme disconnect: Disconnect NVMe devices.
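For example, a quick check that the package installed correctly:
nvme version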
2. Get the Client NQN for Creating a Host on VAST
To add the host to the VAST cluster, you will need the host NQN, which can be found using the command:
cat /etc/nvme/hostnqn
This NQN needs to be configured in the VAST cluster's host properties.
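If /etc/nvme/hostnqn does not exist (on many distributions it is created when the nvme-cli package is installed), you can generate one; a minimal sketch:
nvme gen-hostnqn | sudo tee /etc/nvme/hostnqn
cat /etc/nvme/hostnqn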
3. Connecting to the VAST Cluster
Load Kernel Modules
Load the necessary kernel modules to enable NVMe over Fabrics:
sudo modprobe nvme
sudo modprobe nvme-fabrics
To load the modules automatically:
Create a file /etc/modules-load.d/nvme.conf with the following content:
[root@KVMHOST modules-load.d]# cat /etc/modules-load.d/nvme.conf
nvme
nvme-fabrics
And execute:
sudo dracut -f
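To confirm the modules are present after loading (or after a reboot), for example:
lsmod | grep nvme
On systems where the drivers are built as modules, the output should include nvme, nvme_core and nvme_fabrics (nvme_tcp is loaded automatically once a TCP connection is made).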
3.1 NVMe Subsystems Discovery
To discover available NVMe subsystems over TCP, use the following command:
sudo nvme discover -t tcp -a <vip_ip> -s 4420
• Replace <vip_ip> with the IP address of the VAST cluster.
3.2 Connect to an NVMe Subsystem
After discovering an NVMe subsystem, establish a connection in two steps.
First, run the command below to establish the connection:
sudo nvme connect -t tcp -n <Vast_NQN> -a <vip_ip>
• Replace <Vast_NQN> with the discovered NQN.
• Replace <vip_ip> with the IP address of the VAST cluster.
Then run the connect-all command to connect all paths:
sudo nvme connect-all -t tcp -a <vip_ip> -s 4420
From this point on, if you want to update the volume mapping (e.g. new volumes were added or removed), just run the connect-all command:
sudo nvme connect-all
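To have the paths reconnect automatically after a reboot, one option (an assumption about your nvme-cli packaging, not a VAST-specific requirement) is to record the discovery endpoint in /etc/nvme/discovery.conf, which is what a bare nvme connect-all reads, and enable the autoconnect service shipped with nvme-cli:
echo "--transport=tcp --traddr=<vip_ip> --trsvcid=4420" | sudo tee -a /etc/nvme/discovery.conf
sudo systemctl enable --now nvmf-autoconnect.service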
3.3 List Available Mapped Volumes
To see a list of connected NVMe volumes:
sudo nvme list
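To cross-check against the block layer (for example, before creating a filesystem), the same namespaces should also appear as block devices:
lsblk -d -o NAME,SIZE,MODEL | grep nvme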
3.4 List NVMe Controllers and Paths
To list all connected NVMe subsystems and their paths:
sudo nvme list-subsys
This provides details about each connected subsystem, including its controller paths, I/O policy, and model.
You should see output similar to the following:
[vastdata@localhost ~]$ sudo nvme list-subsys
nvme-subsys0 - NQN=nqn.2024-08.com.vastdata:ef992044-0c8e-557a-a629-4d3c9abd9f9d:default:subsystem-3
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:27590942-6282-d720-eae9-fdd2d81355d4
               iopolicy=round-robin
\
 +- nvme9 tcp traddr=172.27.133.9,trsvcid=4420 live
 +- nvme8 tcp traddr=172.27.133.8,trsvcid=4420 live
 +- nvme7 tcp traddr=172.27.133.7,trsvcid=4420 live
 +- nvme6 tcp traddr=172.27.133.6,trsvcid=4420 live
 +- nvme5 tcp traddr=172.27.133.5,trsvcid=4420 live
 +- nvme4 tcp traddr=172.27.133.4,trsvcid=4420 live
 +- nvme3 tcp traddr=172.27.133.3,trsvcid=4420 live
 +- nvme2 tcp traddr=172.27.133.2,trsvcid=4420 live
 +- nvme16 tcp traddr=172.27.133.16,trsvcid=4420 live
 +- nvme15 tcp traddr=172.27.133.15,trsvcid=4420 live
 +- nvme14 tcp traddr=172.27.133.14,trsvcid=4420 live
 +- nvme13 tcp traddr=172.27.133.13,trsvcid=4420 live
 +- nvme12 tcp traddr=172.27.133.12,trsvcid=4420 live
 +- nvme11 tcp traddr=172.27.133.11,trsvcid=4420 live
 +- nvme10 tcp traddr=172.27.133.10,trsvcid=4420 live
 +- nvme0 tcp traddr=172.27.133.1,trsvcid=4420 live
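The round-robin policy shown above relies on native NVMe multipath. As an optional sanity check (assuming your kernel exposes the parameter), confirm it is enabled:
cat /sys/module/nvme_core/parameters/multipath
Expected output: Y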
3.5 Disconnect Existing Connections
sudo nvme disconnect-all
4. Create Transport Rules
Create a new file:
sudo vi /lib/udev/rules.d/71-nvmf-vastdata.rules
Put this content in the file:
(VAST 5.3.0 – 5.3.2)
# Enable round-robin for Vast Data Block Controller
ACTION=="add|change", SUBSYSTEM=="nvme-subsystem", ATTR{subsystype}=="nvm", ATTR{model}=="VastData", RUN+="/bin/sh -c 'echo round-robin > /sys/class/nvme-subsystem/%k/iopolicy'"
Run:
sudo udevadm control --reload-rules
sudo udevadm trigger
(VAST 5.3.2 and above)
# Enable round-robin for Vast Data Block Controller
ACTION=="add|change", SUBSYSTEM=="nvme-subsystem",
ATTR{subsystype}=="nvm", ATTR{model}=="VASTData", RUN+="/bin/sh -c
'echo round-robin > /sys/class/nvme-subsystem/%k/iopolicy'"Run:sudo udevadm control --reload-rulessudo udevadm trigger
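After reloading the rules and triggering (or reconnecting), you can verify that the policy was applied, for example:
cat /sys/class/nvme-subsystem/nvme-subsys*/iopolicy
Each connected subsystem should report round-robin, and sudo nvme list-subsys should show iopolicy=round-robin as in the earlier example.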
Ensure HA Is Working Properly
To make sure High Availability works as expected, the nvme_core.max_retries parameter should be set to a non-zero value; the recommended value is 5.
To check and, if necessary, set max_retries=5, first query the current value by running:
cat /sys/module/nvme_core/parameters/max_retries
or
grep . /sys/module/nvme_core/parameters/*
If the value is 0, you want to set and persist nvme_core.max_retries=5.
Persistent Method (Recommended): /etc/modprobe.d/
Create or edit the config file:
sudo nano /etc/modprobe.d/nvme_core.conf
Add this line:
options nvme_core max_retries=5
Update the initramfs (so the setting applies at boot):
On Ubuntu / Debian:
sudo update-initramfs -u
On RHEL / CentOS / Fedora:
sudo dracut -f
Reboot:
sudo reboot
Verify the new value:
cat /sys/module/nvme_core/parameters/max_retries
Expected output: 5
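If you need the new value without waiting for a reboot, and your kernel exposes the parameter as writable (an assumption; check the file permissions first), it can also be set at runtime:
echo 5 | sudo tee /sys/module/nvme_core/parameters/max_retries
The modprobe.d entry above is still required for the value to survive a reboot.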
If your NVMe driver is built into the kernel (not a module)
If /sys/module/nvme_core exists → you’re good.
If not, set it via GRUB:
Edit GRUB:
sudo nano /etc/default/grub
Add nvme_core.max_retries=5 to GRUB_CMDLINE_LINUX_DEFAULT:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvme_core.max_retries=5"
Update GRUB:
Ubuntu/Debian:
sudo update-grub
RHEL/CentOS:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
Reboot and verify again:
cat /sys/module/nvme_core/parameters/max_retries
5. Troubleshooting
Common Issues and Solutions
5.1 NVMe Subsystem Not Found
• Cause: Incorrect IP address or network issue.
• Solution: Verify that <vip_ip> is correct and that the host has network connectivity to the VAST cluster.
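For example, a quick reachability check from the client (nc may need to be installed separately, e.g. via the nmap-ncat package on Rocky):
ping -c 3 <vip_ip>
nc -zv <vip_ip> 4420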
5.2 NVMe Device Not Appearing
• Cause: NVMe connection not established or missing kernel modules.
• Solution: Ensure kernel modules are loaded using:
sudo modprobe nvme
sudo modprobe nvme-fabrics
5.3 Logs and Diagnostics
• Use dmesg to check kernel logs for errors related to NVMe:
dmesg | grep nvme
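• To watch for errors live while connecting or failing over a path, one option (a suggestion, not a required step) is to follow the kernel log or query the journal:
sudo dmesg -w | grep -i nvme
journalctl -k -b | grep -i nvme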