NVMe-TCP Block Device Client Configuration


This document provides instructions for setting up connectivity from a Linux client (Rocky Linux based) to the block subsystem of a VAST Data cluster.
VAST best practice requires that all networking layers between clients and storage be configured with MTU 9000.
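As a quick sanity check, you can verify and set the MTU on the client interface (a minimal sketch; eth0 is a placeholder for your actual NIC name):
ip link show eth0 | grep mtu
sudo ip link set dev eth0 mtu 9000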

1. Install Required Tools

NVMe CLI Tool
To interact with NVMe devices, install the NVMe CLI tool:
sudo yum install nvme-cli
After installation, you can use commands such as:
nvme discover: Discover NVMe targets.
nvme connect: Connect to NVMe-over-Fabrics subsystems.
nvme disconnect: Disconnect NVMe devices.
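To confirm the tool installed correctly, check its version (output varies by distribution):
nvme version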

2. Get the Client NQN for Creating a Host on VAST


To add the host to the VAST cluster, you will need the host NQN, which can be found using the
following command:
cat /etc/nvme/hostnqn
This NQN needs to be configured in the VAST cluster's host properties.
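The file contains a single NQN string, for example (illustrative; your UUID will differ):
nqn.2014-08.org.nvmexpress:uuid:27590942-6282-d720-eae9-fdd2d81355d4
If /etc/nvme/hostnqn does not exist, nvme-cli can generate one:
nvme gen-hostnqn | sudo tee /etc/nvme/hostnqn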

3. Connecting to the VAST Cluster


Load Kernel Modules
Load the necessary kernel modules to enable NVMe over Fabrics:
sudo modprobe nvme
sudo modprobe nvme-fabrics


To load the modules automatically at boot:
Create a file /etc/modules-load.d/nvme.conf:
[root@KVMHOST modules-load.d]# cat /etc/modules-load.d/nvme.conf
nvme
nvme-fabrics


And execute:
sudo dracut -f
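To confirm the modules are loaded, a quick check:
lsmod | grep nvme
You should see nvme, nvme_core, and nvme_fabrics listed (nvme_tcp typically appears once a TCP connection is made).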


3.1 NVMe Subsystems Discovery

To discover available NVMe subsystems over TCP, use the following command:
sudo nvme discover -t tcp -a <vip_ip> -s 4420

• Replace <vip_ip> with a virtual IP (VIP) address of the VAST cluster.
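Output similar to the following is expected; the entry below is illustrative (one Discovery Log Entry is returned per available path):
Discovery Log Number of Records 16, Generation counter 5
=====Discovery Log Entry 0======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified
portid:  0
trsvcid: 4420
subnqn:  nqn.2024-08.com.vastdata:ef992044-0c8e-557a-a629-4d3c9abd9f9d:default:subsystem-3
traddr:  172.27.133.1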

3.2 Connect to an NVMe Subsystem

After discovering an NVMe subsystem, establish a connection in two steps.
First, run the command below to establish a connection:
sudo nvme connect -t tcp -n <Vast_NQN> -a <vip_ip>
• Replace <Vast_NQN> with the discovered NQN.
• Replace <vip_ip> with the IP address of the VAST cluster.
Then run the connect-all command to connect all paths:
sudo nvme connect-all -t tcp -a <vip_ip> -s 4420


From this point on, if you want to update the volume mapping (e.g., volumes were added or removed), just run the connect-all command:
sudo nvme connect-all


3.3 List Available Mapped Volumes

To see a list of connected NVMe volumes:
sudo nvme list
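Illustrative output (abridged; serial numbers, sizes, and exact columns vary with nvme-cli version):
Node          SN            Model     Namespace Usage                 Format        FW Rev
------------- ------------- --------- --------- --------------------- ------------- ------
/dev/nvme0n1  <serial>      VastData  1         1.10 TB / 1.10 TB     512 B + 0 B   <rev>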


3.4 List NVMe Controllers and Paths

To list all connected NVMe subsystems, their controllers, and paths:
sudo nvme list-subsys


This provides details about each connected NVMe subsystem, including its NQN, I/O policy, and the state of each path.
You should see output similar to the following:
[vastdata@localhost ~]$ sudo nvme list-subsys
nvme-subsys0 - NQN=nqn.2024-08.com.vastdata:ef992044-0c8e-557a-a629-4d3c9abd9f9d:default:subsystem-3
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:27590942-6282-d720-eae9-fdd2d81355d4
               iopolicy=round-robin
\
+- nvme9 tcp traddr=172.27.133.9,trsvcid=4420 live
+- nvme8 tcp traddr=172.27.133.8,trsvcid=4420 live
+- nvme7 tcp traddr=172.27.133.7,trsvcid=4420 live
+- nvme6 tcp traddr=172.27.133.6,trsvcid=4420 live
+- nvme5 tcp traddr=172.27.133.5,trsvcid=4420 live
+- nvme4 tcp traddr=172.27.133.4,trsvcid=4420 live
+- nvme3 tcp traddr=172.27.133.3,trsvcid=4420 live
+- nvme2 tcp traddr=172.27.133.2,trsvcid=4420 live
+- nvme16 tcp traddr=172.27.133.16,trsvcid=4420 live
+- nvme15 tcp traddr=172.27.133.15,trsvcid=4420 live
+- nvme14 tcp traddr=172.27.133.14,trsvcid=4420 live
+- nvme13 tcp traddr=172.27.133.13,trsvcid=4420 live
+- nvme12 tcp traddr=172.27.133.12,trsvcid=4420 live
+- nvme11 tcp traddr=172.27.133.11,trsvcid=4420 live
+- nvme10 tcp traddr=172.27.133.10,trsvcid=4420 live
+- nvme0 tcp traddr=172.27.133.1,trsvcid=4420 live

3.5 Disconnect Existing Connections

To disconnect all NVMe-oF connections:
sudo nvme disconnect-all
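To disconnect a single subsystem by its NQN instead of all of them:
sudo nvme disconnect -n <Vast_NQN>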

4. Create transport rules

Create a new file:
sudo vi /lib/udev/rules.d/71-nvmf-vastdata.rules
Put the following content in the file:
(VAST 5.3.0 – 5.3.2)

#Enable round-robin for Vast Data Block Controller

ACTION=="add|change", SUBSYSTEM=="nvme-subsystem", ATTR{subsystype}=="nvm", ATTR{model}=="VastData", RUN+="/bin/sh -c 'echo round-robin > /sys/class/nvme-subsystem/%k/iopolicy'"

Run:
sudo udevadm control --reload-rules
sudo udevadm trigger


(VAST 5.3.2 and above)

# Enable round-robin for Vast Data Block Controller

ACTION=="add|change", SUBSYSTEM=="nvme-subsystem", ATTR{subsystype}=="nvm", ATTR{model}=="VASTData", RUN+="/bin/sh -c 'echo round-robin > /sys/class/nvme-subsystem/%k/iopolicy'"

Run:
sudo udevadm control --reload-rules
sudo udevadm trigger
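To verify the rule took effect, read the policy back from sysfs (nvme-subsys0 is an example subsystem name; check yours with nvme list-subsys):
cat /sys/class/nvme-subsystem/nvme-subsys0/iopolicy
Expected output: round-robin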

Ensure HA Is Working Properly

To make sure High Availability works as expected, the nvme_core.max_retries parameter must be set to a non-zero value. The recommended value is 5, which is also the kernel default.

To verify the current value, query the OS by running:

cat /sys/module/nvme_core/parameters/max_retries

or

grep . /sys/module/nvme_core/parameters/*
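Illustrative output when the value is already correct (the grep form prints every nvme_core parameter; only the relevant line is shown here):
/sys/module/nvme_core/parameters/max_retries:5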

 

If the value is 0, set and persist nvme_core.max_retries=5 as follows.

 Persistent Method (Recommended): /etc/modprobe.d/

 Create or edit the config file:

sudo nano /etc/modprobe.d/nvme_core.conf

Add this line:

options nvme_core max_retries=5

Update the initramfs (so the setting applies at boot):

  • On Ubuntu / Debian:
sudo update-initramfs -u
  • On RHEL / CentOS / Fedora / Rocky:
sudo dracut -f

 Reboot:

sudo reboot

Verify the new value:

cat /sys/module/nvme_core/parameters/max_retries

Expected output: 5

If your NVMe driver is built into the kernel (not a module)

If /sys/module/nvme_core exists, the module-based method above applies and you are done.
If it does not, set the parameter on the kernel command line via GRUB:

 Edit GRUB:

sudo nano /etc/default/grub

 Add nvme_core.max_retries=5 to:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvme_core.max_retries=5"

Update GRUB:

  • Ubuntu / Debian:
sudo update-grub
  • RHEL / CentOS / Rocky:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg

 Reboot and verify again:

cat /sys/module/nvme_core/parameters/max_retries
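You can also confirm the parameter reached the kernel command line:
grep -o 'nvme_core.max_retries=[0-9]*' /proc/cmdline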

5. Troubleshooting

Common Issues and Solutions


5.1 NVMe Subsystem Not Found

• Cause: Incorrect IP address or network issue.
• Solution: Verify that <vip_ip> is correct and that the host has network connectivity to the VAST cluster.
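A jumbo-frame-sized ping is a quick way to test both reachability and the MTU 9000 path end to end (8972 bytes of ICMP payload plus 28 bytes of headers fills a 9000-byte frame; -M do forbids fragmentation):
ping -M do -s 8972 -c 3 <vip_ip>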


5.2 NVMe Device Not Appearing

• Cause: NVMe connection not established or missing kernel modules.
• Solution: Ensure kernel modules are loaded using:
sudo modprobe nvme
sudo modprobe nvme-fabrics


5.3 Logs and Diagnostics

• Use dmesg to check kernel logs for errors related to NVMe:
dmesg | grep nvme
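On systems with systemd, the same kernel messages are also available via journalctl (-k restricts output to the kernel ring buffer):
journalctl -k | grep -i nvme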