One of the powers of the VAST architecture is that you can scale performance and storage separately. If you need more performance, you can add more CNodes. This means, however, that achieving peak performance with a VAST Data cluster requires balancing traffic across all CNodes.
ℹ️ Info
Because ENodes operate as both CNodes and DNodes, references to CNodes in this document also apply to ENodes.
CNodes have one or more NICs, and each NIC can have one or more virtual IPs (VIPs) assigned through a VIP pool. The VIPs can move between CNodes if one goes offline, such as during an upgrade or after a failure, without impacting clients. For the best performance, spread traffic across as many VIPs as possible without pinning clients to specific VIPs, since the VIPs will move between CNodes during upgrades.
VAST load-balancing technologies
There are a couple of VAST tools available to balance the load across CNodes. We'll discuss them here and then go into specific suggestions by protocol.
VAST DNS Server
VAST Data clusters can be configured to run an embedded DNS Server, which dynamically returns results based on the IP addresses contained within a VIP pool. By default, this DNS server returns a single IP address from the corresponding VIP pool for each request. The IP address that is returned will be selected in a round-robin fashion. To limit caching by upstream DNS servers, it is configured to use a very low, but not zero, Time-To-Live (TTL).
For many clients and protocols, it is generally beneficial to change this behavior and return multiple (e.g., 8) IP addresses in each response. This can be controlled in VAST 4.7 HF18 and later by setting the following vsettings:
vtool vsettings set VAST_DNS_MULTIPLE_ANSWERS=true
vtool vsettings set VAST_DNS_MAX_ANSWERS=8
Then, restart the VAST DNS Server by disabling and re-enabling it for this to take effect.
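To confirm the new behavior, you can query the VAST DNS service directly. This is a sketch assuming a hypothetical VIP pool hostname of pool1.vast.example.com and a hypothetical VAST DNS service IP of 192.0.2.0:

```shell
# With VAST_DNS_MULTIPLE_ANSWERS enabled, the answer section should list
# up to VAST_DNS_MAX_ANSWERS (here, 8) A records from the VIP pool.
# 192.0.2.0 and pool1.vast.example.com are hypothetical values; substitute
# your cluster's DNS service IP and VIP pool hostname.
dig @192.0.2.0 pool1.vast.example.com +noall +answer
```

Repeating the query should show the returned addresses rotating through the pool in round-robin order.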
To remove caching entirely, the TTL can be set to 0 so that every request receives a fresh IP address. However, if the upstream DNS server is running Windows DNS, a TTL of zero is considered invalid and is treated as TTL=1.
In general, you want to avoid having clients mount specific IP addresses and instead use the VAST DNS Server.
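For example, instead of mounting a specific VIP, a client can mount by the VIP pool's DNS hostname (the hostname, export path, and mount point below are hypothetical):

```shell
# Pinned to one VIP -- avoid this, as the VIP may move during upgrades:
#   mount -t nfs 203.0.113.11:/export /mnt/vast

# Resolved via the VAST DNS Server, which round-robins across the pool:
mount -t nfs pool1.vast.example.com:/export /mnt/vast
```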
ℹ️ Info
If you use autofs with VAST NFS, you should let the driver resolve DNS instead of autofs.
For additional DNS configuration, see the following vcli commands.
DNS Show
Usage
dns show --id ID
Required Parameters
--id ID: Specifies which DNS server configuration to display.
Example
vcli: admin> dns show --id 1
+------------------------------+-------------------+
| ID                           | 1                 |
| Name                         | dns1              |
| DNS Service IP               | 192.0.2.0         |
| DNS Service Gateway          |                   |
| DNS Service Suffix           | mydns             |
| DNS Service Subnet           | 24                |
| DNS Service VLAN             | 0                 |
| Net-type                     | EXTERNAL_PORT     |
| Enabled                      | True              |
+------------------------------+-------------------+
DNS List
Usage
dns list
Example
vcli: admin> dns list
+----+------+----------------+---------------------+--------------------+--------------------+------------------+---------------+---------+
| ID | Name | DNS Service IP | DNS Service Gateway | DNS Service Suffix | DNS Service Subnet | DNS Service VLAN | Net-type      | Enabled |
+----+------+----------------+---------------------+--------------------+--------------------+------------------+---------------+---------+
| 1  | dns1 | 192.0.2.0      |                     | mydns              | 24                 | 0                | EXTERNAL_PORT | True    |
+----+------+----------------+---------------------+--------------------+--------------------+------------------+---------------+---------+
DNS Delete
Usage
dns delete --id ID
Required Parameters
--id ID
Example
vcli: admin> dns delete --id 1
VAST NFS Client
Using the standard out-of-the-box NFS client, a host can only mount an NFS export from a single IP address. The VAST NFS client supports multipath, which allows a host to mount an NFS export and send traffic to multiple IP addresses. It can also make use of multiple NICs on the client for higher performance out of a single client.
The VAST NFS client is a Linux kernel module that is installed on each host that mounts the VAST Data cluster with multipath.
While the VAST DNS Server is not required when using the VAST NFS client, if the VAST DNS Server is configured for TTL=0 or to return multiple responses, the VAST NFS client can use the remoteports=dns mount option, which can simplify deployment on clients. See the VAST NFS Client mount parameter documentation for more information on this feature.
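As a sketch, a multipath mount using the VAST NFS client with DNS-resolved remote ports might look like the following (the hostname and paths are hypothetical; the mount options mirror those shown in the Kubernetes CSI example in this document):

```shell
# Requires the VAST NFS client kernel module on the host.
# remoteports=dns tells the driver to discover additional VIPs via DNS;
# nconnect opens multiple TCP connections, and spread_reads/spread_writes
# distribute I/O across all of the discovered remote ports.
mount -t nfs \
  -o vers=3,proto=tcp,nconnect=16,remoteports=dns,spread_reads,spread_writes \
  pool1.vast.example.com:/export /mnt/vast
```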
Protocols in detail
Here is guidance on using the tools above according to protocol.
S3
The VAST DNS Server supports both path-style and virtual-hosted-style S3 access. Path-style access uses the same hostname for all requests; when using the VAST DNS Server, this hostname is the DNS hostname assigned to the VIP pool being accessed.
Using the VAST DNS Server for S3 is strongly recommended, as it balances traffic across all CNodes in the cluster while supporting both path-style and virtual-hosted-style requests.
Virtual-hosted-style access prepends the bucket name to the hostname: if the VIP pool hostname is configured as s3.vast1.example.com, then accessing a bucket named "bucket1" results in the client using the hostname bucket1.s3.vast1.example.com.
The VAST DNS Server supports such requests by responding to all hostnames below its configured hostname - in the example above, it would respond to any hostname matching *.s3.vast1.example.com.
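For illustration, both access styles can be exercised with the AWS CLI (the endpoint and bucket name here are hypothetical):

```shell
# Path-style: every request goes to the pool hostname s3.vast1.example.com.
aws s3 ls s3://bucket1 --endpoint-url https://s3.vast1.example.com

# Virtual-hosted-style: the client contacts bucket1.s3.vast1.example.com,
# which the VAST DNS Server answers because it matches *.s3.vast1.example.com.
aws configure set default.s3.addressing_style virtual
aws s3 ls s3://bucket1 --endpoint-url https://s3.vast1.example.com
```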
For best connection balancing, configure the VAST DNS Server with TTL=0 if there is no upstream Windows DNS server. If there is an upstream Windows DNS server, configure the VAST DNS Server to return multiple responses instead.
CNAMEs
It is possible to use a DNS CNAME to alias another hostname to the DNS hostname configured on VAST. For example, the hostname above could be shortened to s3.example.com by creating a CNAME for that name pointing to the full s3.vast1.example.com hostname. However, because a CNAME maps only a single hostname, this shorter name will NOT work for virtual-hosted-style requests.
SSL Certificates
SSL verification requires that the DNS hostname being used match a hostname in the SSL certificate. For path-style access, this means the hostname in the certificate needs to match the DNS hostname for the pool. For virtual-hosted-style access, the certificate needs to match both the pool hostname (e.g., s3.vast1.example.com) AND the hostname with the bucket name prepended, which means a wildcard entry is required (e.g., *.s3.vast1.example.com).
If using a CNAME, then the SSL certificate should match the CNAME hostname.
SMB
For Windows and macOS clients communicating over SMB, configuring the VAST DNS Server to return a single IP address, even with TTL=1, may be sufficient; this depends on the workload and on when and how many clients attempt to connect at the same time.
For instance, in a campus environment, connections from laptops and desktop systems generally occur at random times throughout the day, so requests are spread out and the IP address returned from each DNS query differs between clients.
In the case of a render or compute farm, configuring the VAST DNS Server to return multiple responses is the best way to spread traffic, because those clients perform their DNS queries at the same time; providing multiple IP addresses in each reply achieves better distribution across CNodes. In such a case, it is up to the client or any upstream DNS servers to select from, or further randomize, the returned addresses.
When using SMB Multichannel, Windows opens multiple TCP channels, but they always go to a single IP and will not span CNodes.
NFS
The VAST NFS driver with multipath and nconnect is the most predictable way to spread the load across many CNodes and get the most performance over a single mount point. Because the CNode VIPs are included in the mount command, the mount can be to a hostname provided by the VAST DNS Server or a specific VIP – the driver will still send traffic to all the VIPs it was passed.
ℹ️ Info
Using the VAST NFS client with multipath is the recommended solution for AI and other high-performance workloads.
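For instance, under the assumption that the VIPs 203.0.113.1 through 203.0.113.8 belong to the pool (the addresses and paths here are hypothetical), the driver can be handed the VIP list explicitly via a remoteports range:

```shell
# The mount target is a single VIP (or a DNS name), but the VAST NFS
# client will spread traffic across every VIP in the remoteports range.
mount -t nfs \
  -o vers=3,proto=tcp,nconnect=16,remoteports=203.0.113.1-203.0.113.8,spread_reads,spread_writes \
  203.0.113.1:/export /mnt/vast
```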
If the VAST NFS client is not an option, the next best method is to adjust the VAST DNS Server's TTL to 0, such that every client mount resolves to a new IP address. As mentioned previously, this method is not recommended in heterogeneous environments that contain Windows clients.
If the above are not an option, update the VAST DNS to return multiple responses to encourage hosts to connect to different IP addresses. When DNS responses include multiple IPs, many Linux resolvers will select a random one from the list.
VMware
For best practices on how to configure VMware systems with datastores on a VAST cluster, see Best Practices Guide for VMware on VAST Data.
CSI Driver for Kubernetes
To ensure proper CNode utilization in Kubernetes with the VAST CSI driver:
Configure the VAST DNS Server to return multiple addresses
Use the VAST NFS client on the Kubernetes worker nodes
For each storageClass, use vipPoolFQDN to specify the VIP pool and add a mountOption to set remoteports=dns, for example:
storageClasses:
  vastdata-multipath:
    secretName: "vms-creds"
    secretNamespace: "vast-csi"
    vipPool: ""
    vipPoolFQDN: "pool.vastcluster.domain.com"
    vipPoolFQDNRandomPrefix: true
    storagePath: "/k8s"
    viewPolicy: "k8s-custom"
    mountOptions:
      - remoteports=dns
      - nconnect=16
      - vers=3
      - proto=tcp
      - spread_reads
      - spread_writes
VAST Database
To spread VAST Database traffic across CNodes for Trino, Spark, and the vastdb_sdk, set the data_endpoints variable to include all CNode IPs you want to participate.
For Trino, this is set by the data_endpoints setting in the catalog properties file. For Spark, this is set by spark.ndb.data_endpoints in the spark-defaults.conf file. For the vastdb_sdk, this is set by the data_endpoints property using a custom QueryConfig.
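As a sketch (the endpoint addresses are hypothetical, and the exact URL scheme and port depend on your deployment), the Trino and Spark settings might look like:

```properties
# Trino: VAST catalog properties file
data_endpoints=http://203.0.113.1,http://203.0.113.2,http://203.0.113.3

# Spark: spark-defaults.conf (key and value separated by whitespace)
spark.ndb.data_endpoints http://203.0.113.1,http://203.0.113.2,http://203.0.113.3
```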
For more information and best practices on tuning access to the VAST Database, see https://vast-data.github.io/data-platform-field-docs/vast_database/tuning/tuning.html