QoS Overview

Quality of Service (QoS) policies let you set limits to define minimum and maximum allowed bandwidth and/or IOPS for read and/or write operations. A QoS policy can be assigned to a view or to a user.

QoS Policy Types

A QoS policy provisions VAST Cluster performance either for a view or for one or more users.

In a view QoS policy, you can set static and/or capacity-based QoS limits for a specific view.
After creating a view QoS policy, you can assign it to a view when creating or editing a view (see Creating Views and Modifying Views).
View QoS policies are supported for NFSv3, NFSv4, SMB and S3.
In a user QoS policy, you can set static QoS limits, as well as the S3 connection limit for one or more users.
After creating a user QoS policy, you can assign it to the user(s) when editing a user (see Managing Local Users). A user can be assigned only one QoS policy per tenant.
User QoS policies are supported for NFSv3, NFSv4, SMB and S3.
Tip
Before creating user QoS policies, you need to enable the user QoS functionality on the VAST cluster by creating a default user QoS policy.

If an IO operation is subject to more than one QoS policy, it is handled as follows:

The IO operation consumes resources from all applicable policies.
The IO operation is allowed to proceed only if it stays within the limits of every applicable policy.

For example, if a user has a limit of 1GB/s and is reading from a view that has a limit of 500MB/s, the user can only read at 500MB/s from that view. In addition, the user can read at 500MB/s from other views.
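The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not VAST code: the function name `effective_limit` is hypothetical, and 1GB/s is taken to be 1000MB/s for simplicity.

```python
def effective_limit(*limits_mb_s):
    """Return the tightest of the applicable limits (MB/s).

    A value of None stands for "no limit set in that policy" (an assumption
    made for this sketch). Returns None if no policy sets a limit.
    """
    set_limits = [limit for limit in limits_mb_s if limit is not None]
    return min(set_limits) if set_limits else None


user_limit = 1000  # user QoS policy: 1GB/s, taken here as 1000MB/s
view_limit = 500   # view QoS policy: 500MB/s

# The read must fit within both policies, so the tighter one wins.
rate_on_view = effective_limit(user_limit, view_limit)

# The read also consumes the user's budget, leaving the rest for other views.
remaining_for_other_views = user_limit - rate_on_view

print(rate_on_view, remaining_for_other_views)
```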

Static and Capacity-Based Limits

You can set static and/or capacity-based QoS limits.

Static limits are user-defined values of bandwidth and/or IOPS for read and/or write operations. These limits are fixed; they do not depend on the used or provisioned capacity. Static limits can be set both in view QoS policies and in user QoS policies.
You can set a minimum and/or a maximum static limit.
- Minimum limits define the minimum guaranteed performance per view or per user when there is resource contention.
- Maximum limits define the maximum allowed performance per view or per user when there is no resource contention.
You can also set burst and credit static limits.
Static limits do not require a quota to be provisioned on the view path or for the user.
Capacity-based limits can be based on either of the following:
- Used logical capacity
  These limits cap the read/write bandwidth and/or IOPS per unit of used logical capacity. (Logical capacity is the amount of data written, before data reduction.) For example, you can configure a QoS policy to allow a maximum of 3 read IOPS per one GB of used logical capacity.
- Provisioned logical capacity
  These limits cap the read/write bandwidth and/or IOPS per unit of logical capacity provisioned by the soft limit of a quota on the view path. For example, you can configure the QoS policy to allow a maximum of 1MB/s of read bandwidth per GB of the capacity limit configured in a quota.
Capacity-based limits can only be set for view QoS policies.
Capacity-based limits require a quota to be provisioned on the view path. If you attach a QoS policy that sets capacity-based limits to a view that has no quota, VMS automatically creates a quota on the view path.
The quota on the view path itself is the only quota that affects the limits.
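As a rough illustration of how a capacity-based limit scales, the sketch below multiplies a per-GB rate by a capacity figure. The function name and the capacity values (2000 GB used, 500 GB quota soft limit) are hypothetical and chosen only for the example; the per-GB rates are the ones quoted above.

```python
def capacity_based_limit(rate_per_gb, capacity_gb):
    """Scale a per-GB limit by the relevant logical capacity in GB."""
    return rate_per_gb * capacity_gb


# Used logical capacity: 3 read IOPS per GB, with a hypothetical 2000 GB used.
read_iops_cap = capacity_based_limit(3, 2000)

# Provisioned logical capacity: 1MB/s of read bandwidth per GB,
# with a hypothetical quota soft limit of 500 GB on the view path.
read_bw_cap_mb_s = capacity_based_limit(1, 500)

print(read_iops_cap, read_bw_cap_mb_s)
```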

QoS limits take both data workload and metadata requests into account, as follows:

The size of a single IO for the purpose of a QoS limit is 1MB. In other words, the number of IOs per request is obtained by dividing the request size by 1MB.
Each mutating metadata request (such as create, delete, rename, setattr) is counted as one write IO. Each non-mutating metadata request (such as getattr, lookup, list) is counted as one read IO. The size of a metadata request is taken to be 4KB.
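The accounting rules above might be modeled as follows. This is a sketch under stated assumptions: the text does not say how a request that is not a whole multiple of 1MB is rounded, so the code rounds up and counts at least one IO; it also assumes 1MB means 1024 * 1024 bytes. The function names are hypothetical, and the operation sets contain only the examples named in the text.

```python
MB = 1024 * 1024   # assumption: binary megabyte
QOS_IO_SIZE = MB   # size of a single IO for QoS purposes

MUTATING_OPS = {"create", "delete", "rename", "setattr"}
NON_MUTATING_OPS = {"getattr", "lookup", "list"}


def data_request_ios(request_bytes):
    """Number of IOs charged for a data request.

    Assumption: partial IOs are rounded up, with a minimum of one IO.
    """
    return max(1, (request_bytes + QOS_IO_SIZE - 1) // QOS_IO_SIZE)


def metadata_request_io(op):
    """Return (direction, io_count) for a metadata request."""
    if op in MUTATING_OPS:
        return ("write", 1)
    if op in NON_MUTATING_OPS:
        return ("read", 1)
    raise ValueError(f"operation not covered by this sketch: {op}")


print(data_request_ios(4 * MB))       # a 4MB request
print(metadata_request_io("rename"))  # mutating metadata request
print(metadata_request_io("lookup"))  # non-mutating metadata request
```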

Burst and Credit Limits

In addition to minimum and maximum QoS limits, you can set a burst limit and a credit limit.

When a burst limit is set, burst credits are accumulated as long as the workload consumes fewer resources than allowed by the maximum limit. The credits can later be spent to gain performance that exceeds the maximum limit, up to the configured burst limit.

The amount of credits that can be accumulated is capped by a credit limit set in a QoS policy.

For example, if a QoS policy defines a maximum limit of 100MB/s, a burst limit of 1000MB/s and a credit limit of 10000, then after 100 seconds of idle time, the credit limit will be reached. Given that credit balance, the application can run at 1000MB/s for about 11 seconds, after which it will be throttled down to 100MB/s.
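One reading that reproduces the numbers in this example is sketched below. It assumes that credits accrue at the rate of the unused portion of the maximum limit, and that during a burst only the portion above the maximum limit draws down credits. Both are assumptions made for illustration and are not confirmed by the text; the function names are hypothetical. Note that the burst_duration formula given later in this section divides by the burst limit alone and yields 10 seconds for the same inputs.

```python
def seconds_to_fill(credit_limit, max_limit, current_rate=0.0):
    """Time to accumulate a full credit balance from empty.

    Assumption: credits accrue at (max_limit - current_rate) per second.
    """
    return credit_limit / (max_limit - current_rate)


def burst_seconds(credit_balance, burst_limit, max_limit):
    """How long a workload can run at the burst limit.

    Assumption: only the throughput above max_limit consumes credits.
    """
    return credit_balance / (burst_limit - max_limit)


fill_time = seconds_to_fill(credit_limit=10000, max_limit=100)
burst_time = burst_seconds(credit_balance=10000, burst_limit=1000, max_limit=100)

print(fill_time)             # idle seconds needed to reach the credit limit
print(round(burst_time, 1))  # seconds at 1000MB/s before throttling to 100MB/s
```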

Starting with VAST Cluster 5.2.2, if no credits are explicitly defined, setting a maximum static limit will cause the corresponding credits to accept a default value. The credit default value will be the maximum limit multiplied by 4. For example, if you set the maximum allowed read bandwidth to 500 and do not specify any value for the read bandwidth credit, the read bandwidth credit will automatically be set to 2000.
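The defaulting rule can be expressed as a small helper. The name `effective_credit` and the use of None for "not specified" are assumptions made for this sketch.

```python
DEFAULT_CREDIT_MULTIPLIER = 4  # from the rule above: maximum limit multiplied by 4


def effective_credit(max_limit, credit=None):
    """Return the credit value that applies for a given maximum static limit.

    An explicitly configured credit wins; otherwise the default is derived
    from the maximum limit. None stands for "not specified" in this sketch.
    """
    if credit is not None:
        return credit
    return max_limit * DEFAULT_CREDIT_MULTIPLIER


print(effective_credit(500))       # no credit specified
print(effective_credit(500, 800))  # explicit credit is kept as-is
```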

Burst duration can be calculated as follows:

 burst_duration = credit_limit / burst_limit

In case of a zero burst, the duration can be as low as 100 ms.

The time for credits to be replenished can be calculated as follows:

time_to_replenish = (credit_limit - used_credit) / (maximum_static_limit - current_BW_or_IOPS)

Where:

used_credit is the amount of credit used up to the current time.
Tip
This value can be taken from the burst metrics per view: burst_write_bw_used, burst_read_bw_used, burst_write_iops_used, burst_read_iops_used.
current_BW_or_IOPS is the current throughput or IOPS.
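The two formulas can be transcribed directly into Python, as in the sketch below. The input values in the usage lines are hypothetical. The handling of edge cases (a non-positive burst limit, or a workload already running at or above the maximum limit) is an assumption of this sketch and is not described in the text.

```python
def burst_duration(credit_limit, burst_limit):
    """burst_duration = credit_limit / burst_limit"""
    if burst_limit <= 0:
        # The zero-burst case is described separately in the text;
        # this sketch does not model it.
        raise ValueError("burst_limit must be positive")
    return credit_limit / burst_limit


def time_to_replenish(credit_limit, used_credit, maximum_static_limit,
                      current_bw_or_iops):
    """time_to_replenish =
    (credit_limit - used_credit) / (maximum_static_limit - current_BW_or_IOPS)
    """
    headroom = maximum_static_limit - current_bw_or_iops
    if headroom <= 0:
        # Assumption: with no headroom below the maximum, credits never refill.
        return float("inf")
    return (credit_limit - used_credit) / headroom


# Hypothetical values, in MB and MB/s.
print(burst_duration(credit_limit=10000, burst_limit=1000))
print(time_to_replenish(credit_limit=10000, used_credit=4000,
                        maximum_static_limit=100, current_bw_or_iops=40))
```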

S3 Connection Limit

The S3 connection limit restricts the maximum number of S3 connections to the VAST cluster that can be opened by a client user. This helps prevent scenarios where a single client IP opens an enormous number of S3 connections, exhausting the cluster's TCP connection resources.

When set in a QoS policy attached to a specific user or group, the S3 connection limit applies to that particular user or group.
When set in the default user QoS policy, the limit applies to any S3 user connecting to the VAST cluster.

Note
The S3 connection limit does not affect root users.

By default, this feature is enabled but no S3 connection limit is set.

A connection is attributed to a user only after the cluster has received the first request from that user. If multiple users share the same connection, the connection is attributed to the user that made the first request. Until the first request arrives, the user is considered to be unknown.

If the S3 connection limit is set for a user that has already opened some S3 connections, the existing connections are not affected by the new limit and are not taken into account when imposing the S3 connection limit.
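The attribution rule described above can be modeled with a small tracker. This is a toy model for illustration only: the class and method names are hypothetical, and it does not show how the cluster enforces the limit, which the text does not specify.

```python
from collections import Counter


class ConnectionTracker:
    """Toy model of attributing S3 connections to users."""

    def __init__(self):
        self._owner = {}  # connection id -> user, or None while unknown

    def open(self, conn_id):
        # A new connection has no user until its first request arrives.
        self._owner[conn_id] = None

    def request(self, conn_id, user):
        # The first request decides the owner; later requests from other
        # users on the same connection do not change the attribution.
        if self._owner[conn_id] is None:
            self._owner[conn_id] = user
        return self._owner[conn_id]

    def connections_per_user(self):
        # Connections with an unknown user are not counted for anyone.
        return Counter(u for u in self._owner.values() if u is not None)


tracker = ConnectionTracker()
tracker.open("conn-1")
tracker.open("conn-2")
tracker.request("conn-1", "alice")
owner = tracker.request("conn-1", "bob")  # shared connection stays with alice
print(owner, dict(tracker.connections_per_user()))
```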

Tip
For all users that have an associated QoS policy with this limit set to greater than zero, you can view the number of active S3 connections per user in user query results.

Total Limits

The QoS limits you set in a QoS policy can be either total or specific to the type of operations (read or write). Thus, you can restrict the bandwidth for write operations while leaving the read bandwidth unlimited, or you can set a total limit that will cap the total amount of read and write IOPS.

Total limits take metadata operations into account. For example, if you set a total IOPS limit to 1000, then at any given point in time, the sum of read, write and metadata IOPS cannot exceed 1000.

Combining QoS Limits

When static limits are used together with capacity-based limits, the static limits cap the performance that the capacity-based limits would otherwise allow. For example, with a maximum static limit of 1GB/s and a capacity-based limit of 10MB/s per GB, performance stops increasing once the capacity reaches 100GB.

However, if a burst limit is set, the performance can go beyond the specified capacity-based limit. For example, if you cap the bandwidth with a maximum limit of 300MB/s and a burst limit of 800MB/s while having a capacity-based limit of 500MB/s, the performance will be capped at 800MB/s.
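Both examples in this section can be reproduced with the sketch below. The function names are hypothetical, 1GB/s is taken as 1000MB/s, and the burst case assumes that burst credits are available at the time.

```python
def sustained_cap(static_max, rate_per_gb, capacity_gb):
    """Cap without bursting: the lower of the static and capacity-based limits."""
    return min(static_max, rate_per_gb * capacity_gb)


def peak_cap(static_max, capacity_limit, burst_limit=None):
    """Cap while burst credits are available (an assumption of this sketch).

    With a burst limit set, performance can reach the burst limit even if
    that exceeds the capacity-based limit.
    """
    if burst_limit is not None:
        return burst_limit
    return min(static_max, capacity_limit)


# 1GB/s static maximum with 10MB/s per GB: growth stops at 100GB.
for capacity_gb in (50, 100, 200):
    print(capacity_gb, sustained_cap(1000, 10, capacity_gb))

# 300MB/s maximum, 800MB/s burst, 500MB/s capacity-based limit.
print(peak_cap(300, 500, burst_limit=800))
```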