The VAST Data platform offers a comprehensive set of performance and telemetry metrics that provide deep visibility into system behavior, workload performance, and Quality of Service (QoS) enforcement. These metrics are essential for monitoring infrastructure health, troubleshooting performance anomalies, validating SLAs, and enabling observability in multi-tenant environments. For example, they help detect bandwidth bottlenecks, track latency spikes, and identify “noisy neighbor” workloads. VAST metrics also support usage-based analytics and capacity forecasting, which are critical for optimizing resource allocation.
Cloud Service Providers (CSPs) can use VAST metrics as the foundation for delivering transparent, metered services to tenants. CSPs can expose selected metrics, such as per-tenant IOPS, bandwidth, or latency, via customer-facing dashboards or API integrations. This supports tenant-level performance reporting and SLA validation while maintaining strict isolation and control. Metrics are collected across all system layers (CNodes, DNodes, switches) and are available in real time or over historical ranges via:
Prometheus Exporter: Exposes metrics in Prometheus/OpenMetrics format
REST API: Full access to raw and derived metrics

API to VMS diagram
Note: Metrics are stored in the VAST Management System (VMS) database and aggregated at 5-minute intervals (or at 1-minute granularity in VAST 5.3+), supporting both internal monitoring and external service reporting.
Metrics Visualization
VAST provides two main tools for visualizing system metrics: the Web UI Dashboard and Grafana Dashboards. These tools enable administrators and cloud providers to monitor performance, detect anomalies, and manage tenant-level observability.
Web UI Dashboard
The VAST Web UI provides a real-time dashboard that displays key cluster metrics, including capacity usage, IOPS, bandwidth, and top-consuming users and views. It provides a high-level overview and enables dynamic sorting to quickly identify performance hotspots or imbalances.
Tenant Managers also have access to this dashboard, but visibility is limited to their own data. It shows per-tenant capacity, IOPS, bandwidth, and usage trends, supporting self-service monitoring in multi-tenant environments.

VMS Dashboard
Grafana Dashboards
VAST provides a comprehensive suite of pre-built Grafana dashboards designed for deep observability and performance analysis. Key highlights include:
Version Compatibility: Works with VAST versions 5.1-sp40 and later using the built-in Prometheus exporter.
Easy Import: Dashboards are provided as .json files that can be directly imported into your Grafana instance.
Organized Views: Dashboards are organized by tenant, view, and node for targeted troubleshooting.
Use Cases: Ideal for real-time monitoring, historical analysis, QoS enforcement validation, and capacity planning.
These dashboards are production-ready and recommended as-is or as a reference for building custom visualizations. They help ensure consistent metric usage across VAST versions, reduce the chance of misinterpreting metric semantics, and simplify integration with external systems.
To use them, import the .json file, configure your Prometheus data source, and start visualizing metrics. Customized dashboards tailored to specific CSP use cases are also available upon request.
For more details, visit the VAST Grafana Dashboards repository.

VAST Grafana Dashboard
Recommended Expressions on VAST
Purpose | PromQL Expression |
|---|---|
Read IOPS | rate(vast_view_metrics_ViewMetrics_read_iops_count[5m]) |
Read Bandwidth | rate(vast_view_metrics_ViewMetrics_read_bw_sum[5m]) |
Read Latency | rate(vast_view_metrics_ViewMetrics_read_latency_sum[5m]) / rate(vast_view_metrics_ViewMetrics_read_latency_count[5m]) |
Write IOPS | rate(vast_view_metrics_ViewMetrics_write_iops_count[5m]) |
Write Bandwidth | rate(vast_view_metrics_ViewMetrics_write_bw_sum[5m]) |
Write Latency | rate(vast_view_metrics_ViewMetrics_write_latency_sum[5m]) / rate(vast_view_metrics_ViewMetrics_write_latency_count[5m]) |
QoS Throttling | rate(vast_view_metrics_ViewMetrics_qos_wait_for_budget_time_sum[5m]) / rate(vast_view_metrics_ViewMetrics_qos_wait_for_budget_time_count[5m]) |
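The latency and QoS expressions above divide the rate of a `_sum` counter by the rate of the matching `_count` counter. As a rough illustration of that arithmetic outside Prometheus, the Python sketch below computes the same ratio from two raw counter samples. The `CounterSample` container, the function names, and the sample values are invented for this example, and counter resets are ignored for brevity (PromQL's `rate()` accounts for them).

```python
from typing import NamedTuple, Optional


class CounterSample(NamedTuple):
    """One scrape of a cumulative _sum/_count pair (hypothetical container)."""
    timestamp: float  # seconds since epoch
    total: float      # value of the *_sum counter
    count: float      # value of the *_count counter


def mean_per_event(first: CounterSample, last: CounterSample) -> Optional[float]:
    """Return delta(sum) / delta(count) between two scrapes.

    This matches rate(sum) / rate(count) over the same window because the
    elapsed time cancels out. Returns None when no events were recorded.
    """
    delta_count = last.count - first.count
    if delta_count <= 0:
        return None
    return (last.total - first.total) / delta_count


def events_per_second(first: CounterSample, last: CounterSample) -> float:
    """Return delta(count) / elapsed seconds, the analogue of rate(count)."""
    elapsed = last.timestamp - first.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (last.count - first.count) / elapsed


# Illustrative values only: two scrapes taken 300 seconds (5 minutes) apart.
start = CounterSample(timestamp=0.0, total=120.0, count=1000.0)
end = CounterSample(timestamp=300.0, total=180.0, count=4000.0)

avg_latency = mean_per_event(start, end)   # 60 / 3000
ops_rate = events_per_second(start, end)   # 3000 / 300
```

Because the ratio is built from raw counters, it stays correct for any window length, which is the main reason to prefer these expressions when PromQL is available.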
Derived Metrics (from Version 5.3 and higher)
If PromQL is too complex or unsupported, VAST offers derived metrics. These metrics are based on periodic averages and are less accurate over longer time windows due to the averaging characteristics:
Purpose | Metric Name |
|---|---|
Read IOPS | vast_view_metrics_ViewMetrics_read_iops_time_avg |
Read Bandwidth | vast_view_metrics_ViewMetrics_read_bw_sum_time_avg |
Read Latency | vast_view_metrics_ViewMetrics_read_latency_avg |
Write IOPS | vast_view_metrics_ViewMetrics_write_iops_time_avg |
Write Bandwidth | vast_view_metrics_ViewMetrics_write_bw_time_avg |
Write Latency | vast_view_metrics_ViewMetrics_write_latency_avg |
QoS Throttling | vast_view_metrics_ViewMetrics_qos_wait_for_budget_time_avg |
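One plausible way to see why pre-averaged metrics can drift over longer windows: an unweighted mean of per-interval averages treats a quiet interval the same as a busy one. The Python sketch below uses invented latency figures (not VAST output) to compare that naive mean with an operation-weighted mean, which is what a `_sum`/`_count` ratio effectively yields.

```python
def mean_of_averages(intervals):
    """Unweighted mean of per-interval average latencies."""
    return sum(avg for avg, _ in intervals) / len(intervals)


def weighted_mean(intervals):
    """Mean latency weighted by the number of operations in each interval."""
    total_ops = sum(ops for _, ops in intervals)
    if total_ops == 0:
        raise ValueError("no operations recorded")
    return sum(avg * ops for avg, ops in intervals) / total_ops


# (average latency in ms, operation count) per interval; illustrative values.
intervals = [(1.0, 9000), (10.0, 1000)]

naive = mean_of_averages(intervals)  # (1.0 + 10.0) / 2
exact = weighted_mean(intervals)     # (1.0 * 9000 + 10.0 * 1000) / 10000
```

In this made-up case the naive figure is almost three times the weighted one, because the slow interval carried only a tenth of the operations.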
Command line:
vastpy-cli --json get monitors/ad_hoc_query object_type=view time_frame=5m object_ids=3 prop_list=ViewMetrics,read_bw__time_avg prop_list=ViewMetrics,read_iops__time_avg prop_list=ViewMetrics,read_latency__avg
Output format:
"prop_list": [
  "timestamp",
  "object_id",
  "ViewMetrics,read_bw__time_avg",
  "ViewMetrics,read_iops__time_avg",
  "ViewMetrics,read_latency__avg"
],
QoS Metrics Overview
Metrics / Concept | Description |
|---|---|
vast_view_metrics_ViewMetrics_qos_wait_for_budget_time_avg | Windowed mean time requests in this view spent waiting on QoS budget during the scrape window. Indicates presence/degree of QoS gating. Mostly >0 since it measures the time a code section takes, which is part of IO processing. |
vast_view_metrics_ViewMetrics_qos_wait_for_budget_time_sum | Cumulative seconds of QoS wait accrued by the view (monotonic; use rate()) |
vast_view_metrics_ViewMetrics_qos_wait_for_budget_time_count | Cumulative count of affected events included in the _sum |
vast_view_metrics_ViewMetrics_read_bw_avg vast_view_metrics_ViewMetrics_write_bw_avg | Windowed average delivered bandwidth for the view (bytes/s). It is useful to see if throughput is at or near the configured QoS cap. * |
vast_view_metrics_ViewMetrics_read_iops_time_avg vast_view_metrics_ViewMetrics_write_iops_time_avg | Windowed average IOPS for the view (ops/s). Helps separate small-IO vs. streaming patterns. * |
vast_user_read_bw vast_user_write_bw | Per-user windowed average bandwidth (bytes/s). Complements view-level utilization. |
vast_user_read_iops vast_user_write_iops | Per-user windowed average IOPS (ops/s). |
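The metrics in this table can be combined into a simple health check: compare delivered bandwidth with the configured QoS cap, and look at the average QoS wait alongside it. The Python sketch below is one hedged way to do that. The function name, the status labels, and both thresholds are assumptions made for this example rather than VAST recommendations, and the wait value is assumed to be in seconds. Since the wait metric is usually slightly above zero, the threshold should come from a baseline on your own cluster.

```python
def qos_status(delivered_bw: float, bw_cap: float, wait_avg: float,
               near_cap_ratio: float = 0.9,
               wait_threshold: float = 0.001) -> str:
    """Classify a view from windowed bandwidth and QoS wait averages.

    delivered_bw and bw_cap are in bytes/s; wait_avg is assumed to be seconds.
    near_cap_ratio and wait_threshold are illustrative defaults only.
    """
    if bw_cap <= 0:
        raise ValueError("bw_cap must be positive")
    near_cap = (delivered_bw / bw_cap) >= near_cap_ratio
    waiting = wait_avg > wait_threshold
    if near_cap and waiting:
        return "throttled"          # at the cap and requests are queuing
    if near_cap:
        return "near-cap"           # close to the cap, no notable wait yet
    if waiting:
        return "waiting-below-cap"  # wait observed without bandwidth pressure
    return "ok"


# Illustrative values: a 1 GB/s cap with 950 MB/s delivered and 5 ms mean wait.
status = qos_status(delivered_bw=950e6, bw_cap=1e9, wait_avg=0.005)
```

A "waiting-below-cap" result may point at a different limit, such as an IOPS cap, so it is worth checking the IOPS metrics for the same view.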
Notes:
Window length: For “*_avg” metrics, the averaging window equals your Prometheus scrape_interval (e.g., 15s), unless configured differently in prometheus.yml.
Scopes & Endpoints:
/api/prometheusmetrics/views → per-view (QoS, performance, etc.)
/api/prometheusmetrics/users → per-user (bandwidth, IOPS, etc.)
HELP/TYPE lines: Each series carries # HELP <metric> <description> and # TYPE <metric> <type>. Treat these as the authoritative contract for your cluster/build.
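Since the HELP/TYPE lines describe each series, it can be handy to read them programmatically. The Python sketch below is a minimal parser for the Prometheus text exposition format that collects HELP/TYPE metadata and samples. The sample text is invented for illustration (including the help string and the gauge type), and the parser skips timestamps and escape handling; in practice the text would come from an authenticated HTTP GET against one of the endpoints above.

```python
import re

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Parse Prometheus text exposition into (metadata, samples)."""
    meta = {}     # metric name -> {"help": ..., "type": ...}
    samples = []  # (metric name, labels dict, float value)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("# HELP ") or line.startswith("# TYPE "):
            _, kind, name, *rest = line.split(" ", 3)
            meta.setdefault(name, {})[kind.lower()] = rest[0] if rest else ""
            continue
        if line.startswith("#"):
            continue  # other comments carry no metadata
        match = SAMPLE_RE.match(line)
        if match:
            name, raw_labels, value = match.groups()
            labels = dict(LABEL_RE.findall(raw_labels or ""))
            samples.append((name, labels, float(value)))
    return meta, samples


# Invented sample text; real output depends on your cluster and build.
sample_text = """
# HELP vast_user_quota_used_capacity Example help text (placeholder)
# TYPE vast_user_quota_used_capacity gauge
vast_user_quota_used_capacity{cluster="c1",identifier="1001",path="/data"} 1048576
"""

meta, samples = parse_exposition(sample_text)
```

For production use, the `prometheus_client` package ships a maintained parser; the hand-rolled version here only keeps the example dependency-free.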
Tracking Capacity Usage per User (UID)
VAST supports tracking storage usage by individual users (UIDs) through user-aware quotas. This eliminates the need for customers to walk through the entire view structure to manually calculate per-user usage.
You can enable user capacity tracking on any directory-level quota — such as those automatically created by the CSI driver — without needing per-user definitions or hard limits. Once enabled, VAST exports per-UID usage metrics via Prometheus, which can be visualized in Grafana.
Quick Setup via Web UI
Step 1 – Create or Edit a Quota (No Limits Required):
Navigate to Settings → Element Store → Quotas.
Create or edit a directory-level quota.
(Optional) Leave soft/hard limits blank for tracking-only quotas.
Enable the toggle: “User/Group Quotas”.
Under the Default User Rule, set limits to 0 to avoid enforcement.
Click Update.
Step 2 – Monitor Prometheus Capacity Metrics Per-UID
Once user tracking is enabled, the following metric is exported via Prometheus:
vast_user_quota_used_capacity{cluster="<cluster>", identifier="<uid>", path="<view>"}
Reports the logical space used (in bytes) per user and per directory.
Can be grouped by identifier (UID) or path.
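Grouping by identifier or path amounts to summing the metric's samples per label value, which in PromQL would be along the lines of `sum by (identifier) (vast_user_quota_used_capacity)`. The Python sketch below shows the same aggregation on already-fetched samples. The helper names and the byte values are invented for illustration; only the `identifier` and `path` label names come from the metric above.

```python
from collections import defaultdict


def usage_by_label(samples, label):
    """Sum used-capacity bytes per value of `label` ("identifier" or "path")."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels[label]] += value
    return dict(totals)


def top_consumers(totals, n=3):
    """Return the n largest (label value, bytes) pairs, biggest first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# (labels, used bytes) pairs; illustrative values only.
samples = [
    ({"identifier": "1001", "path": "/projects/a"}, 500.0),
    ({"identifier": "1002", "path": "/projects/a"}, 200.0),
    ({"identifier": "1001", "path": "/projects/b"}, 300.0),
]

by_uid = usage_by_label(samples, "identifier")
by_path = usage_by_label(samples, "path")
largest_user = top_consumers(by_uid, n=1)
```

The same per-UID totals are what a "top users" style dashboard panel would rank.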
Grafana Dashboard Reference
Use or customize VAST’s official Grafana dashboards to visualize UID usage:
Repository: vast-data/vast-grafana-dashboards
Recommended Dashboard:
Top Actors – Users
Client-Side Observability (NFS only)
VAST's vNFS Collector is an open-source tool that provides deep visibility into NFS workloads by capturing detailed I/O metrics for every NFS mount. It tracks per-operation counters for all key NFSv3 and NFSv4 commands, including READ, WRITE, LOOKUP, and DELETE, along with contextual metadata such as mount points, process names, user IDs, and environment variables such as SLURM_JOB_ID. This rich dataset enables accurate workload profiling and performance tuning.
The collector supports flexible data forwarding, with local JSON logging and seamless integration into Prometheus (for Grafana dashboards), Kafka (for event-driven pipelines), and the VAST DataBase (for historical analytics via Trino, Spark, and Grafana).
For more information, visit: