Intro
This guide walks through starting the SkyPilot API server, verifying Kubernetes compute credentials, and configuring VAST Data as a storage backend.
Prerequisites
SkyPilot installed
A running Kubernetes cluster with a valid kubeconfig
VastData S3 endpoint URL and access credentials
Python 3 with pip
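A quick pre-flight check can catch a missing command-line tool before Step 1. The sketch below is illustrative and not part of SkyPilot: `missing_tools` is a hypothetical helper, and it assumes the executables are named `sky`, `aws`, and `pip` on your PATH (the `aws` CLI is used in Steps 5 and 6).

```python
import shutil


def missing_tools(tools=("sky", "aws", "pip")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```

This only confirms that the executables exist; it does not verify versions or that the kubeconfig is valid.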
Step 1: Start the SkyPilot API Server
Start the SkyPilot API server, binding it to all network interfaces so it is accessible remotely:
sky api start --host 0.0.0.0
Expected output:
✓ SkyPilot API server started.
├── SkyPilot API server and dashboard: http://0.0.0.0:46580
└── View API server logs at: ~/.sky/api_server/server.log
The API server and its web dashboard are now available on port 46580. If running from source, you can rebuild the dashboard with:
npm --prefix sky/dashboard install && npm --prefix sky/dashboard run build
Step 2: Verify Kubernetes Credentials
Confirm that SkyPilot can use your Kubernetes cluster for compute:
sky check kubernetes
Expected output:
Checking credentials to enable infra for SkyPilot.
Kubernetes: enabled [compute]
Allowed contexts:
└── kubernetes-admin@kubernetes: enabled.
Step 3: Check VastData Credentials (Initial State)
Run the VastData credential check to see what configuration is needed:
sky check vastdata
If VastData is not yet configured, the output will show it as disabled and provide setup instructions:
Checking credentials to enable infra for SkyPilot.
VastData: disabled
Reason [storage]: [vastdata] profile is not set in ~/.vastdata/vastdata.credentials. Additionally, [vastdata] profile is not set in ~/.vastdata/vastdata.config. Run the following commands:
$ pip install boto3
$ AWS_SHARED_CREDENTIALS_FILE=~/.vastdata/vastdata.credentials aws configure --profile vastdata
$ AWS_CONFIG_FILE=~/.vastdata/vastdata.config aws configure set endpoint_url <ENDPOINT_URL> --profile vastdata
Follow the steps below to configure it.
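The reason message above boils down to "no vastdata profile in either file". A rough way to reproduce that check yourself is sketched below. It is not SkyPilot's own logic: `has_profile` is a hypothetical helper, and it assumes the AWS CLI convention of writing `[vastdata]` in a credentials file and `[profile vastdata]` in a config file, so it accepts both spellings.

```python
import configparser
from pathlib import Path

VAST_DIR = Path.home() / ".vastdata"


def has_profile(path, profile="vastdata"):
    """Return True if the INI file at `path` has a section for `profile`."""
    # interpolation=None keeps '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is silently skipped
    return parser.has_section(profile) or parser.has_section(f"profile {profile}")


if __name__ == "__main__":
    for name in ("vastdata.credentials", "vastdata.config"):
        path = VAST_DIR / name
        state = "found" if has_profile(path) else "missing"
        print(f"{path}: vastdata profile {state}")
```

Before Steps 5 and 6 both files should report the profile as missing; afterwards both should report it as found.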
Step 4: Install boto3
VastData storage integration requires boto3 (the AWS SDK for Python), which is used to communicate with the S3-compatible endpoint:
pip install boto3
Step 5: Configure VastData Access Credentials
Use the AWS CLI to write your VastData S3 access key and secret key into a dedicated credentials file (~/.vastdata/vastdata.credentials):
AWS_SHARED_CREDENTIALS_FILE=~/.vastdata/vastdata.credentials \
aws configure --profile vastdata
You will be prompted for:
| Prompt | Value |
|---|---|
| AWS Access Key ID | Your VastData access key |
| AWS Secret Access Key | Your VastData secret key |
| Default region name | (leave blank; press Enter) |
| Default output format | (leave blank; press Enter) |
This creates the file ~/.vastdata/vastdata.credentials with a [vastdata] profile.
Step 6: Configure the VastData S3 Endpoint URL
Set the VastData S3-compatible endpoint URL in the config file (~/.vastdata/vastdata.config):
AWS_CONFIG_FILE=~/.vastdata/vastdata.config \
aws configure set endpoint_url <ENDPOINT_URL> --profile vastdata
Replace <ENDPOINT_URL> with your VastData S3 endpoint (e.g., http://172.27.115.1).
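A common slip at this step is pasting the address without its scheme, or leaving the placeholder in place. The sketch below is a small sanity check you could run on the value first. `is_valid_endpoint` is a hypothetical helper, and the rule it applies (an http or https URL with a host) is an assumption about what an S3 client will accept, not a documented SkyPilot requirement.

```python
from urllib.parse import urlparse


def is_valid_endpoint(url):
    """Return True for an http(s) URL that includes a host name or IP address."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.hostname)


if __name__ == "__main__":
    for candidate in ("http://172.27.115.1", "172.27.115.1", "<ENDPOINT_URL>"):
        print(candidate, "->", is_valid_endpoint(candidate))
```

The check is purely syntactic; it does not contact the endpoint.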
Step 7: Verify VastData Is Enabled
Re-run the credential check to confirm VastData storage is now configured:
sky check vastdata
Expected output:
Checking credentials to enable infra for SkyPilot.
VastData: enabled [storage]
🎉 Enabled infra 🎉
VastData [storage]
VastData is now registered as a storage backend in SkyPilot. You can now use it for file mounts and managed storage in your task YAML files.
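If you script this setup, you may want to assert on the check result rather than read it by eye. The following sketch parses text shaped like the sample output in this guide. `storage_enabled` is a hypothetical helper, and the `Name: enabled [capabilities]` line format is taken from the examples here; it may differ between SkyPilot versions or when the output contains terminal color codes.

```python
import re


def storage_enabled(check_output, name="VastData"):
    """Return True if the text reports `name` as enabled with storage capability."""
    pattern = rf"^\s*{re.escape(name)}: enabled \[([^\]]*)\]"
    match = re.search(pattern, check_output, flags=re.MULTILINE)
    return bool(match) and "storage" in match.group(1)


if __name__ == "__main__":
    sample = (
        "Checking credentials to enable infra for SkyPilot.\n"
        "VastData: enabled [storage]\n"
    )
    print(storage_enabled(sample))
```

To use it on live output, capture the text of `sky check vastdata` (for example with `subprocess.run(..., capture_output=True, text=True)`) and pass it in.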
Step 8: Create a Task YAML with a VastData Mount
Create a YAML file (e.g., test_vast_mount.yaml) that mounts a VastData bucket and runs a command against it:
file_mounts:
  /data:
    source: vastdata://skypilot
    mode: MOUNT
resources:
  cloud: Kubernetes
  cpus: 2
run: |
  ls /data
`source: vastdata://skypilot`: Mounts the VastData bucket named skypilot using the FUSE-based mounter.
`mode: MOUNT`: The bucket is mounted as a live filesystem (as opposed to COPY, which downloads files at setup time).
`resources.cloud: Kubernetes`: The task will run on the Kubernetes cluster verified in Step 2.
Step 9: Launch the Task
Launch the task with sky launch:
sky launch test_vast_mount.yaml
SkyPilot will display the chosen resources and ask for confirmation:
Considered resources (1 node):
--------------------------------------------------------------------------------------------------
INFRA INSTANCE vCPUs Mem(GB) GPUS COST ($) CHOSEN
--------------------------------------------------------------------------------------------------
Kubernetes (kubernetes-...@kubernetes) - 2 2 - 0.00 ✔
--------------------------------------------------------------------------------------------------
Launching a new cluster 'sky-19aa-vastdata'. Proceed? [Y/n]:
Type y to proceed. SkyPilot will:
Provision a Kubernetes pod with the requested resources.
Mount the VastData bucket at /data using a FUSE filesystem.
Run the ls /data command inside the pod.
Expected output:
✓ Cluster launched: sky-19aa-vastdata.
⚙︎ Syncing files.
Mounting (to 1 node): vastdata://skypilot -> /data
✓ Storage mounted.
⚙︎ Job submitted, ID: 1
(task, pid=1862) aaa
(task, pid=1862) boto3-test.txt
(task, pid=1862) created_by_goofys
(task, pid=1862) created_by_rclone
(task, pid=1862) hosts
✓ Job finished (status: SUCCEEDED).
The ls /data command lists the contents of the VastData bucket, confirming the mount is working.
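To compare the listing against what you expect in the bucket, you can strip the job-log prefix from each line. This is a minimal sketch: `task_output` is a hypothetical helper, and the `(task, pid=N) ` prefix format is taken from the sample log above rather than from a documented contract.

```python
import re

# Matches lines such as "(task, pid=1862) hosts" and captures the payload.
TASK_LINE = re.compile(r"^\(task, pid=\d+\) (.*)$")


def task_output(log_text):
    """Return the task's output lines with the '(task, pid=N) ' prefix removed."""
    lines = []
    for line in log_text.splitlines():
        match = TASK_LINE.match(line.strip())
        if match:
            lines.append(match.group(1))
    return lines


if __name__ == "__main__":
    sample = "(task, pid=1862) aaa\n(task, pid=1862) hosts\n"
    print(task_output(sample))
```

Lines without the prefix, such as the status lines, are ignored.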
Step 10: SSH into the Cluster and Inspect the Mount
You can SSH into the running cluster to interactively explore the mounted storage:
ssh sky-19aa-vastdata
Once connected, verify the mount:
# List files in the mounted bucket
ls -la /data/
# Check the FUSE mount entry
mount | grep /data
Expected output:
skypilot on /data type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
This confirms the VastData bucket is mounted as a read-write FUSE filesystem. You can also check disk space reported by the mount:
df | grep /data
skypilot 1099511627776 0 1099511627776 0% /data
Test 2: Cached Mount Mode (MOUNT_CACHED)
The MOUNT_CACHED mode uses rclone with VFS caching. Files are cached locally on disk, giving faster repeated reads and full read-write support with write-back to the remote bucket. This is ideal for workloads that read the same files multiple times or need to write results back to the bucket.
Step 12: Create a Cached Mount Task YAML
Create a YAML file (e.g., test_vast_cache.yaml) that uses MOUNT_CACHED instead of MOUNT:
file_mounts:
  /data:
    source: vastdata://skypilot
    mode: MOUNT_CACHED
resources:
  cloud: Kubernetes
  cpus: 2
run: |
  ls /data
`mode: MOUNT_CACHED`: The bucket is mounted via rclone with a local VFS cache. Reads are cached on the node's local disk (up to 10 GB by default), and writes are buffered and flushed back to the remote bucket.
Step 13: Launch the Cached Mount Task
sky launch test_vast_cache.yaml
SkyPilot will display the chosen resources and ask for confirmation:
Considered resources (1 node):
--------------------------------------------------------------------------------------------------
INFRA INSTANCE vCPUs Mem(GB) GPUS COST ($) CHOSEN
--------------------------------------------------------------------------------------------------
Kubernetes (kubernetes-...@kubernetes) - 2 2 - 0.00 ✔
--------------------------------------------------------------------------------------------------
Launching a new cluster 'sky-7cb4-vastdata'. Proceed? [Y/n]:
Type y to proceed. Notice the log line says "Mounting cached mode" instead of just "Mounting":
⚙︎ Syncing files.
Mounting cached mode (to 1 node): vastdata://skypilot -> /data
✓ Storage mounted.
After the job runs, you will also see a cache upload confirmation:
(task, pid=1849) aaa
(task, pid=1849) boto3-test.txt
(task, pid=1849) created_by_goofys
(task, pid=1849) created_by_rclone
(task, pid=1849) hosts
(task, pid=1849) skypilot: cached mount upload complete (took 11s)
✓ Job finished (status: SUCCEEDED).
The line skypilot: cached mount upload complete confirms that any locally-cached writes were flushed back to the VastData bucket.
Step 14: SSH and Inspect the Cached Mount
SSH into the cluster:
ssh sky-7cb4-vastdata
Verify the files are visible:
ls /data/
aaa boto3-test.txt created_by_goofys created_by_rclone hosts
Inspect the rclone VFS process to see the caching parameters:
ps -aef | grep vfs
Expected output:
sky 1726 1 0 09:52 ? 00:00:00 /usr/bin/rclone mount sky-vastdata-skypilot:skypilot /data \
--daemon --daemon-wait 10 \
--log-file /home/sky/.sky/rclone_log/4caa791091d21d23e63637080226f370.log \
--log-level INFO --allow-other \
--vfs-cache-mode full \
--dir-cache-time 10s \
--vfs-cache-poll-interval 10s \
--cache-dir /home/sky/.cache/rclone/4caa791091d21d23e63637080226f370 \
--vfs-fast-fingerprint \
--vfs-cache-max-size 10G \
--vfs-write-back 1s
Key rclone VFS flags to note:
| Flag | Description |
|---|---|
| `--vfs-cache-mode full` | Full read-write caching; all reads and writes go through the local cache |
| `--vfs-cache-max-size 10G` | Maximum local cache size before eviction |
| `--vfs-write-back 1s` | Writes are flushed to the remote bucket after 1 second of inactivity |
| `--dir-cache-time 10s` | Directory listings are cached for 10 seconds |
| `--vfs-fast-fingerprint` | Uses size and modification time (not checksums) for faster cache validation |
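To check these values programmatically, for instance to confirm the cache size on a node, the command line from `ps` can be split into a flag-to-value mapping. The sketch below uses only the standard library. `parse_flags` is a hypothetical helper, and it assumes every value follows its flag as a separate token and that positional arguments come before the flags, as in the output above.

```python
import shlex


def parse_flags(command):
    """Map each --flag in a command line to its value (True for bare switches)."""
    # Join backslash-continued lines before tokenizing.
    tokens = shlex.split(command.replace("\\\n", " "))
    flags = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("--"):
            has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
            flags[token] = tokens[i + 1] if has_value else True
            i += 2 if has_value else 1
        else:
            i += 1
    return flags


if __name__ == "__main__":
    cmd = (
        "/usr/bin/rclone mount sky-vastdata-skypilot:skypilot /data "
        "--daemon --daemon-wait 10 --vfs-cache-mode full "
        "--vfs-cache-max-size 10G --vfs-write-back 1s"
    )
    for flag, value in parse_flags(cmd).items():
        print(flag, value)
```

A bare switch followed directly by a positional argument would be misread as taking a value, so treat this as a helper for this particular command shape rather than a general parser.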
Test 3: Copy Mode (COPY)
The COPY mode downloads the entire bucket contents to the local filesystem at launch time. The files are plain local files: no FUSE mount, no background process. This is the simplest and most compatible mode, ideal when your workload needs fast local I/O and the dataset fits on disk.
Step 16: Create a Copy Mode Task YAML
Create a YAML file (e.g., test_vast_copy.yaml) that uses COPY mode:
file_mounts:
  /data:
    source: vastdata://skypilot
    mode: COPY
resources:
  cloud: Kubernetes
  cpus: 2
run: |
  ls /data
`mode: COPY`: The bucket contents are downloaded to /data during the file sync phase. There is no live connection to the remote bucket after the copy completes.
Step 17: Launch the Copy Mode Task
sky launch test_vast_copy.yaml
Type y to confirm. Notice the log line says "Syncing" rather than "Mounting":
⚙︎ Syncing files.
Syncing (to 1 node): vastdata://skypilot -> /data
✓ Synced file_mounts.
Expected job output:
(task, pid=1852) aaa
(task, pid=1852) boto3-test.txt
(task, pid=1852) created_by_goofys
(task, pid=1852) created_by_rclone
(task, pid=1852) hosts
✓ Job finished (status: SUCCEEDED).
Step 18: SSH and Inspect the Copy
SSH into the cluster:
ssh sky-f053-vastdata
Verify the files are present:
ls /data/
aaa boto3-test.txt created_by_goofys created_by_rclone hosts
Confirm there is no FUSE mount; the files are regular local files on the pod's filesystem:
df | grep /data
This returns no output, confirming /data is not a separate mount point. The files were copied directly into the pod's local filesystem during setup.
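The same distinction can be checked from Python on the pod, which is handy inside a task's run script. This is a small sketch using `os.path.ismount`; `describe_path` is a hypothetical helper. Note that `ismount` compares device IDs with the parent directory, so it can miss a bind mount from the same filesystem.

```python
import os


def describe_path(path):
    """Report whether `path` is a mount point, a plain directory, or missing."""
    if not os.path.isdir(path):
        return "missing"
    return "mount point" if os.path.ismount(path) else "plain directory"


if __name__ == "__main__":
    print("/data:", describe_path("/data"))
```

On the COPY cluster this would be expected to report a plain directory, and on the MOUNT and MOUNT_CACHED clusters a mount point.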
Step 19: Clean Up the COPY Cluster
sky down sky-f053-vastdata
Test 4: Auto-Create a New Bucket
SkyPilot can automatically create a new VastData bucket and mount it. This is useful when your task needs a fresh, empty storage location — for example, to write output data or checkpoints.
Step 20: Create a Bucket-Creation Task YAML
Create a YAML file (e.g., test_vast_create_bucket.yaml) that instructs SkyPilot to create a new bucket and mount it:
file_mounts:
  /data:
    name: skypilotnew
    source: ~
    store: vastdata
resources:
  cloud: Kubernetes
  cpus: 2
run: |
  ls /data
`name: skypilotnew`: The name of the new bucket to create on VastData.
`source: ~`: In YAML, an unquoted `~` is the null value, not the home directory. No local files are uploaded, and SkyPilot creates the bucket empty.
`store: vastdata`: Tells SkyPilot to create the bucket on VastData (rather than AWS S3, GCS, etc.).
Step 21: Launch the Task
sky launch test_vast_create_bucket.yaml
Type y to confirm. SkyPilot will create the bucket before launching the cluster:
Created S3 bucket 'skypilotnew' in auto
⚙︎ Launching on Kubernetes.
└── Pod is up.
✓ Cluster launched: sky-405f-vastdata.
⚙︎ Syncing files.
Mounting (to 1 node): skypilotnew -> /data
✓ Storage mounted.
✓ Job finished (status: SUCCEEDED).
The bucket is created on the VastData S3 endpoint and then mounted into the pod at /data.
Step 22: Verify with sky storage ls
List all SkyPilot-managed storage to confirm the new bucket exists:
sky storage ls
Expected output:
NAME UPDATED STORE COMMAND STATUS
skypilotnew 52 secs ago VASTDATA sky launch test_vast_crea... READY
The bucket is tracked by SkyPilot and can be reused, mounted by other tasks, or deleted with sky storage delete skypilotnew.
Step 23: SSH and Inspect the New Bucket Mount
SSH into the cluster:
ssh sky-405f-vastdata
Verify the bucket is mounted as a FUSE filesystem:
df | grep /data
Expected output:
skypilotnew 1099511627776 0 1099511627776 0% /data
The newly created bucket is mounted and ready for use. Your task's run commands can write data to /data, and it will be stored in the VastData bucket skypilotnew.
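The size column in the df line is easier to read once converted. The sketch below works through the arithmetic under one assumption: that df is reporting 1 KiB blocks, which is the usual GNU default. `df_capacity_bytes` is a hypothetical helper. The same figure appears for both buckets in this guide with zero usage, which suggests it is a nominal value reported by the mounter rather than measured capacity, though the source does not confirm this.

```python
def df_capacity_bytes(df_line, block_size=1024):
    """Convert the total-blocks column of one df output line to bytes."""
    # Columns: filesystem, total blocks, used, available, use%, mount point.
    fields = df_line.split()
    total_blocks = int(fields[1])
    return total_blocks * block_size


line = "skypilotnew 1099511627776 0 1099511627776 0% /data"
total = df_capacity_bytes(line)
print(total == 2**50)        # True: 2**40 blocks of 2**10 bytes each
print(total / 2**50, "PiB")  # 1.0 PiB
```

If your df is configured for a different block size, pass it as `block_size` and the result scales accordingly.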
Step 24: Clean Up
Tear down the cluster:
sky down sky-405f-vastdata
Optionally, delete the bucket if it is no longer needed:
sky storage delete skypilotnew
Mount Mode Comparison
| Aspect | `MOUNT` | `MOUNT_CACHED` | `COPY` |
|---|---|---|---|
| Backend | FUSE (goofys-based) | rclone with VFS cache | rclone sync (one-time download) |
| Data transfer | On-demand per read | On-demand + local cache | Full download at launch |
| Read performance | Network-bound (every read) | Fast for repeated reads (cached) | Native local disk speed |
| Write support | Limited | Full read-write with write-back | Local only (not synced back) |
| Local disk usage | Minimal | Up to 10 GB cache (configurable) | Full dataset size |
| Live remote connection | Yes (FUSE process) | Yes (rclone process) | No |
| Best for | Streaming large files, read-once workloads | Iterative reads, training data, read-write workloads | Small datasets, max I/O performance, offline access |
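The three task files used in Tests 1 to 3 differ only in the `mode` value, so they can be generated from one template when you want to compare modes on your own bucket. The sketch below is one way to do that with plain string formatting. `render_task`, `TASK_TEMPLATE`, and `MODES` are hypothetical names, and the template simply mirrors the YAML shown in this guide.

```python
TASK_TEMPLATE = """\
file_mounts:
  /data:
    source: vastdata://{bucket}
    mode: {mode}
resources:
  cloud: Kubernetes
  cpus: 2
run: |
  ls /data
"""

MODES = ("MOUNT", "MOUNT_CACHED", "COPY")


def render_task(bucket, mode):
    """Return task YAML text that mounts `bucket` at /data using `mode`."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    return TASK_TEMPLATE.format(bucket=bucket, mode=mode)


if __name__ == "__main__":
    print(render_task("skypilot", "MOUNT_CACHED"))
```

Write the returned text to a file such as test_vast_cache.yaml and launch it with sky launch as in the steps above.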
Summary
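| Step | Action | Result |
|---|---|---|
| 1 | `sky api start --host 0.0.0.0` | API server + dashboard running on port 46580 |
| 2 | `sky check kubernetes` | Kubernetes enabled for compute |
| 3 | `sky check vastdata` | Shows what VastData config is missing |
| 4 | `pip install boto3` | S3 client library installed |
| 5 | `aws configure --profile vastdata` | Access credentials stored |
| 6 | `aws configure set endpoint_url <ENDPOINT_URL> --profile vastdata` | S3 endpoint configured |
| 7 | `sky check vastdata` | VastData enabled for storage |
| 8 | Create `test_vast_mount.yaml` | Task YAML with VastData FUSE mount |
| 9 | `sky launch test_vast_mount.yaml` | Cluster launched, bucket mounted, job succeeded |
| 10 | `ssh sky-19aa-vastdata` | Interactive access to inspect the FUSE mount |
| 11 | `sky down sky-19aa-vastdata` | MOUNT cluster torn down |
| 12 | Create `test_vast_cache.yaml` | Task YAML with VastData cached mount |
| 13 | `sky launch test_vast_cache.yaml` | Cluster launched, cached mount active, job succeeded |
| 14 | `ssh sky-7cb4-vastdata` | Interactive access to inspect rclone VFS cache |
| 15 | `sky down sky-7cb4-vastdata` | MOUNT_CACHED cluster torn down |
| 16 | Create `test_vast_copy.yaml` | Task YAML with VastData copy mode |
| 17 | `sky launch test_vast_copy.yaml` | Cluster launched, files synced, job succeeded |
| 18 | `ssh sky-f053-vastdata` | Interactive access; no FUSE mount, plain local files |
| 19 | `sky down sky-f053-vastdata` | COPY cluster torn down |
| 20 | Create `test_vast_create_bucket.yaml` | Task YAML that auto-creates a new VastData bucket |
| 21 | `sky launch test_vast_create_bucket.yaml` | Bucket created, cluster launched, mounted |
| 22 | `sky storage ls` | New bucket visible in SkyPilot storage list |
| 23 | `ssh sky-405f-vastdata` | Bucket mounted as FUSE filesystem at /data |
| 24 | `sky down sky-405f-vastdata`, then optionally `sky storage delete skypilotnew` | Cluster torn down, optionally delete bucket |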