Use any of the following methods to populate your VAST database with data:
Run a CTAS query from your query engine's client.
Insert data directly into a VAST database table.
Import data from Parquet files.
Running CTAS Queries
Using your query engine's client, connect to the data source where the data reside and run a CREATE TABLE AS SELECT (CTAS) query. A CTAS query creates a copy of the source table in the VAST database.
The syntax is similar to the following:
CREATE TABLE <VAST database table> AS SELECT * FROM <data source table>
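For example, a hypothetical Trino command that copies a table from a Hive catalog into a VAST database table might look like the following; the vast and hive catalog names and all schema and table names are illustrative assumptions:
trino> CREATE TABLE vast."db-bucket/myschema".mytable AS SELECT * FROM hive.sales.orders;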
Inserting Data into a VAST Database Table
To insert data directly into a VAST database table:
Using VAST Web UI or VAST CLI, create a VAST database table into which to insert the data.
In VAST Web UI, choose Database -> VAST DB, select a database and a schema in the database tree, and click the + Add Table button. Complete the fields in the dialog that opens and click Create.
Note
For a complete procedure, see Creating a Table via VAST Web UI.
In the VAST CLI, run the table create command.
Run an INSERT command from your query engine's client.
The syntax is similar to the following:
INSERT INTO <VAST database table> SELECT * FROM <data source table>
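For example, using Trino with the same illustrative catalog, schema, and table names as in the CTAS example above, the command might look like this:
trino> INSERT INTO vast."db-bucket/myschema".mytable SELECT * FROM hive.sales.orders;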
Importing Parquet Files with Trino
You can populate the VAST database with data from Parquet files contained in a VAST-stored S3 bucket, using the Trino client. The data is imported directly from the S3 bucket into the database table(s), keeping Trino out of the data path.
Tip
Before importing the data, ensure that the VAST database owner user has valid S3 access keys that provide access to the S3 bucket with the Parquet files.
Use the following command in the Trino client to import partitioned Parquet files into a VAST database:
trino> INSERT INTO vast."db-bucket/myschema"."mytable vast.import_data()" (country, city, "$parquet_file_path") VALUES ('New York', 'New York City', '/db-bucket/myparquet'), ('New York', 'Manhattan', '/db-bucket/myparquet2');
where db-bucket, myschema, and mytable are replaced with the database, schema, and table names.
This example imports Parquet files without partitions:
trino> INSERT INTO vast."db-bucket/myschema"."mytable vast.import_data()" ("$parquet_file_path") VALUES ('path/to/parquet/file')
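To confirm the import, you can query the target table from the same Trino client. This is an illustrative query using the db-bucket, myschema, and mytable names from the examples above:
trino> SELECT count(*) FROM vast."db-bucket/myschema".mytable;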
Importing Parquet Files with Spark
Use the following command to import data from partitioned Parquet files into a VAST database using Spark:
spark-sql> INSERT INTO ndb.`db-bucket`.myschema.`mytable vast.import_data(country, city)` (country, city, `$parquet_file_path`) VALUES ('New York', 'New York City', '/db-bucket/myparquet'), ('New York', 'Manhattan', '/db-bucket/myparquet2');
where db-bucket, myschema, and mytable are replaced with the database, schema, and table names.
This example imports Parquet files without partitions:
spark-sql> INSERT INTO ndb.`db-bucket`.myschema.`mytable vast.import_data()` (`$parquet_file_path`) VALUES ('path/to/parquet/file')
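As with Trino, you can verify the import by querying the table from the Spark SQL client; this illustrative query assumes the catalog, schema, and table names from the examples above:
spark-sql> SELECT count(*) FROM ndb.`db-bucket`.myschema.mytable;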