This topic describes how to configure Dremio to work with a VAST Database.
This involves the following:
installing Dremio
installing the VAST Dremio connector, which mediates between Dremio and the VAST Database.
Prerequisites
Java 11
Python 3.6
VAST Database
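The Java and Python prerequisites can be checked from a shell before installing anything. The following is a minimal sketch, not an official check: the helper names (version_ge, first_version, detect, report) are hypothetical, the listed versions are treated as minimums (an assumption, since the list above does not say whether newer versions are acceptable), and `sort -V` assumes GNU coreutils.

```shell
#!/bin/sh
# Succeeds if dotted version $1 is greater than or equal to $2.
# Uses `sort -V` (GNU coreutils); this is an assumption about the host.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Read a version banner on stdin and print the first version number in it.
first_version() {
    grep -oE '[0-9]+(\.[0-9]+)*' | head -n 1
}

# Run "TOOL ARGS..." only if TOOL is on PATH, and print its version number.
# `java -version` writes to stderr, hence the 2>&1.
detect() {
    command -v "$1" >/dev/null 2>&1 && "$@" 2>&1 | first_version
}

# Print a one-line verdict for a tool: report NAME FOUND_VERSION WANTED_VERSION
report() {
    name=$1 found=$2 wanted=$3
    if [ -z "$found" ]; then
        echo "$name: not found"
    elif version_ge "$found" "$wanted"; then
        echo "$name $found: ok (>= $wanted)"
    else
        echo "$name $found: older than $wanted"
    fi
}

report java "$(detect java -version)" 11
report python3 "$(detect python3 --version)" 3.6
```

The script only reports what it finds; it does not install or change anything.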
Installing Dremio
The VAST Dremio connector works with either the Dremio Community (open source) version or the commercial (enterprise) version. The commercial version includes add-ons and security features that are not available in the community version.
The instructions below explain how to install the Dremio Community version. See dremio.com for details regarding the commercial version.
The connector was tested with Dremio v24.0.0. The 24.3.2 community build used in the commands below can be downloaded from https://download.dremio.com/community-server/24.3.2-202401241821100032-d2d8a497/
Run this command to create a Dremio user:
useradd -c "Dremio User" -u 4853 dremio
Run these commands to install the Dremio community version:
dnf install ./dremio-community-24.3.2-202401241821100032_d2d8a497_1.noarch.rpm
systemctl enable dremio
Run these commands to set up Dremio work directories:
mkdir /opt/dremio/log /var/lib/dremio
chown dremio:dremio /opt/dremio/log /var/lib/dremio
Edit the file /etc/dremio/dremio-env and set the variable DREMIO_MAX_MEMORY_SIZE_MB to limit memory usage by Dremio. In this example, the limit is 200GB:
DREMIO_MAX_MEMORY_SIZE_MB=200000
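For scripted installs, the same edit can be made without opening an editor. The sketch below is one possible approach, not part of the Dremio tooling: set_env_var is a hypothetical helper that replaces an existing assignment (commented out or not) or appends one if the key is absent. It assumes the value contains no `|` or `&` characters, which holds for a plain number.

```shell
#!/bin/sh
# Set KEY=VALUE in a shell-style env file: set_env_var FILE KEY VALUE
# Rewrites any existing "KEY=..." or "#KEY=..." line, else appends a new line.
set_env_var() {
    file=$1 key=$2 value=$3
    if grep -qE "^#?[[:space:]]*${key}=" "$file"; then
        tmp=$(mktemp)
        # Write the result back with cat so the original file keeps its
        # owner and permissions.
        sed -E "s|^#?[[:space:]]*${key}=.*|${key}=${value}|" "$file" > "$tmp" \
            && cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example (run as root), using the 200GB limit from the text above:
#   set_env_var /etc/dremio/dremio-env DREMIO_MAX_MEMORY_SIZE_MB 200000
```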
Run this command to start Dremio:
systemctl start dremio
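Dremio can take some time to come up after the service starts, so it may help to wait until the web UI answers before continuing. The sketch below uses a hypothetical retry helper. The probe shown in the comments assumes the script runs on the coordinator host, that curl is installed, and that the UI listens on port 9047 as described later in this topic.

```shell
#!/bin/sh
# Run a command until it succeeds: retry ATTEMPTS DELAY_SECONDS CMD [ARGS...]
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
    attempts=$1 delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        # Sleep only if another attempt will follow.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: confirm the unit is active, then poll the UI for up to ~150 seconds.
#   systemctl is-active --quiet dremio && echo "dremio service is active"
#   retry 30 5 curl -fsS -o /dev/null http://localhost:9047 \
#       && echo "Dremio UI is reachable"
```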
Downloading and Installing the VAST Connector
Download the latest Dremio Connector for VAST Database from here: https://github.com/vast-data/vast-dremio-connector/releases. The connector is a jar file.
Copy the Dremio Connector jar file to ${DREMIO_HOME}/jars.
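The copy step can be wrapped in a small check so that a wrong path fails loudly. This is a sketch only: install_connector_jar is a hypothetical helper, and the /opt/dremio value for DREMIO_HOME in the example is an assumption based on the directories used earlier in this topic. The jar file name is left as a placeholder because it depends on the release downloaded.

```shell
#!/bin/sh
# Copy a connector jar into Dremio's jars directory.
# Usage: install_connector_jar JAR_PATH DREMIO_HOME
install_connector_jar() {
    jar=$1 dremio_home=$2
    case "$jar" in
        *.jar) ;;
        *) echo "not a .jar file: $jar" >&2; return 1 ;;
    esac
    [ -f "$jar" ] || { echo "file not found: $jar" >&2; return 1; }
    [ -d "$dremio_home/jars" ] || { echo "missing directory: $dremio_home/jars" >&2; return 1; }
    cp "$jar" "$dremio_home/jars/" \
        && echo "installed $(basename "$jar") into $dremio_home/jars"
}

# Example (assumed DREMIO_HOME; substitute the downloaded jar's file name):
#   install_connector_jar ./<connector>.jar /opt/dremio
# If Dremio is already running, it likely needs a restart to load the new jar
# (an assumption; this topic does not state it):
#   systemctl restart dremio
```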
Configuring the Connector
After downloading the VAST Dremio Connector, follow these steps to configure a Data Source on Dremio with the connector.
Navigate to the Dremio UI (typically at http://dremio_coordinator_host:9047), and then to Data Sets.
Click Add Source to add a Data Source.
In the Add Data Source dialog, enter these details:
Name: The name of the data source (any text)
Endpoint: An IP address in the VIP Pool on the VAST Cluster
Region: vast
Access Key: The Access Key for the service accessing VAST
Secret Key: The Secret Key for the service accessing VAST
It is recommended, but optional, to add these advanced settings:
Additional Endpoints: A list of IPs from the VAST VIP Pool on the Cluster. Additional endpoints improve performance.
Max Number of Splits: Typically the number of CPU threads available for each worker; the recommended value is 32
Max Number of Subsplits: 4
Click Save.
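Since Max Number of Splits is typically tied to the CPU threads available on each worker, the thread count can be read directly on a worker host. A minimal sketch, with cpu_threads as a hypothetical helper name. It only reports the count; whether to use it or the recommended value of 32 remains a judgment call.

```shell
#!/bin/sh
# Print the number of CPU threads visible on this host.
# Prefers nproc (GNU coreutils) and falls back to getconf where it is absent.
cpu_threads() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        getconf _NPROCESSORS_ONLN
    fi
}

echo "CPU threads on this worker: $(cpu_threads)"
```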
Using Dremio and the VAST Dremio Connector to Access a VAST Database
Run Dremio from your browser.
Select the VAST Data Source. This will show a list of Databases on the VAST Cluster.
Select a database.
Create and run queries from the query bar. See https://docs.dremio.com for more details on using Dremio.