Installing and Configuring the VAST Connector for Dremio


This topic describes how to configure Dremio to work with a VAST Database.

This involves the following:

  • installing Dremio

  • installing the VAST Dremio connector, which mediates between Dremio and the VAST Database

Prerequisites

  • Java 11

  • Python 3.6

  • VAST Database
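The prerequisites can be checked from a shell before installing anything. The following is a minimal sketch, not part of the VAST or Dremio tooling: check_cmd is a hypothetical helper name, and the script only reports what is on the PATH and prints the installed versions so you can compare them with the list above.

```shell
#!/bin/sh
# Sketch: report whether the prerequisite runtimes are on the PATH.
# check_cmd is a hypothetical helper, not part of Dremio or the VAST connector.

# Print "<name>: found at <path>" or "<name>: NOT FOUND" for one command.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: NOT FOUND"
    fi
}

check_cmd java      # the prerequisites list Java 11
check_cmd python3   # the prerequisites list Python 3.6

# Print the installed versions (java writes its version banner to stderr).
if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | head -n 1
fi
if command -v python3 >/dev/null 2>&1; then
    python3 --version
fi
```

The script does not compare version numbers itself; it only prints them for a manual check.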

Installing Dremio

The VAST Dremio connector works with either the Dremio Community (open source) version or the commercial Enterprise version. The commercial version includes add-ons and security features that are not available in the Community version.

The instructions below explain how to install the Dremio Community version. See dremio.com for details regarding the commercial version.

The connector was tested with Dremio v24.0.0. The commands below use the 24.3.2 Community build, which can be downloaded from https://download.dremio.com/community-server/24.3.2-202401241821100032-d2d8a497/

  1. Run this command to create a Dremio user:

    useradd -c "Dremio User" -u 4853 dremio
  2. Run these commands to install the Dremio Community version and enable its service:

    dnf install ./dremio-community-24.3.2-202401241821100032_d2d8a497_1.noarch.rpm
    systemctl enable dremio
  3. Run these commands to set up the Dremio work directories:

    mkdir /opt/dremio/log /var/lib/dremio
    chown dremio:dremio /opt/dremio/log /var/lib/dremio
  4. Edit the file /etc/dremio/dremio-env and set the variable DREMIO_MAX_MEMORY_SIZE_MB to limit the memory that Dremio uses. The value is in megabytes; in this example, the limit is 200 GB.

    DREMIO_MAX_MEMORY_SIZE_MB=200000
  5. Run this command to start Dremio:

    systemctl start dremio
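Step 4 above can also be scripted instead of editing the file by hand. The following is a minimal sketch under stated assumptions: set_max_memory is a hypothetical helper name, and it assumes the variable is set with a plain NAME=value line at the start of a line in dremio-env. Commented-out lines are left untouched.

```shell
#!/bin/sh
# Sketch: set DREMIO_MAX_MEMORY_SIZE_MB in an env file, replacing any
# active assignment. set_max_memory is a hypothetical helper name.
set_max_memory() {
    local env_file="$1" size_mb="$2" tmp
    tmp="$(mktemp)"
    # Copy every line except an existing uncommented assignment.
    grep -v '^DREMIO_MAX_MEMORY_SIZE_MB=' "$env_file" > "$tmp" 2>/dev/null || true
    echo "DREMIO_MAX_MEMORY_SIZE_MB=${size_mb}" >> "$tmp"
    cat "$tmp" > "$env_file"   # rewrite in place so owner and mode are kept
    rm -f "$tmp"
}

# Apply the 200 GB example from step 4 (needs write access, typically root).
if [ -w /etc/dremio/dremio-env ]; then
    set_max_memory /etc/dremio/dremio-env 200000
fi
```

After step 5, `systemctl status dremio` shows whether the service started.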

Downloading and Installing the VAST Connector

Download the latest Dremio Connector for VAST Database from here: https://github.com/vast-data/vast-dremio-connector/releases. The connector is a jar file.

Copy the Dremio Connector jar file to ${DREMIO_HOME}/jars.
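The copy step might look like the sketch below. It rests on several assumptions: install_connector is a hypothetical helper name, the jar file name depends on the release you downloaded, and DREMIO_HOME is assumed to default to /opt/dremio for the RPM installation described above. Dremio most likely needs a restart to pick up a newly added jar, but that is also an assumption here.

```shell
#!/bin/sh
# Sketch: copy a downloaded connector jar into ${DREMIO_HOME}/jars.
# install_connector is a hypothetical helper; /opt/dremio is an assumed default.
install_connector() {
    local jar="$1" dremio_home="${2:-${DREMIO_HOME:-/opt/dremio}}"
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$dremio_home/jars" ]; then
        echo "no jars directory under $dremio_home" >&2
        return 1
    fi
    cp "$jar" "$dremio_home/jars/"
}

# Example call (the jar path is a placeholder for the file you downloaded):
#   install_connector ./path/to/connector.jar && systemctl restart dremio
```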

Configuring the Connector

After installing the VAST Dremio Connector, follow these steps to configure a Data Source in Dremio that uses the connector.

  1. Navigate to the Dremio UI (typically at http://dremio_coordinator_host:9047), and then to Data Sets.

  2. Click Add Source to add a Data Source.

  3. In the Add Data Source dialog, enter these details:

    • Name: The name of the data source (any text)

    • Endpoint: An IP address in the VIP Pool on the VAST Cluster

    • Region: vast

    • Access Key: The Access Key for the service accessing VAST

    • Secret Key: The Secret Key for the service accessing VAST

  4. It is recommended, but optional, to add these advanced settings:

    • Additional Endpoints: A list of IPs from the VAST VIP Pool on the Cluster. Additional endpoints improve performance.

    • Max Number of Splits: Typically the number of CPU threads available for each worker; recommended value 32

    • Max Number of Subsplits: 4

  5. Click Save.
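Values for the advanced settings above can be gathered from a shell. The sketch below is illustrative only: expand_vip_range is a hypothetical helper, the address range is a made-up example, and it assumes the VIP Pool is a contiguous range within one /24. It prints one address per line, because the separator the dialog expects is not stated here.

```shell
#!/bin/sh
# Sketch: list the addresses in a contiguous VIP range, one per line.
# expand_vip_range is a hypothetical helper; both addresses must share
# the same first three octets.
expand_vip_range() {
    local start="$1" end="$2" prefix i last
    prefix="${start%.*}"
    if [ "$prefix" != "${end%.*}" ]; then
        echo "start and end must be in the same /24" >&2
        return 1
    fi
    i="${start##*.}"
    last="${end##*.}"
    while [ "$i" -le "$last" ]; do
        echo "${prefix}.${i}"
        i=$((i + 1))
    done
}

# Made-up example range; substitute the VIP Pool range of your cluster.
expand_vip_range 172.16.0.1 172.16.0.4

# Logical CPU count of this host, as a starting point for Max Number of Splits.
if command -v nproc >/dev/null 2>&1; then
    echo "CPU threads: $(nproc)"
fi
```

Run the nproc check on a Dremio worker node, since the setting refers to the threads available to each worker.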

Using Dremio and the VAST Dremio Connector to Access a VAST Database

  1. Open the Dremio UI in your browser.

  2. Select the VAST Data Source. This will show a list of Databases on the VAST Cluster.

  3. Select a database.

  4. Create and run queries from the query bar. See https://docs.dremio.com for more details on using Dremio.
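Queries can also be submitted without the UI. The sketch below assumes the Dremio REST API accepts a JSON body of the form {"sql": "..."} at /api/v3/sql with an Authorization header of _dremio followed by a login token. Check both against the Dremio documentation for your version. The helper sql_payload, the variables DREMIO_URL and DREMIO_TOKEN, and the source, schema, and table names are all hypothetical.

```shell
#!/bin/sh
# Sketch: build the JSON body for a SQL request and optionally submit it.
# sql_payload, DREMIO_URL and DREMIO_TOKEN are hypothetical names.

# Wrap one SQL statement in {"sql": "..."}.
# Escapes backslashes and double quotes; assumes a single-line statement.
sql_payload() {
    local escaped
    escaped="$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
    printf '{"sql": "%s"}\n' "$escaped"
}

# Placeholder names; replace with a path shown under your VAST Data Source.
QUERY='SELECT * FROM "my_vast_source"."my_schema"."my_table" LIMIT 10'

if [ -n "${DREMIO_URL:-}" ] && [ -n "${DREMIO_TOKEN:-}" ]; then
    # Submit the query (endpoint and header format are assumptions, see above).
    curl -s -X POST "${DREMIO_URL}/api/v3/sql" \
        -H "Authorization: _dremio${DREMIO_TOKEN}" \
        -H "Content-Type: application/json" \
        -d "$(sql_payload "$QUERY")"
else
    # No server configured: just print the request body.
    sql_payload "$QUERY"
fi
```

With neither variable set, the script only prints the request body, which makes it safe to try before pointing it at a real coordinator.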