VAST on Cloud for GCP


VAST on Cloud clusters in GCP are created using Terraform. Using Terraform files supplied by VAST, resources are created in a GCP project and then the cluster is installed on them.

Prerequisites

  • A GCP account with a GCP project into which the VAST on Cloud cluster will be deployed.

  • Terraform v1.5.4 or later

  • Google gcloud SDK

  • An SSH key pair

Configuring GCP for VoC

Configure the following in the GCP project from the GCP Console.

Enable Google Cloud APIs

Enable these Google APIs:

  • Compute API, in Compute/VM Instances

  • Cloud Functions API, in Cloud Functions

  • Cloud Build API, in Cloud Build

  • Secret Manager API, in Security/Secret Manager

Optionally, also enable these APIs. They are used in many use cases and are recommended:

  • Artifact Registry API

  • Compute Engine API

  • Network Management API

  • Service Networking API

  • Network Security API

  • Cloud Monitoring API

  • Cloud Logging API
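If you prefer the command line, the APIs can also be enabled with gcloud. This is a minimal sketch that enables the required APIs listed above; the optional API service names can be appended in the same way, and <project_id> is a placeholder for your own project ID:

gcloud services enable \
    compute.googleapis.com \
    cloudfunctions.googleapis.com \
    cloudbuild.googleapis.com \
    secretmanager.googleapis.com \
    --project=<project_id>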

Set up Private Networking

On the VPC Networks page, configure Private services access to your VPC by allocating IP ranges for services and creating private connections to services.
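The equivalent can be done with gcloud, as a sketch; the range name voc-psa-range and the /16 prefix length are illustrative assumptions, and <vpc_network> and <project_id> are placeholders for your own values:

# Allocate an IP range for private services access (name and prefix length are examples)
gcloud compute addresses create voc-psa-range \
    --global \
    --purpose=VPC_PEERING \
    --prefix-length=16 \
    --network=<vpc_network> \
    --project=<project_id>

# Create the private connection to the service using the allocated range
gcloud services vpc-peerings connect \
    --service=servicenetworking.googleapis.com \
    --ranges=voc-psa-range \
    --network=<vpc_network> \
    --project=<project_id>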

Set up NAT per Region

In Network Services/Cloud NAT, create a Cloud NAT Gateway with these details, for each region that has a VoC cluster:

  • Region: the region containing the cluster

  • Router: Create New Router

  • Network Service Tier: Premium
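The same gateway can also be created with gcloud, for example; the router and gateway names are illustrative, and <region>, <vpc_network>, and <project_id> are placeholders:

# Create a new Cloud Router in the region containing the cluster
gcloud compute routers create voc-nat-router \
    --network=<vpc_network> \
    --region=<region> \
    --project=<project_id>

# Create the Cloud NAT gateway on that router
gcloud compute routers nats create voc-nat-gateway \
    --router=voc-nat-router \
    --region=<region> \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges \
    --project=<project_id>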

Configure Firewall Rules

In Network Security/Firewall, configure the firewall policies as follows.

  • Create a firewall rule for cluster traffic with these details:

    • Direction: ingress

    • Action on match: allow

    • Target tags: add voc-internal (this tag is used by the VoC cluster)

    • Source tags: add voc-internal

    • Protocols and ports:

      TCP

      22, 80, 111, 389, 443, 445, 636, 2049, 3128, 3268, 3269, 4000, 4001, 4100, 4101, 4200, 4201, 4420, 4520, 5000, 5200, 5201, 5551, 6000, 6001, 6126, 7000, 7001, 7100, 7101, 8000, 9090, 9092, 9093, 20048, 20106, 20107, 20108, 49001, 49002

      UDP

      4005, 4105, 4205, 5205-5240, 6005, 7005, 7105

      ICMP

      enabled

    Leave all other rule settings as the defaults.
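As a sketch, the same rule can be created with gcloud; the rule name voc-internal-traffic is illustrative, and <vpc_network> and <project_id> are placeholders:

# Ingress rule allowing internal cluster traffic between instances tagged voc-internal
gcloud compute firewall-rules create voc-internal-traffic \
    --network=<vpc_network> \
    --direction=INGRESS \
    --source-tags=voc-internal \
    --target-tags=voc-internal \
    --allow=tcp:22,tcp:80,tcp:111,tcp:389,tcp:443,tcp:445,tcp:636,tcp:2049,tcp:3128,tcp:3268,tcp:3269,tcp:4000,tcp:4001,tcp:4100,tcp:4101,tcp:4200,tcp:4201,tcp:4420,tcp:4520,tcp:5000,tcp:5200,tcp:5201,tcp:5551,tcp:6000,tcp:6001,tcp:6126,tcp:7000,tcp:7001,tcp:7100,tcp:7101,tcp:8000,tcp:9090,tcp:9092,tcp:9093,tcp:20048,tcp:20106,tcp:20107,tcp:20108,tcp:49001,tcp:49002,udp:4005,udp:4105,udp:4205,udp:5205-5240,udp:6005,udp:7005,udp:7105,icmp \
    --project=<project_id>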

Quotas and Policy Constraints

Your GCP project should have these quotas:

  • Quota for Local SSD. This is set per region and must allow for at least 9 TB of local SSD per DNode. The default quota is sufficient for only three DNodes.

    Note

    Increasing the default quota to a sufficient level for a VAST cluster deployment can take some time, and is not done instantly using the GCP Console UI.

  • Quota for n2 and n2d CPUs. These are set per region. A CNode requires 32 n2d CPUs, so a 100-CNode cluster requires 3200 n2d CPUs. A DNode requires 16 n2 CPUs, so a 100-DNode cluster requires 1600 n2 CPUs.

  • Quota for Static Routes per VPC Network. This is set per VPC network and should allow for any IPs you use to connect to the cluster.

  • Quota for Static Routes per Peering Group. This is set per peering group (for all peered projects). A peering group contains VPCs within a common project that can be connected; these connections require static routes. The quota should allow for all the connection routes between VPCs in the peering group.

To avoid problems when creating the cluster in GCP, ensure that organization-level policy constraints do not conflict with cluster requirements, for example, policies that restrict the creation of n2 VMs.
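To review a region's current quotas and usage before deploying, you can inspect the region with gcloud; the output includes the regional quota metrics (such as local SSD capacity and n2/n2d CPUs) with their limits and current usage:

# Print the region's quota metrics, limits, and current usage
gcloud compute regions describe <region> --project=<project_id>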

Installing the gcloud SDK

Download the Google Cloud CLI from https://cloud.google.com/sdk/docs/install to your client machine, and follow the instructions to install it.
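After installing the SDK, authenticate it and point it at the target project. A typical sequence looks like this; the application-default credentials are what Terraform's Google provider commonly uses:

# Authenticate the gcloud CLI
gcloud auth login

# Create application default credentials (commonly used by Terraform's Google provider)
gcloud auth application-default login

# Select the project that will host the VoC cluster
gcloud config set project <project_id>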

Installing Terraform

Download the latest version of Terraform from the Terraform downloads page and follow its installation instructions.
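After installation, confirm that the version meets the v1.5.4-or-later prerequisite:

terraform -version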

Checking the GCP Configuration

Optionally, download and run the GCP configuration checker described below. The checker uses Terraform to deploy a test cluster and tests connectivity to and within the cluster.

  1. Download and extract the file https://vast-on-cloud.s3.eu-west-1.amazonaws.com/public_assets/voc-gcp-checker-1.0.6.zip into a folder.

  2. Create and configure a Terraform variables file with these details:

    ## required variables
    zone       = "<zone>"
    subnetwork = "<subnetwork>"
    project_id = "<project_id>"
    ## variables with defaults, when not provided, these defaults will be used
    # onprem_ip    = ""
    # network_tags = []
    

    where zone, subnetwork, and project_id are from the GCP Project.

    Optionally, set these variables (a filled-in sketch of the variables file appears after this procedure):

    Variable      Description
    network_tags  The network tags used with the cluster.
    onprem_ip     An on-prem IP address. If supplied, connectivity from the cluster to this address is tested.

  3. Run this command:

    ~/voc/gcp/gcp-checker > terraform init
  4. Run this command:

    ~/voc/gcp/gcp-checker > terraform apply

    The Terraform deployment starts and takes about 5 minutes to complete. When done, output similar to this is shown:

    Apply complete! Resources: 5 added, 0 changed, 0 destroyed.
    
    Outputs:
    
    full_name = "checker-e516c43050c7ff71d929d4ab02c900d8"
    private_ip = "10.120.7.14"
    project_id = "voc-test"
    serial_console = "https://console.cloud.google.com/compute/instancesDetail/zones/us-central1-b/instances/checker-e516c43050c7ff71d929d4ab02c900d8-instance-8z5h/console?port=4&project=voc-test"
    subnetwork = "default"
    zone = "us-central1-b"
  5. Click the serial_console link in the output to see details of the results. The results appear in the Google Cloud console, in the serial port output, similar to this:

    Waiting for up to 10 minutes for base connectivity before starting connectivity checks:
    
    connectivity to meta-data service (http://metadata.google.internal/computeMetadata/v1/instance/zone)                 ......
    
    connectivity to meta-data service (http://metadata.google.internal/computeMetadata/v1/instance/zone)                 ok      
    
    connectivity to Compute Engine API (compute.googleapis.com)                                                          ......
    
    connectivity to Compute Engine API (compute.googleapis.com)                                                          ok  
    
    connectivity to internal cluster instance (ping) (10.120.7.5)                                                        ......
    
    connectivity to internal cluster instance (ping) (10.120.7.5)                                                        ok      
    
    connectivity to internal cluster instance (port 22) (10.120.7.5)                                                     ......
    
    connectivity to internal cluster instance (port 22) (10.120.7.5)                                                     ok      
    
    connectivity to internal cluster instance (port 80) (10.120.7.5)                                                     ......
    
    connectivity to internal cluster instance (port 80) (10.120.7.5)                                                     ok      
    ...
    Connectivity check completed successfully
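For reference, a filled-in checker variables file might look like the following sketch; the values simply mirror the example output above (zone us-central1-b, subnetwork default, project voc-test), and the commented-out optional values are purely illustrative:

## required variables
zone       = "us-central1-b"
subnetwork = "default"
project_id = "voc-test"
## variables with defaults
# onprem_ip    = "192.0.2.10"        # example on-prem address to test connectivity to
# network_tags = ["voc-internal"]    # example network tag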

Configuring the VAST GCP Cluster in Terraform

You will receive a zip file from VAST containing the Terraform files that are used to create the VAST cluster.

Extract the contents of the file into a folder. If you are creating more than one cluster, extract the contents of each zip file into a separate folder.

Create a file voc.auto.tfvars (use the file example.tfvars, from the zip file, as an example) with this content:

## required variables
name           = "<name>"
zone           = "<zone>"
subnetwork     = "<subnetwork>"
project_id     = "<project_id>"
nodes_count    = 8 # Minimum 8 - Maximum 14
ssh_public_key = "<public ssh key content>"
customer_name  = "<customer_name>"
## variables with defaults, when not provided, these defaults will be used
# network_tags                = []
# labels                      = {}
# ignore_nfs_permissions      = false
# enable_similarity           = false
# enable_callhome             = false

where name is the name of the cluster, and zone, subnetwork, and project_id are from the GCP project.

In ssh_public_key, enter your SSH public key, similar to this:

ssh-rsa AAAAB*****************************************************************************zrysUvp0EkI5YWm+lmiQP4edfNKo0G3udxeAGdrD9dZSlzqmtdvo7CTW7Qhh3v2T3t3tvTEQnnNx8CkQOFDuU3Eje7NiN1XTp5C14dcGfaZeJnRnwaKhyD710ZHTeRyzjoXhNoAOuPT4qrT4MZ4jUUjr8Fx3ozByPlLco7qHsXurZHdTFWmdR52PlWRZA++9uyjz/sPYO+HcHxtIT5yS7DVfQz8zFQTyL0Rk82v6S0HNlG31mMlA2cPt0/r2vpY0U2zfijHdZEGxu+XeR/xRmVhPFImxN0rl
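If you do not yet have a key pair (see the prerequisites), one can be generated with ssh-keygen; the file path and comment below are illustrative:

# Generate an RSA key pair for the cluster
ssh-keygen -t rsa -b 4096 -f ~/.ssh/voc_gcp -C "voc-admin"

# The contents of the .pub file are what goes into ssh_public_key
cat ~/.ssh/voc_gcp.pub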

Optionally, set these variables, or use the default settings:

Variable                Description
network_tags            (Optional) Add GCP network tags to the cluster.
labels                  (Optional) Add GCP labels to the cluster.
ignore_nfs_permissions  If enabled, the VoC cluster will ignore file permissions and allow NFS and S3 clients to access data without checking permissions. Default: disabled.
enable_similarity       Enable this setting to enable similarity-based data reduction on the cluster (see Similarity-Based Data Reduction). Default: disabled.
enable_callhome         Enable this setting to enable the sending of callhome logs on the cluster (see Configuring Call Home Settings). Default: disabled.
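For reference, a completed voc.auto.tfvars might look like the following sketch; every value shown is purely illustrative and should be replaced with your own details:

## required variables (example values only)
name           = "voc-test"
zone           = "us-central1-a"
subnetwork     = "default"
project_id     = "voc-test"
nodes_count    = 8 # Minimum 8 - Maximum 14
ssh_public_key = "ssh-rsa AAAAB... voc-admin"
customer_name  = "example-customer"
## optional variables (defaults used if omitted)
# network_tags           = ["voc-internal"]
# labels                 = { environment = "test" }
# ignore_nfs_permissions = false
# enable_similarity      = false
# enable_callhome        = false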

Configuring VIPs for the Cluster

Allocate VIPs for the cluster on GCP in the VAST Web UI, in the VIP Pools section of the Network Access page. The VIPs added here should be routed to GCP, and must not be in any GCP subnet or belong to any CIDR assigned to any of the GCP subnets.

Creating the VAST GCP Cluster using Terraform

  1. Run the following command in the folder into which the zip file was extracted. This initializes Terraform in that folder.

    ~/voc/gcp/gcp-new-deploy > terraform init

    When complete, the following is shown:

    Terraform has been successfully initialized!
  2. Run the following command to deploy the VAST on Cloud cluster.

    ~/voc/gcp/gcp-new-deploy > terraform apply

    When the Terraform action is complete, something similar to the following is shown:

    Apply complete! Resources: 2 added, 0 changed, 9 destroyed.
    
    Outputs:
    
    availability_zone = "us-central1-a"
    cloud_logging = "https://console.cloud.google.com/logs/viewer?project=voc-test&advancedFilter=resource.type%3D%22gce_instance%22%0Alabels.cluster_id%3Ddc66387e-c8bb-5bd8-97db-469392f6bdba"
    cluster_mgmt = "https://10.120.9.243:443"
    instance_group_manager_id = "test-manager"
    instance_ids = tolist([
      "1315258176142165158",
      "5926902481847174310",
      "4477224983631873190",
    ])
    instance_type = "n2-highmem-48"
    private_ips = tolist([
      "10.120.7.254",
      "10.120.8.0",
      "10.120.8.2",
    ])
    protocol_vips = tolist([
      "10.120.9.231",
      "10.120.9.232",
      "10.120.9.233",
      "10.120.9.234",
      "10.120.9.235",
      "10.120.9.236",
    ])
    replication_vips = tolist([
      "10.120.9.237",
      "10.120.9.238",
      "10.120.9.239",
      "10.120.9.240",
      "10.120.9.241",
      "10.120.9.242",
    ])
    serial_consoles = [
      "https://console.cloud.google.com/compute/instancesDetail/zones/us-central1-b/instances/test-instance-6873/console?port=1&project=voc-test",
      "https://console.cloud.google.com/compute/instancesDetail/zones/us-central1-b/instances/test-instance-955w/console?port=1&project=voc-test",
      "https://console.cloud.google.com/compute/instancesDetail/zones/us-central1-b/instances/test-instance-trl6/console?port=1&project=voc-test",
    ]
    vms_ip = "10.120.9.243"
    vms_monitor = "http://10.120.7.254:5551"
    voc_cluster_id = "dc66387e-c8bb-5bd8-97db-469392f6bdba"
    vpc_network = "voc-test"

    At this point, the cluster installation starts on the resources created by Terraform in the GCP project.

    Monitor progress of the installation at the vms_monitor URL. The installation can take several minutes.
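    The values printed by terraform apply can be redisplayed at any time by running terraform output from the same folder, which is useful for retrieving the cluster_mgmt and vms_monitor URLs later:

    ~/voc/gcp/gcp-new-deploy > terraform output
    ~/voc/gcp/gcp-new-deploy > terraform output cluster_mgmt

    The second form prints a single output value.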

Accessing the Cluster in GCP

Access the VAST on Cloud cluster's VMS Web UI from a browser, using the cluster_mgmt URL (from the terraform apply step above).

The cluster is built in private subnets, so you will need to access the VMS from within your own address space with a route to the GCP subnets.

Destroying or Changing the Cluster Configuration

To destroy the cluster, run this command:

terraform destroy

If you want to change the settings in the voc.auto.tfvars file, you must first destroy and then rebuild the cluster using Terraform. Do not run terraform apply after making changes to the file - this will corrupt the cluster.

Warning

Data in the cluster is not preserved when the cluster is destroyed using Terraform (including when destroying it in order to rebuild it).

Run the following commands to rebuild the cluster after making changes to the file.

terraform destroy
terraform apply

Best Practices for Terraform Files

The Terraform files contained in the zip file hold important information used by Terraform to create your cluster. Take care that these files are not deleted or corrupted.

You must use a separate folder and set of Terraform files for each cluster you provision on GCP using Terraform.

It is also recommended to back these files up.
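For example, a simple way to keep a dated backup of a cluster's folder, including the terraform.tfstate that Terraform generates there, is to archive it; the paths below are illustrative:

# Archive the Terraform folder for one cluster, including its state file
tar czf voc-cluster-terraform-$(date +%Y%m%d).tar.gz ~/voc/gcp/gcp-new-deploy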