Using Envoy with VAST S3


This guide outlines the foundational aspects of using Envoy as a proxy/load balancer in front of VAST Data S3 storage.
Envoy adds advanced load-balancing capabilities, distributing client requests more efficiently across multiple CNodes by targeting multiple VIPs.
Flow Diagram:

The diagram illustrates an Envoy proxy architecture: connections from customer networks pass through Envoy's listener and network filter chain before being load-balanced across the VAST cluster's CNodes (compute nodes), backed by DNodes (data nodes). Each node in the cluster handles or routes requests in a scalable manner to ensure efficient service delivery.


Getting Started

Installing Envoy on a Debian client:

wget -O- https://apt.envoyproxy.io/signing.key | sudo gpg --dearmor -o /etc/apt/keyrings/envoy-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/envoy-keyring.gpg] https://apt.envoyproxy.io bookworm main" | sudo tee /etc/apt/sources.list.d/envoy.list
sudo apt-get update
sudo apt-get install envoy
envoy --version

Running Envoy on Docker:

docker pull envoyproxy/envoy:dev-68e64b4f7f834183cb7ba0ff96f0536dd50e7e92
docker run --rm envoyproxy/envoy:dev-68e64b4f7f834183cb7ba0ff96f0536dd50e7e92 --version

For more installation options, see: Getting Started — envoy

Basic HTTP Envoy Configuration for VAST Data S3

Example Envoy configuration YAML file that defines a basic reverse proxy setup:

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
                - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: vast_data_cluster
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          common_http_protocol_options:
            idle_timeout: 60s
            headers_with_underscores_action: REJECT_REQUEST

  clusters:
  - name: vast_data_cluster
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    health_checks:
      - timeout: 3s
        interval: 10s
        unhealthy_threshold: 3
        healthy_threshold: 2
        http_health_check:
          path: "/"
    load_assignment:
      cluster_name: vast_data_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 172.27.77.1
                port_value: 80
        - endpoint:
            address:
              socket_address:
                address: 172.27.77.2
                port_value: 80

admin:
  access_log_path: "/tmp/admin_access.log"
  address:
    socket_address:
      address: 127.0.0.1
      port_value: 9901
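The cluster above uses ROUND_ROBIN, which suits CNodes of roughly equal capacity. Envoy also supports other policies such as LEAST_REQUEST; as a sketch (field names per the v3 Cluster API, swapped into the cluster definition above):

```yaml
  clusters:
  - name: vast_data_cluster
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: LEAST_REQUEST        # prefer the endpoint with fewer active requests
    least_request_lb_config:
      choice_count: 2               # sample 2 endpoints per pick ("power of two choices")
```

LEAST_REQUEST can help when individual S3 requests vary widely in cost (e.g. mixed small and multi-gigabyte objects), since round robin can otherwise pile long-running requests onto one CNode.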

Basic HTTPS Envoy Configuration for VAST Data S3

Example HTTPS Envoy configuration YAML file that defines a basic reverse proxy setup:

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_https
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
                - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: vast_data_cluster
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          common_http_protocol_options:
            idle_timeout: 60s
            headers_with_underscores_action: REJECT_REQUEST
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          common_tls_context:
            tls_certificates:
              - certificate_chain:
                  filename: "/tmp/certificate.crt"
                private_key:
                  filename: "/tmp/private.key"
  clusters:
  - name: vast_data_cluster
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        sni: s3-hostname.com
    health_checks:
      - timeout: 3s
        interval: 10s
        unhealthy_threshold: 3
        healthy_threshold: 2
        http_health_check:
          path: "/"
    load_assignment:
      cluster_name: vast_data_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 172.27.77.1
                port_value: 443
        - endpoint:
            address:
              socket_address:
                address: 172.27.77.2
                port_value: 443

admin:
  access_log_path: "/tmp/admin_access.log"
  address:
    socket_address:
      address: 127.0.0.1
      port_value: 9901

ℹ️ Note

The HTTPS configuration above was tested with basic self-signed certificates, generated on the Envoy client:

openssl genpkey -algorithm RSA -out private.key
openssl req -new -key private.key -out csr.pem
openssl req -x509 -days 365 -key private.key -in csr.pem -out certificate.crt
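As a sketch, the same key/cert pair can be produced non-interactively in a single command (the CN value below is illustrative; substitute your own hostname):

```shell
# Generate an RSA key and a self-signed cert in one step (no CSR, no prompts).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout private.key -out certificate.crt -subj "/CN=s3-hostname.com"

# Inspect the subject and validity window of the resulting certificate.
openssl x509 -in certificate.crt -noout -subject -dates
```

Place the two files at the paths referenced in the Envoy config (/tmp/certificate.crt and /tmp/private.key).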

HTTPS S3 Enable on VAST Side

To test and validate the HTTPS flow, HTTP/HTTPS communication can be enabled or disabled on the VAST S3 side: Settings → S3

The screenshot shows the S3 configuration settings in a cloud-based management console, where options such as enabling HTTPS and Bucket replication can be toggled on or off to secure data storage and ensure data persistence across clusters.


Breakdown of Envoy Configuration YAML

  • Listeners

    • Purpose: Defines where Envoy listens for incoming traffic

    • Key Points:

    • Address 0.0.0.0:10000: Listens on all network interfaces at port 10000.

    • Filter Chains: Processes incoming requests using specified filters.

  • HTTP Connection Manager Filter

    • Purpose: Manages all HTTP connections.

    • Key Points:

    • Stat Prefix: Labels statistics for monitoring.

    • Route Configuration: Defines how requests are routed.

    • HTTP Filters: Processes HTTP requests and responses.

  • Routing Configuration

    • Purpose: Directs incoming requests to appropriate backends

    • Key Points:

    • Virtual Hosts: Handles domains (* matches all domains).

    • Routes: Routes all paths (/) to the vast_data_cluster.

  • Clusters

    • Purpose: Defines groups of similar backend services.

    • Key Points:

    • Type STRICT_DNS: Resolves the backend address using DNS.

    • Load Balancing Policy ROUND_ROBIN: Distributes requests evenly across all servers.

    • Endpoints: actual VIP addresses of the CNodes (172.27.77.1, 172.27.77.2 on port 80).

      • Add as many VIPs as needed

  • Health Checks

    Envoy uses health checks to determine if the instances in the cluster are healthy and able to handle requests.

    • timeout: time Envoy waits for a health-check response; if no response arrives within this window, the check is marked as failed.

    • interval: time between consecutive health-check requests.

    • unhealthy_threshold: number of consecutive failed health checks required before a host is marked unhealthy. An unhealthy host receives no traffic until it passes health checks again.

    • healthy_threshold: number of consecutive successful health checks required for a previously unhealthy host to be considered healthy again, at which point it can start receiving traffic.

    • http_health_check: the type of health check; here, an HTTP check. The path: "/" setting tells Envoy to probe the root path of each host, expecting a successful HTTP response (typically 200 OK) for the check to pass.

  • Admin

    • Purpose: Provides a management interface for Envoy.

    • Key Points:

      • Access Log Path: Where to store log files.

      • Admin Address 127.0.0.1:9901: Local interface for accessing the admin panel.

Setting the Endpoint

  • Endpoints in Clusters: Define the specific addresses of the backend servers to which traffic is routed, effectively setting the proxy's outgoing endpoint.

This configuration ensures Envoy listens on port 10000 for all incoming HTTP requests, routes them according to predefined rules, and forwards them to the specified backend servers, while providing robust monitoring and management capabilities through its admin interface.
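By default, the HTTP health check treats only a 200 response as healthy. If the root path of the S3 endpoint returns a different code (for example, 403 for an unauthenticated request), the acceptable range can be widened; a sketch using the v3 expected_statuses field:

```yaml
    health_checks:
      - timeout: 3s
        interval: 10s
        unhealthy_threshold: 3
        healthy_threshold: 2
        http_health_check:
          path: "/"
          expected_statuses:      # accept 200-403 inclusive
            - start: 200
              end: 404            # the end of the range is exclusive
```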

Envoy Configuration Change Modes

In a fully static Envoy configuration, all elements, including listeners, filter chains, and clusters, are defined upfront in the configuration file.
This setup relies on DNS for any dynamic host discovery. When updates are necessary, such as adding endpoints or changing configuration, Envoy's built-in hot restart mechanism can reload the configuration without dropping connections. This allows fairly complex deployments to be managed with static configurations, with graceful hot restarts maintaining service continuity during updates.

For more Info: Hot restart — envoy 1.34.0

Envoy also supports dynamic configuration through its xDS APIs. The Endpoint Discovery Service (EDS) is particularly useful for dynamically managing endpoints without needing to restart Envoy. This allows endpoints to be added or removed in real time as the management server sends updated information directly to Envoy, which seamlessly integrates these changes into its operational routing decisions.

For more Info: xDS configuration API overview — envoy 1.34.0
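As a minimal sketch of EDS without a management server, Envoy can also watch a local file for endpoint updates (a filesystem xDS subscription). Field names follow recent v3 APIs, and the /etc/envoy/eds.yaml path is illustrative:

```yaml
  # Main config: the cluster pulls its endpoints via EDS from a watched file.
  clusters:
  - name: vast_data_cluster
    connect_timeout: 0.25s
    type: EDS
    eds_cluster_config:
      eds_config:
        resource_api_version: V3
        path_config_source:
          path: /etc/envoy/eds.yaml

  # /etc/envoy/eds.yaml: edit this file to add or remove VIPs at runtime.
  # resources:
  # - "@type": type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment
  #   cluster_name: vast_data_cluster
  #   endpoints:
  #   - lb_endpoints:
  #     - endpoint:
  #         address:
  #           socket_address: { address: 172.27.77.1, port_value: 80 }
```

Note that Envoy picks up changes when the file is atomically replaced (e.g. written to a temp file and moved into place with mv), not necessarily on in-place edits.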

Initialization

Starting Envoy

Initialize Envoy with your configuration file using envoy -c config.yaml. Ensure the service starts without errors and can connect to VAST Data S3.

  •  envoy -c /home/vastdata/envoy_http_lb.yaml --log-level info
  • Note: If running Envoy with Docker:

    • docker run --name envoy \
      --network host \
      --rm \
      -v /path/envoy_http_lb.yaml:/tmp/envoy_http_lb.yaml \
      envoyproxy/envoy:dev-68e64b4f7f834183cb7ba0ff96f0536dd50e7e92 \
      envoy -c /tmp/envoy_http_lb.yaml --log-level info
  • Note: If running HTTPS Envoy with Docker:

    • docker run --name envoy \
      --network host \
      --rm \
      -v /path/envoy/:/tmp/ \
      --user $(id -u):$(id -g) \
      envoyproxy/envoy:dev-68e64b4f7f834183cb7ba0ff96f0536dd50e7e92 \
      envoy -c /tmp/envoy_https_lb.yaml --log-level info
      • For HTTPS, the mounted directory (/tmp/ in the container) is expected to contain the following files; note the permissions, as the private key must be readable by the container user:

        -rw-rw-r-- 1 admin_access.log
        -rw-rw-r-- 1 certificate.crt
        -rw-rw-r-- 1 csr.pem
        -rw-r--r-- 1 envoy_https_lb.yaml
        -rw------- 1 private.key

Validating Traffic Distribution on the VAST Side with elbencho

ℹ️ Note

This section assumes you already have:

  • A VAST cluster and Virtual IP Pool set up, with VIPs distributed across different CNodes

  • An S3 bucket with sufficient permissions

  • The S3 bucket owner's access/secret keys

elbencho Quick Setup

wget https://github.com/breuner/elbencho/releases/download/v3.0-25/elbencho-static-x86_64.tar.gz
tar -xf elbencho-static-x86_64.tar.gz
./elbencho --help

For more info: https://github.com/breuner/elbencho/releases/tag/v3.0-25

  • Create a minimal access/secret config file:
    cat vast_cfg_elbencho.txt

    s3key=xxx
    s3secret=xxx

Client Configuration

  • In this example, the client has 64 CPU cores and a 50 Gb/s NIC connected to the VAST cluster.

    $ ethtool ens1f0np0 | grep Speed
    netlink error: Operation not permitted
    	Speed: 50000Mb/s
    
    $ lscpu | grep CPU
    CPU op-mode(s):      32-bit, 64-bit
    CPU(s):              64

Running a Load on VAST S3 with elbencho

This elbencho benchmark writes files to VAST S3, generating traffic.
Note: these values are tuned to squeeze the maximum rate out of the given client.

./elbencho -c vast_cfg_elbencho.txt --s3endpoints http://127.0.0.1:10000 -s 100g -b 256m -t 64 -w -d mybucket/myobject{1..64}

For an HTTPS Envoy endpoint:

./elbencho -c vast_cfg_elbencho.txt --s3endpoints https://127.0.0.1:10000 -s 100g -b 256m -t 64 -d -w envoy/elbencho{1..64}

Breakdown of elbencho flags used

  • --s3endpoints http://your-s3-endpoint: specifies the S3 endpoint.
    When using Envoy, set this to the configured load-balanced endpoint.

  • -c: S3 creds config file.

  • -s 100g: size of each object.

  • -b 256m: size of each multipart upload part.

  • -t 64: up to 64 concurrent operations.

  • -w: instructs elbencho to perform write operations.

  • -d: create directories (buckets, in S3 mode) as needed.

  • mybucket/myobject{1..64}: indicates the target bucket and a pattern for naming objects.

Showcase Linear Scaling of a Client Line Rate to VAST S3

Using iftop we can see the client is reaching close to the network's capacity of 50 Gbps.  
sudo iftop -i ens1f0np0

    rates (2s / 10s / 40s averages):
      TX:    47.2Gb  46.1Gb  46.8Gb
      RX:    24.7Mb  24.5Mb  22.9Mb
      TOTAL: 47.2Gb  47.2Gb  46.7Gb


This is how the data flow looks on VAST after enabling load balancing and directing traffic at the defined endpoint:

The image illustrates a data flow visualization tool, showing real-time network activity with connections between users and hosts, sorted by bandwidth usage in MB/s. It includes filters such as user, protocol, VIP, CNode, and View to isolate specific flows.

Envoy Stats and Key Stats Monitoring

  • Accessing Stats: Use curl http://127.0.0.1:9901/stats to fetch metrics from Envoy.

  • Key Stats:

    • upstream_rq_total: Total requests sent to the backend.

    • downstream_cx_total: Total connections handled by Envoy.

    • upstream_cx_destroy: Upstream connections closed.

  • Memory Monitoring: Track memory usage with curl http://127.0.0.1:9901/memory.
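As a self-contained sketch of pulling out the key counters, the snippet below filters a canned stats sample; in practice, replace the here-doc with `curl -s http://127.0.0.1:9901/stats` (the counter values shown are illustrative):

```shell
# Keep only the key request/connection counters from an Envoy stats dump.
cat <<'EOF' | grep -E 'upstream_rq_total|downstream_cx_total|upstream_cx_destroy'
cluster.vast_data_cluster.upstream_rq_total: 10240
cluster.vast_data_cluster.upstream_cx_destroy: 3
http.ingress_http.downstream_cx_total: 512
server.uptime: 3600
EOF
```

Only the three matching counters are printed; unrelated stats such as server.uptime are filtered out.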

Combining Access to Multiple VAST Clusters

Generally, an Envoy instance is configured to forward traffic to a single VAST cluster; however, Envoy can also route traffic to multiple VAST clusters based on the bucket name being accessed.

S3 URLs, when using path-style buckets (the norm for on-premises S3 deployments), follow the format: http://<s3-endpoint-hostname>/<bucketname>/<objectkey>

Both <objectkey> and <bucketname> may be absent depending on the request. For example, a "ListBuckets" request includes neither, whilst a bucket-level operation such as GetBucketTagging includes only the bucket name.

This fixed URL format allows a reverse proxy server, such as Envoy, to route requests to different upstream servers based on the bucket name using the first element of the URL path in a routing rule.

Envoy supports using path_separated_prefix in an HTTP route rule, which matches whole path components at the start of the path - corresponding to the bucket name. A separate path_separated_prefix-based rule is required for each bucket, routing to an Envoy backend "cluster" corresponding to the VAST cluster the bucket is hosted on. These rules need to be updated as new buckets are added (or moved between clusters). Documentation on such rules can be found at https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route_components.proto#envoy-v3-api-field-config-route-v3-routematch-path-separated-prefix. The following route rules route traffic for buckets "bucket1" and "bucket2" to one VAST cluster, whilst "bucket3" is routed to a different cluster.

              routes:
              - match: { path_separated_prefix: "/bucket1" }
                route: { cluster: vast-cluster-1 }
              - match: { path_separated_prefix: "/bucket2" }
                route: { cluster: vast-cluster-1 }
              - match: { path_separated_prefix: "/bucket3" }
                route: { cluster: vast-cluster-2 }

Users would send all requests to Envoy, which would then forward them to the correct cluster based on the rules above.  From the end-user perspective, it would appear that all buckets were hosted on a single endpoint.

NOTE:

  • S3 access/secret keys must be synced across all clusters, as the user is not aware of which cluster they are accessing.

  • This functionality requires the application to be configured to use "path-style" URLs for access, which is the default for most clients when accessing a non-AWS S3 server. Virtual-hosted-style URLs will not work. (More details on the distinction between these two can be found at Path and Virtual-Hosted Style S3 URLs.)

  • Some requests, such as "ListBuckets", do not include a bucket name and thus will not be routed by these rules. An additional default route can send these requests to a single cluster; the results would then be relevant only for that cluster (e.g., ListBuckets would return only the buckets on that one cluster).

  • The Envoy configuration (on all systems running it) must be updated when a new bucket is created. Envoy supports dynamically loading new configurations, so this can be done without end-user impact. New buckets must be created via the VAST GUI/API; "CreateBucket" requests sent through Envoy would not work, as the bucket would be unknown and Envoy would not know where to forward the request.
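The default route mentioned in the notes above can be sketched by appending a plain prefix match after the per-bucket rules (order matters: Envoy uses the first matching route):

```yaml
              routes:
              - match: { path_separated_prefix: "/bucket1" }
                route: { cluster: vast-cluster-1 }
              - match: { path_separated_prefix: "/bucket3" }
                route: { cluster: vast-cluster-2 }
              # Catch-all for bucket-less requests such as ListBuckets:
              - match: { prefix: "/" }
                route: { cluster: vast-cluster-1 }
```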