Path and Virtual-Hosted Style S3 URLs


The S3 protocol is based on HTTP and supports two general formats for building an HTTP URL from a bucket name and an object key: path-style and virtual-hosted-style URLs.

Path-Style URLs

Path-style is the original format used by AWS S3, and whilst AWS has deprecated its use (due to the globally distributed nature of AWS), it’s still generally the preferred and simplest option for on-premises S3 object stores.

Path-style URLs take the following format:

https://<s3-endpoint-hostname>/<bucket-name>/<object-key>

For example, accessing the object MyObj1 in the bucket bucket1 at the endpoint s3.example.com would result in the following URL:

https://s3.example.com/bucket1/MyObj1

With path-style URLs, the endpoint name (that is, the hostname of the S3 service) is generally not used by the S3 server to determine what is being accessed, so a different hostname that points to the same IP address, or even the IP address itself, will also work. For example, if s3.example.com mapped to the IP address 172.31.1.1, then the above object could also be accessed as:

https://172.31.1.1/bucket1/MyObj1

Virtual-Hosted-Style URLs

Virtual-hosted-style URLs are the preferred style for distributed clouds like AWS, mainly because they contain the bucket name in the hostname, which allows the provider to use DNS to direct the client to the correct region or node hosting that bucket. This is generally not required for on-premises S3 systems, but virtual-hosted-style URLs are still normally supported by such systems, including VAST.

Virtual-hosted-style URLs take the following format:

https://<bucket-name>.<s3-endpoint-hostname>/<object-key>

For example, accessing the object MyObj1 in the bucket bucket1 at the endpoint s3.example.com would result in the following URL:

https://bucket1.s3.example.com/MyObj1
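The two formats can be compared side by side with a minimal Python sketch. The function names are hypothetical and exist only for illustration; the endpoint, bucket, and key are the example values used in this article.

```python
from urllib.parse import quote


def path_style_url(endpoint: str, bucket: str, key: str) -> str:
    """Path-style: the bucket is the first segment of the URL path."""
    # quote() percent-encodes unsafe characters but leaves "/" in the key intact.
    return f"https://{endpoint}/{bucket}/{quote(key)}"


def virtual_hosted_style_url(endpoint: str, bucket: str, key: str) -> str:
    """Virtual-hosted-style: the bucket is prepended to the endpoint hostname."""
    return f"https://{bucket}.{endpoint}/{quote(key)}"


print(path_style_url("s3.example.com", "bucket1", "MyObj1"))
print(virtual_hosted_style_url("s3.example.com", "bucket1", "MyObj1"))
```

Only the position of the bucket name changes between the two functions; the object key is always carried in the path.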

As the bucket name is included in the hostname, it’s not possible to use an IP address as the endpoint 'hostname' for virtual-hosted-style requests. Putting the bucket name in front of the IP address (bucket1.172.31.1.1) is not valid, and as the bucket name appears nowhere else in the request, simply removing it from the hostname is not an option either. As a result, virtual-hosted-style URLs can only be used when the S3 server is accessed by hostname, not by IP address.
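A client-side helper could apply this rule by falling back to path-style whenever the endpoint is an IP address. The sketch below is illustrative only: the function names are hypothetical, and it assumes the endpoint is passed as a bare host with no port or scheme.

```python
import ipaddress


def is_ip_literal(endpoint: str) -> bool:
    """Return True if the endpoint is an IPv4 or IPv6 address rather than a hostname."""
    try:
        ipaddress.ip_address(endpoint)
        return True
    except ValueError:
        return False


def choose_url_style(endpoint: str, prefer_virtual_hosted: bool = True) -> str:
    """Pick a URL style; an IP endpoint can only ever use path-style."""
    if is_ip_literal(endpoint):
        return "path"
    return "virtual-hosted" if prefer_virtual_hosted else "path"


print(choose_url_style("s3.example.com"))
print(choose_url_style("172.31.1.1"))
```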

Determining which style is in use

Unfortunately, when AWS implemented virtual-hosted-style URLs, they did not include any way for the client to tell the S3 server which format is being used. Instead, the server has to ‘guess’ the format from the request itself. For example, consider the following request:

https://prod.s3.example.com/data/obj1

This could be a path-style request for the object obj1 in the bucket data on the S3 server at prod.s3.example.com. Or it could be a virtual-hosted-style request for the object data/obj1 in the bucket prod on the S3 server at s3.example.com.

As the client sends no indication of which URL scheme is in use, the server has no option other than to make an educated guess as to which of the two the client is actually requesting.
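The ambiguity can be made concrete by parsing the same URL both ways. This is a small illustrative sketch; the function names are hypothetical and it does not represent how any particular S3 server parses requests.

```python
from urllib.parse import urlsplit


def as_path_style(url: str) -> dict:
    """Interpret the URL as path-style: first path segment is the bucket."""
    parts = urlsplit(url)
    bucket, _, key = parts.path.lstrip("/").partition("/")
    return {"endpoint": parts.hostname, "bucket": bucket, "key": key}


def as_virtual_hosted_style(url: str) -> dict:
    """Interpret the URL as virtual-hosted-style: first hostname label is the bucket."""
    parts = urlsplit(url)
    bucket, _, endpoint = parts.hostname.partition(".")
    return {"endpoint": endpoint, "bucket": bucket, "key": parts.path.lstrip("/")}


url = "https://prod.s3.example.com/data/obj1"
print(as_path_style(url))
print(as_virtual_hosted_style(url))
```

Both readings are syntactically valid, which is why the server needs extra information (such as its own configured domain names) to choose between them.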

VAST URL Style Determination

VAST uses the Domain Name configured within the VAST DNS Server to determine whether a request is using a virtual-hosted-style URL. The VAST DNS Server does not actually have to be used to service DNS for the request (see below for more details on what is needed for DNS), but it must at least be configured for virtual-hosted-style requests to work.

Both the “DNS Service Suffix” (configured as a part of the DNS Server configuration) and all “Virtual IP Pool Domain names” (configured in the VIP Pool configuration) are used for this check.

For example, if the cluster is configured with the domain name s3.example.com, and there are two VIP pools configured with their DNS names set to vip1.s3.example.com and vip2.s3.example.com, then requests received for any of the following hostnames will be considered virtual-hosted-style requests for the bucket ‘bucket1’:

  • bucket1.s3.example.com

  • bucket1.vip1.s3.example.com

  • bucket1.vip2.s3.example.com

Other variations on these names will NOT be treated as virtual-hosted-style requests. Examples include bucket1.example.com, bucket1.vip9.s3.example.com (where vip9 is not a configured VIP Pool DNS Domain name), and s3.bucket1.example.com. Such requests will be parsed as path-style URLs instead.
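The rule described above can be sketched in Python as a suffix match against the configured domain names. This is an illustration of the documented behaviour, not VAST's actual implementation. The function name is hypothetical, and two details are assumptions inferred from the examples: the bucket must be a single hostname label, and a hostname that exactly equals a configured domain is treated as path-style.

```python
from typing import List, Tuple


def classify_request(host: str, path: str, configured_domains: List[str]) -> Tuple[str, str, str]:
    """Return (style, bucket, key) for a request, based on configured domain names."""
    host = host.lower().rstrip(".")
    domains = [d.lower() for d in configured_domains]

    # Assumption: an exact match on a configured domain is the endpoint itself.
    if host not in domains:
        for domain in domains:
            suffix = "." + domain
            if host.endswith(suffix):
                bucket = host[: -len(suffix)]
                # Assumption: exactly one label may precede the configured domain.
                if bucket and "." not in bucket:
                    return ("virtual-hosted", bucket, path.lstrip("/"))

    # Anything else is parsed as path-style: /<bucket>/<key>
    bucket, _, key = path.lstrip("/").partition("/")
    return ("path", bucket, key)


domains = ["s3.example.com", "vip1.s3.example.com", "vip2.s3.example.com"]
for h in ("bucket1.s3.example.com", "bucket1.vip1.s3.example.com",
          "bucket1.vip9.s3.example.com", "bucket1.example.com"):
    print(h, classify_request(h, "/data/obj1", domains))
```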

DNS Servers and TLS Certificates

Using virtual-hosted-style URLs requires additional configuration of the DNS server and the TLS certificates. Both must match not only the endpoint hostname, but also that hostname with any bucket name prepended.

The built-in VAST DNS Server handles this by responding to requests for any hostname within the Domain name configured for a VIP pool. For example, if a VIP pool has a DNS Domain name configured as vip1.s3.example.com, then in addition to responding to that exact hostname, it will also respond to <any-hostname>.vip1.s3.example.com, allowing the prepended hostname to match any bucket name.

If you’re using a DNS server other than the VAST DNS Server, it will also need to be configured to respond to wildcard requests for the domain being used, so that virtual-hosted-style URLs can resolve.

TLS certificates will also need to match the hostname the client is using to make the request, including the bucket name. It’s not possible to include every bucket name in the certificate, so a wildcard certificate must be used. In a wildcard certificate, the ‘wildcard’ only matches a single hostname segment (that is, the wildcard itself will not match a ‘dot’ in the hostname), so using the example above of a VIP Pool with a DNS name of vip1.s3.example.com, the TLS certificate would need to include BOTH of the following:

  • vip1.s3.example.com   (for matching path-style requests)

  • *.vip1.s3.example.com (for matching virtual-hosted-style requests)

If a second VIP pool were added (for example, vip2.s3.example.com), then a further two hostnames would need to be added to the certificate: one with a preceding wildcard, and one without.
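The certificate requirement can be sketched as follows. The function names are hypothetical, and the matcher is a deliberately simplified model of single-label wildcard matching. Real TLS libraries apply additional rules, so treat this as an illustration rather than a validator.

```python
from typing import List


def required_certificate_names(vip_pool_domains: List[str]) -> List[str]:
    """List the names a certificate would need: each domain plus its wildcard form."""
    names = []
    for domain in vip_pool_domains:
        names.append(domain)         # matches path-style requests
        names.append("*." + domain)  # matches virtual-hosted-style requests
    return names


def name_matches(pattern: str, hostname: str) -> bool:
    """Simplified match: a leading '*' stands for exactly one hostname label."""
    p = pattern.lower().split(".")
    h = hostname.lower().split(".")
    if len(p) != len(h):
        return False  # the wildcard never spans a dot
    if p[0] == "*":
        return bool(h[0]) and p[1:] == h[1:]
    return p == h


print(required_certificate_names(["vip1.s3.example.com", "vip2.s3.example.com"]))
print(name_matches("*.vip1.s3.example.com", "bucket1.vip1.s3.example.com"))
print(name_matches("*.s3.example.com", "bucket1.vip1.s3.example.com"))
```

The last call shows why a single wildcard on the parent domain is not enough: it cannot cover a bucket name prepended to a VIP pool name, because that hostname has two extra labels.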