Putting Azure CDN, Google Cloud Storage, Rackspace Cloud Files and Amazon CloudFront to the CORS test.
CORS (Cross-Origin Resource Sharing) is a standard that allows website owners to specify, in the HTTP headers of their files, which domains (technically called origins) are allowed to access certain files on their server.
The server doesn’t enforce the lockout; the user’s web browser does. In fact, it is the browser that requests CORS validation in the first place. Browsers do this for security reasons, to prevent hacks like cross-site injection from adding scripts or other malicious files to an otherwise safe page.
Web fonts, for example, are a case where most browsers today require the server that hosts the font to respond to a CORS request.
By CORS request, I mean that when the browser downloads the font, it adds the current page’s scheme and domain name as the value of the Origin request header. So 'Origin: https://samkelleher.com' would be an example.
The server then checks whether the requested origin is in its list of allowed origins. If it is, the server responds with 'Access-Control-Allow-Origin: https://samkelleher.com' in the header. If the origin is not in the list, it simply does not return the Access-Control-Allow-Origin header.
Then, when the browser interprets the response, if there is an Access-Control-Allow-Origin header and its value matches the origin the browser sent, the browser accepts the file and will use it. So in the case of a font, the font will be displayed. If the header is missing or has a different origin value, the file is rejected and the browser will behave as if it couldn’t find the file.
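The exchange described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how any particular server or browser implements it: the function names are made up for this example, the staging origin is an assumed second entry in the allow-list, and wildcard origins are deliberately left out because the test in this article does not use them.

```python
from typing import Dict, Optional

# Hypothetical allow-list: the production origin used in this article plus
# an assumed staging origin.
ALLOWED_ORIGINS = {
    "https://samkelleher.com",
    "https://staging.samkelleher.com",
}


def cors_response_headers(request_origin: Optional[str]) -> Dict[str, str]:
    """Server side: echo the Origin back only if it is on the allow-list."""
    headers: Dict[str, str] = {}
    if request_origin in ALLOWED_ORIGINS:
        headers["Access-Control-Allow-Origin"] = request_origin
    return headers


def browser_accepts(page_origin: str, response_headers: Dict[str, str]) -> bool:
    """Browser side: accept the file only if the header matches the page origin."""
    return response_headers.get("Access-Control-Allow-Origin") == page_origin


if __name__ == "__main__":
    for origin in ("https://samkelleher.com", "https://evil.example"):
        headers = cors_response_headers(origin)
        print(origin, headers, browser_accepts(origin, headers))
```

A request from an origin that is not on the list gets no Access-Control-Allow-Origin header back, so the browser-side check fails and the font is never used.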
This process is simple to set up and easy to configure on most server environments that you manage directly, but it becomes radically more complicated once you use a Content Delivery Network (CDN). First off, it’s easy to confuse the origin in a CORS request with the origin a CDN server uses. To a CDN server, the origin is the domain/server that holds the primary, updatable copy of all the files. It’s where the CDN edge nodes connect to download the files they want to serve to their local user base.
CDNs operate by a cache key, usually derived from the HTTP request. In most cases, the cache key is simply the request URL, meaning that any request to the same URL will hit the same cache key, and the same response can be returned to the user without having to query the origin.
For CORS, the server sets a Vary: Origin response header, telling the CDN that different responses can be expected for different Origin values. So the CDN should look at both the request URL AND the request’s Origin value. This is because the response headers will be different depending on which Origin the browser sent. So now, if the URL and the request Origin both match the cache key, the same response will be quickly served from the CDN’s local cache rather than hitting the origin server.
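To make the cache key idea concrete, here is a minimal sketch of how a cache might fold the headers named in Vary into its key. The `cache_key` function and the example URL are assumptions for illustration; real CDNs typically look up the URL first and then pick among stored variants, but the effect is the same.

```python
from typing import Dict, Tuple


def cache_key(url: str, request_headers: Dict[str, str], vary: str) -> Tuple:
    """Build a cache key from the URL plus every request header named in Vary."""
    # Header names are case-insensitive, so normalise them before comparing.
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((n, lowered.get(n, "")) for n in names)


if __name__ == "__main__":
    url = "https://cdn.example/font.woff"  # hypothetical CDN URL
    prod = {"Origin": "https://samkelleher.com"}
    staging = {"Origin": "https://staging.samkelleher.com"}

    # With Vary: Origin the two sites get separate cache entries.
    print(cache_key(url, prod, "Origin") != cache_key(url, staging, "Origin"))
    # Without Vary both requests collapse onto the same entry.
    print(cache_key(url, prod, "") == cache_key(url, staging, ""))
```

When the Vary value is empty, both requests share one entry, which is exactly the situation in which a cached Access-Control-Allow-Origin value can be handed to the wrong site.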
Finding a CDN provider that (correctly) supports the CORS protocol seems to be harder than it should be. I tested the most popular CDN providers I know about: Azure CDN, Google Cloud Storage, Rackspace Cloud Files, and Amazon CloudFront. I tested a scenario where a wildcard origin was not used (a wildcard kind of defeats the purpose of using CORS in the first place), and where there is more than one valid origin (to reflect that most websites have a production and a staging version which will want to access the same resources, like images and fonts).
Whilst all these providers claim to support CORS setups matching my desired configuration, I found significant flaws in most of them. All of these CDN providers operate on the concept of a storage container/bucket which functions as the origin, plus a CDN service that connects to this container and distributes copies of the files to the edge servers when they are requested.
The test was made of two parts: download a test font from the container directly, and then download that same file from the nearest CDN edge node, once before it was cached and once after it was cached (where possible).
The requirements for a successful test are that the CDN should always output the Vary: Origin header in the response (for this part I had the help of CORS expert Monsur Hossain pointing me in the right direction), and change its Access-Control-Allow-Origin value accordingly (including it where the origin was allowed, and omitting it when the origin was missing from the request or was not an allowed domain). Simple, right? If the Vary: Origin header is missing, then onward caches can end up storing an Access-Control-Allow-Origin value for the wrong origin, causing a browser to incorrectly reject a file.
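The pass criteria above can be written down as a small checker. This is a sketch of the rules as stated here, not the tooling actually used for the tests: the `check_cors_response` name and the failure messages are invented for this example, and the response headers are assumed to have been captured with whatever HTTP client you prefer.

```python
from typing import Dict, List, Optional, Set


def check_cors_response(
    request_origin: Optional[str],
    response_headers: Dict[str, str],
    allowed_origins: Set[str],
) -> List[str]:
    """Return a list of failures; an empty list means the response passes."""
    headers = {k.lower(): v for k, v in response_headers.items()}
    failures: List[str] = []

    # Rule 1: Vary: Origin must be present on every response.
    vary = [v.strip().lower() for v in headers.get("vary", "").split(",")]
    if "origin" not in vary:
        failures.append("missing Vary: Origin")

    # Rule 2: the allow header is echoed for allowed origins and omitted otherwise.
    acao = headers.get("access-control-allow-origin")
    if request_origin in allowed_origins:
        if acao != request_origin:
            failures.append("allowed origin was not echoed back")
    elif acao is not None:
        failures.append("allow header sent for a missing or disallowed origin")
    return failures


if __name__ == "__main__":
    allowed = {"https://samkelleher.com"}
    good = {"Vary": "Origin", "Access-Control-Allow-Origin": "https://samkelleher.com"}
    print(check_cors_response("https://samkelleher.com", good, allowed))
    print(check_cors_response(None, {}, allowed))
```

Running the same check against the container directly and against the edge node, before and after caching, covers each part of the test.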
Azure CDN
The file was stored in the blobstore service also provided by Azure. On the first request, the CDN edge node strips the Vary: Origin header from the response and then caches whatever Access-Control-Allow-Origin value was present (if any). This breaks CORS, and thus the Azure CDN cannot operate correctly when hosting web resources like fonts.
Rackspace Cloud Files
The CDN is already integrated into the ‘files’ storage service. Configuring the container to support the CORS headers is made more complicated by the fact that what little documentation they provide seems to be outdated for their API, not to mention quite complex (I recommend their system architect take a look at the London API meetup group). As it turns out, CORS is not supported on the CDN as advertised, only when used in conjunction with temporary dynamic URLs generated via one of their online services. The CDN itself doesn’t seem to have any CORS support at all.
Amazon CloudFront
The file is stored in an Amazon S3 bucket, to which the CloudFront ‘distribution’ is linked when it’s set up. I’m not sure if the problems I had were caused by another service outage at Amazon, but the responses were frankly bizarre. At first, the CDN would simply respond with a 307 Temporary Redirect to the internal blob URL (even if it was private). The S3 bucket, however, did respond to CORS normally. After several hours of waiting, the CDN did finally start responding normally (maybe this is some provisioning wait time, even though the management console indicated the CDN had been ‘deployed’). When testing, the Access-Control-Allow-Origin and Vary: Origin headers were present and correct when an allowed origin was requested, but the CDN did not return a Vary: Origin header when responding to requests with an unknown or not-allowed origin. So the results were mixed: while the CDN did respond with the correct access control headers, it stops any onward proxy from functioning correctly and will poison any cache used. I was able to test this using a hotel’s free WiFi hotspot, which has all its requests routed through a Squid proxy. The first request I made without an origin (and thus got no Vary header); the second request I made with an allowed origin, but the proxy returned the headers cached from the first request.
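The proxy behaviour described above can be reproduced with a toy model. Everything here is an assumption made for illustration: `ToyProxy` is a heavily simplified stand-in for a shared cache such as Squid, `cloudfront_like` mimics only the header pattern observed in this test, and `well_behaved` is what a passing CDN would send.

```python
from typing import Callable, Dict, List, Optional, Tuple

Headers = Dict[str, str]
ALLOWED_ORIGINS = {"https://samkelleher.com"}


def cloudfront_like(origin: Optional[str]) -> Headers:
    """Observed pattern: Vary: Origin is only sent for allowed origins."""
    if origin in ALLOWED_ORIGINS:
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    return {}


def well_behaved(origin: Optional[str]) -> Headers:
    """Passing pattern: Vary: Origin is sent on every response."""
    headers = {"Vary": "Origin"}
    if origin in ALLOWED_ORIGINS:
        headers["Access-Control-Allow-Origin"] = origin
    return headers


def varies_on_origin(headers: Headers) -> bool:
    return "origin" in [v.strip().lower() for v in headers.get("Vary", "").split(",")]


class ToyProxy:
    """Caches responses per URL, honouring Vary: Origin when it is present."""

    def __init__(self, upstream: Callable[[Optional[str]], Headers]) -> None:
        self.upstream = upstream
        self.store: Dict[str, List[Tuple[bool, Optional[str], Headers]]] = {}

    def get(self, url: str, origin: Optional[str]) -> Headers:
        for varies, cached_origin, headers in self.store.get(url, []):
            # An entry without Vary: Origin is reused for every Origin value.
            if not varies or cached_origin == origin:
                return headers
        headers = self.upstream(origin)
        self.store.setdefault(url, []).append((varies_on_origin(headers), origin, headers))
        return headers


if __name__ == "__main__":
    for upstream in (cloudfront_like, well_behaved):
        proxy = ToyProxy(upstream)
        proxy.get("/font.woff", None)  # first request, no Origin header
        second = proxy.get("/font.woff", "https://samkelleher.com")
        print(upstream.__name__, "Access-Control-Allow-Origin" in second)
```

With the first upstream, the second request is served the cached header-less response, so the browser would reject the font. With the second upstream, the proxy knows to fetch a separate variant.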
Google Cloud Storage
Google Cloud Storage doubles up as a regular blobstore, and any publicly accessible object is automatically enhanced with CDN coverage and thus will be accessed via Google’s network of edge servers. This is the simplest setup. Enabling CORS on the container is a simple matter of uploading a JSON configuration file using the gsutil program from the command line. With minimal effort, after just a few minutes reading the required documentation, I enabled CORS, uploaded a public file and started testing. I was so happy to see that the Vary: Origin header was returned in every response, and that the Access-Control-Allow-Origin header was returned for matching Origin requests. All in all, Google Cloud Storage performed perfectly and passed the test.
So a sad state of affairs overall: only one of the CDN providers I tested has a valid implementation of CORS, and unsurprisingly it’s Google. I’m most concerned about Amazon CloudFront, seeing as it’s the most established offering. Azure CDN is the up-and-coming player from an awakening giant, and the Azure CDN Program Manager has been most helpful in dealing with my query, so I expect they will improve their CDN product in the short term to enhance their CORS support.
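For reference, a bucket CORS configuration for gsutil is a JSON list of rules. The sketch below generates one from Python; it is not the exact file used in this test. The staging origin, the allowed methods, the exposed response header and the max age are all assumed values, and the bucket name in the comment is a placeholder.

```python
import json

# Assumed values for illustration: one production origin (from this article)
# and one hypothetical staging origin, read-only methods, one-hour preflight cache.
cors_config = [
    {
        "origin": [
            "https://samkelleher.com",
            "https://staging.samkelleher.com",
        ],
        "method": ["GET", "HEAD"],
        "responseHeader": ["Content-Type"],
        "maxAgeSeconds": 3600,
    }
]

cors_json = json.dumps(cors_config, indent=2)

if __name__ == "__main__":
    # Save the output as cors.json, then apply it with:
    #   gsutil cors set cors.json gs://YOUR_BUCKET
    print(cors_json)
```

Listing the origins explicitly, rather than using a wildcard, matches the multi-origin scenario tested in this article.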
I think I’ll move what cloud storage assets I have over to Google, away from Amazon or Azure for now. Sometimes it’s best to have a mix of hosting parts anyway for reliability - but then it does take away from the convenience of dealing with one provider whose servers often interact seamlessly with their other offerings.
Which CDN providers do you use, and have you checked them for CORS support? Leave your comments below.