CDNs are good. You get to put your web things all over the world and then have them served to your global audience from a location close to them. For example, because this blog is served through CloudFlare and about two thirds of the requests to my site come direct from their cache, you're probably downloading all the images on this page from whichever point in the map below is closest to you:
But what's even better than CDNs when it comes to cost and performance is public CDNs. For example, on Have I been pwned (HIBP) I serve various CSS and JavaScript files that are public libraries. It's stuff like jQuery and Bootstrap and my files are in no way unique to me, they're just the garden variety libraries adorning millions of sites the world over. One of the lessons I learned very early on in the life of HIBP was that it didn't make sense to serve these libraries from my site. Not only were my visitors getting zero CDN benefits due it all being hosted in the one West-US location, but I was paying for the bandwidth. In that link I lamented how I'd paid for 16GB of bandwidth I didn't need to just because my site was serving the public libraries. So I changed it to this:
Now I've got jQuery being served from CloudFlare's CDN which means firstly, I don't pay anything for it and secondly, users get the content served from somewhere locally and thirdly, if they've seen that script before on someone else's site then it may well be already cached anyway so they don't even have to download it. It's win-win-win, right?
And then the penny drops: "Hang on - so CloudFlare are now in control of the script that runs on my site?!". Yes, they are and they could change that script file to rewrite page contents, siphon off cookies or do any number of other really nasty things. Of course if they did it would be a massive story that would impact a huge number of websites (that is unless they tailored their "attack" to only focus on referrers from HIBP) and it would more likely be an issue if they themselves were to be compromised. This is not a hypothetical situation as precisely that happened to the Bootstrap CDN in 2013. It's a hell of a way to distribute and exploit when changing a single file can simultaneously push it out to a huge number of websites!
Delegating responsibility to others which in effect gives them control to run script on your website is understandably concerning, particularly for certain classes of web asset. But there's a way that lets you have your public CDN cake and eat it too, and that's subresource integrity, here forth referred to as SRI.
Let's imagine this: you decide you don't trust public CDNs so you decide to serve jQuery from your own site. Now let's assume that you have a jQuery file you actually trust because that in itself is pretty essential to this discussion. What we're going to do with SRI is tell the browser to load that version of the jQuery file from a public CDN but then, to make sure it's exactly the same as the one you know and trust. We're going to do this by generating a hash of the file then adding that as an attribute of the script tag that embeds it so that it looks like this:
The only difference whatsoever with this script tag is the "integrity" attribute (oh - and the crossorigin one but I'll come back to that later). What this now means is that when the browser loads jQuery from CloudFlare it's going to calculate the hash of the file (a bas64-encoded SHA384 hash in this case per the prefix in the attribute), compare it to the one described on the script tag and then only run it if it checks out. This works because you trust the script tag and it's attributes even if you don't trust the file served via the CDN. If the hash doesn't check out, the browser gets very upset:
I caused the browser to generate this error by using the hash for jQuery 2.2.3 then changing the file being loaded to request version 2.2.2. Different files, different hashes and this illustrates the point perfectly: the exact file I had told the web app to load courtesy of the integrity attribute wasn't loaded so the browser rejected it.
As for the "crossorigin" attribute, let's remove it and see what happens:
The CORS settings attribute is required for SRI when the request isn't from the same origin which of course it isn't when you're loading the JS off a public CDN. Setting the attribute to anonymous ensures no creds or identity info is sent with the request (i.e. basic auth or an auth cookie), which might seem redundant when the request is going off to another origin anyway, but in this case Chrome likes to be explicit.
In terms of actually generating the hash, you can choose to use a SHA256, SHA384 or SHA512 hash. You can trawl through the somewhat lengthy W3C recommendation for more details, but so long as you have one of these you should meet the browser requirement. One easy way of generating the hash is to do exactly what I did two screen caps further up - cause a failure. Chrome then kindly details the actual hash, this time of the SHA256 variety. Now you can take this value but you want to be confident you're using the hash of a file that you completely trust.
Another approach is to use openssl and point it at a local file, perhaps the one you've already been serving directly from your site that you completely trust. That would then look like this:
openssl dgst -sha384 -binary jquery.min.js | openssl base64 -A
Obviously that's passing in a jQuery file and while we're there, just one little gotcha: that's the minified version of jQuery and the hash will be different to the source file with all its whitespace glory so remember that when setting this up.
And finally, there's the SRI Hash Generator website:
I like this because you're simply giving it a URL such as the original file being served off CloudFlare's CDN then letting it work everything out for you. In this case, it responds like this:
But SRI doesn't just protect scripts from unexpected modification, you can use exactly the same integrity attribute (and crossorigin attribute) on style tags so CSS gets the benefit too. Here's a great demo of that: if you want to embed the Bootstrap CSS direct off their CDN, they'll even add the integrity attribute for you (probably a good move given their 2013 incident):
And jQeury does the same thing:
This is great, but remember this as it's pretty essential: you need to regenerate the hashes every single time you upgrade the library. If you're using jQuery or Bootstrap or any other resources you're protecting with SRI, you'll have to go back and use one of the approaches above to generate a new hash and whack it on your site or else stuff will break.
Now, a things things that often come up when I talk about SRI in my workshops: Firstly, this has nothing to do with SSL / TLS / HTTPS. That'll give you protection at the transport layer (stuff won't get modified while it's being sent) but it won't protect against the scenario where the source files on the CDN are maliciously modified. Secondly, if you're worried about someone modifying the hash that's added to your script tag on your server, you've got bigger problems! SRI isn't there to protect what's running on your back end. Thirdly, the performance impact of calculating the hash is so close to zero it doesn't even rate a mention (the only tangible figure I've ever heard is "about one millisecond").
So this is all good, right? I mean what's not to love?! Only really one little thing:
SRI simply doesn't feature on Microsoft's or Apple's browsers, not even the new ones or the upcoming ones. All good in Chrome, all good in Firefox but that's the extent of it (Opera doesn't count because their market share literally rounds to 0%). As much as I would like to see all these browsers get on board, there's a few things to remember here:
Firstly, SRI is presently in the "Recommendation" phase as you can see towards the top of that last image. The good news is that this phase is the last in the W3C's recommendation track which means the following:
A Recommendation reflects consensus within W3C, as represented by the Director's approval. W3C considers that the ideas or technology specified by a Recommendation are appropriate for widespread deployment and promote W3C's mission.
The bad news is that whilst it's "merely" a recommendation and not a ratified part of the spec, some browser manufacturers will play the waiting game. Obviously that's where Microsoft is at the moment and they've flagged SRI as "Under Consideration".
Secondly, SRI is still supported by more than half the browsers in global circulation at present, at lease according to the "Can I use" image above. Now you'll get different stats from different sources and of course your own audience is going to differ again, but the point is that the Chrome and Firefox markets alone are significant and all those users can get the benefit of SRI right now. As for those that can't, that brings me to my next point:
Thirdly and finally, if the browser doesn't support SRI... nothing happens. I mean nothing different happens in that the browser still loads the script, it just simply ignores the integrity tag. If the file is modified at the source and the hash doesn't match up then Internet Explorer, Edge and Safari do exactly the same thing as they'd do without any SRI at all - they run the script. The point is that you don't lose anything by implementing SRI when your audience loads a site using it with an unsupported browser, but you may gain something significant when a browser that does support it loads your site.
If you're unsure about support for your browser, go and give the W3C's test page a run. If you want to see it in action first hand, I've just pushed it up to HIBP for jQuery and Bootstrap so check it out there. If you'd like to see Microsoft and Apple add support, get it into your site and show there's actually a demand for SRI by virtue of sites implementing it!