Targeted Cache Control
Content delivery networks (CDNs) have been around and have evolved over a long time (in internet years). They all speak HTTP and you can safely rely on them to work with just about anything else that speaks HTTP. This is the beauty of standards -- HTTP in this case. What you cannot count on is there being a standard way to configure them. In some cases, this is understandable: they all have different advanced features, after all. But when it comes to the basics, such as controlling how content is cached, it just makes sense to have one common way to do it. Standards tend to mean simpler documentation and fewer oddities (read: bugs). The winners are the users who end up saving time and gaining agility.
Targeted Cache Control is the result of CDNs working together to come up with a clear and simple tool that makes it easy for origins to address the CDN layer. The result is even more general and powerful: a technique to target practically any layer in the delivery of HTTP content. The first field defined by the draft spec is CDN-Cache-Control. A valid first reaction might be "What? Another mechanism for controlling cache?" But there are good reasons this is being added to the available tools. Read on to learn more.
What we have today
The standard header for controlling cache is aptly called Cache-Control. It provides a method for origins to indicate caching rules downstream. The original intent was between origin and browser. For example, a Cache-Control header in a response for a company's logo.jpg might look like:

Cache-Control: max-age=86400
That header would indicate to the browser to treat the object as valid for one day and use the cached response for any requests for that object in that time. If the world were as simple as browsers and origins, this would be sufficient. But what about things like reverse proxies? They are in common use on the internet and often out of the direct control of both the browser user and the origin. It is not uncommon for an origin to want to give proxies (or shared caches) a different directive. And thus s-maxage was born:
Cache-Control: max-age=86400, s-maxage=3600
Now the directive is telling the browser to cache for a day, but shared caches/proxies to cache for one hour. This is an improvement. It might make sense at this point to say that since a CDN is conceptually a shared cache, then s-maxage is the ticket -- and the draft spec and this blog post are redundant. But it is important to remember that a CDN is not just a shared cache. As stated above, these shared caches are generally out of the control of the origin, whereas a CDN is technically a "Surrogate" working on behalf of the origin. It is common to want to control the CDN differently than the browser as well as any shared caches in the way.
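To make the max-age/s-maxage split concrete, here is a minimal sketch (not a full HTTP caching parser) of how a cache might decide which freshness lifetime applies to it. The function names are illustrative, not from any spec; the directive names are the real ones.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: value-or-None} dict."""
    directives = {}
    for part in value.split(","):
        name, _, val = part.strip().partition("=")
        directives[name.lower()] = val if val else None
    return directives

def freshness_lifetime(value, shared_cache=False):
    """Return the freshness lifetime in seconds, or None if unspecified.

    A shared cache (proxy) prefers s-maxage; a private (browser) cache
    only looks at max-age.
    """
    d = parse_cache_control(value)
    if shared_cache and "s-maxage" in d:
        return int(d["s-maxage"])
    if "max-age" in d:
        return int(d["max-age"])
    return None

header = "max-age=86400, s-maxage=3600"
print(freshness_lifetime(header))                     # browser: 86400 (one day)
print(freshness_lifetime(header, shared_cache=True))  # shared cache: 3600 (one hour)
```

The same header value yields different lifetimes depending on who is reading it, which is exactly the behavior described above.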
SIDEBAR: Surrogates: Did I say "Surrogate"? Those who are wise in the arcana of the web might know of the Surrogate-Control header. This would be ideal except for the fact that the specification was a bit open to interpretation and the web is littered with incompatible implementations. Trying to herd those cats would be a Sisyphean task and could doom the effort before it started.
One last look into the mind of standards development: "Let's just use cdn-maxage." This has two problems. First, there is a whole host of other cache directives that would need to be "CDNified" as well: no-cache, no-store, private, etc. That namespace gets messy pretty quickly. Second, you would need to clearly define what should happen if there is an error in a directive not targeted at the CDN. For example, how do you handle:
Cache-Control: max-age=86400xx, cdn-maxage=3600
With the invalid characters in the max-age directive, should the whole header be ignored, or should the CDN be able to parse and use the section directed at it? This can certainly be specified, but it gets complicated in a hurry.
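The two options the question raises can be sketched in code. This is purely illustrative, assuming the hypothetical cdn-maxage directive that the draft deliberately avoids; the function names are invented for the example.

```python
NUMERIC = ("max-age", "s-maxage", "cdn-maxage")  # directives that take a number

def parse_strict(value):
    """Option 1: reject the whole header if any directive value is malformed."""
    out = {}
    for part in value.split(","):
        name, _, val = part.strip().partition("=")
        if val and name.lower() in NUMERIC and not val.isdigit():
            return None  # one bad directive poisons everything
        out[name.lower()] = val or None
    return out

def parse_lenient(value):
    """Option 2: keep well-formed directives, silently drop malformed ones."""
    out = {}
    for part in value.split(","):
        name, _, val = part.strip().partition("=")
        if val and name.lower() in NUMERIC and not val.isdigit():
            continue  # skip just the broken directive
        out[name.lower()] = val or None
    return out

header = "max-age=86400xx, cdn-maxage=3600"
print(parse_strict(header))   # None: the CDN loses its directive too
print(parse_lenient(header))  # {'cdn-maxage': '3600'}
```

Either behavior can be specified, but every cache directive would then need its own error-handling rules inside a single shared header. Separate targeted headers sidestep the problem entirely: a malformed Cache-Control cannot damage a well-formed CDN-Cache-Control.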
And thus was born Targeted Cache Control.
Targeted Cache Control
Targeted Cache Control (TCC) defines a method for sending cache directives to specific actors in the flow of servicing an HTTP response. It uses the same syntax as Cache-Control, which makes documenting and implementing the new headers trivial. The generic form is:
<TARGET>-Cache-Control: <targeted cache control directives>
<TARGET> is a prefix that defines who should be listening to the directives. The TCC draft specification defines one prefix to start: CDN. The CDN-Cache-Control header is a TCC header that (from the spec):
"... allows origin servers to control the behaviour of CDN caches interposed between them and clients, separately from other caches that might handle the response."
This is a good thing seeing how this whole post was leading to precisely that. Now an origin can send:
Cache-Control: must-revalidate
CDN-Cache-Control: max-age=1209600
This allows finer control over how content is cached. In this case, the CDN layer can cache the content for two weeks, while the browser (and anything else listening) needs to revalidate the content every time. It might be load or performance prohibitive to do such a thing if those requests were going to the origin, but this is what CDNs were built for!
And the fun does not stop there. The TARGET can be essentially anything, and a common use case will be to target a specific CDN. For the above example:
Cache-Control: must-revalidate
Akamai-Cache-Control: max-age=1209600
Now the origin is giving specific controls to Akamai. If there is only one possible CDN this might seem redundant, but the following illustrates where this becomes useful:
Cache-Control: must-revalidate
CDN-Cache-Control: no-store
Akamai-Cache-Control: max-age=1209600

Now it says the browser needs to revalidate the content, and all CDNs should treat the content as no-store (uncacheable), unless the CDN is Akamai, in which case it should cache it for two weeks.
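The selection rule can be sketched as follows. This is a simplified illustration of the precedence the draft describes, assuming each cache knows its own target list, most specific first (e.g. Akamai would look for Akamai-Cache-Control, then CDN-Cache-Control, then fall back to Cache-Control); it is not the draft's normative algorithm.

```python
def effective_cache_control(headers, target_list):
    """Return the most specific targeted header present, else Cache-Control."""
    for target in target_list:
        name = f"{target}-Cache-Control"
        if name in headers:
            return headers[name]  # a targeted field wins outright
    return headers.get("Cache-Control")

response = {
    "Cache-Control": "must-revalidate",
    "CDN-Cache-Control": "no-store",
    "Akamai-Cache-Control": "max-age=1209600",
}

print(effective_cache_control(response, ["Akamai", "CDN"]))    # max-age=1209600
print(effective_cache_control(response, ["OtherCDN", "CDN"]))  # no-store
print(effective_cache_control(response, []))                   # must-revalidate
```

Each actor in the chain reads the same response but obeys a different header, which is the whole point of TCC.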
SIDEBAR: Edge-Control: Akamai has had the Edge-Control header for some 20 years. In effect, TCC is a standardization of that header. Edge-Control will continue to be supported, but the TCC version will be encouraged.
The benefits of standardization
Targeted Cache Control was co-authored by Akamai, Fastly, and Cloudflare. This is significant because it shows that these companies recognize that working with standards is a benefit to all. When a bit of functionality is standardized, it allows customers of CDNs to target the standard rather than a specific CDN's documentation. The guesswork is gone. The reality of today is that many companies use two or more CDNs, or may switch between them. Making at least some of the controls for those CDNs common reduces the cost of ownership for those companies.
This is not the first example of CDNs working together. Previously the same three CDNs worked on the CDN Loop Detection draft to make it possible to find and break potentially damaging and wasteful request loops between them. Though not as directly useful to origin servers as Targeted Cache Control, it addresses a problem that would otherwise be difficult to detect and debug.
CDNs are a critical link in today's internet. I look forward to continued collaboration for the benefit of all.