Tuesday, October 23, 2012

Proposing a new HTTP request header: max_timeout

I'm certainly not going to confess that my sites sometimes have technical problems. But a... friend... of mine tells me that servers can get bolloxed up. With a site like OvationTix (er, I mean: OshmationShmix), the most likely source of a site-wide problem will be congestion in the database. In such instances, the site will not be completely inaccessible, but it will be seriously slow. Users will say "the site's down," while engineers will say, "it's not down, exactly..." Situations like this can be frustrating, and become especially problematic when dealing with SLAs -- if pages that usually load in 2 seconds are instead loading in 2 minutes, is that "downtime," requiring remediation according to an SLA? I've seen some contracts stipulating that all pages must respond in under 3 seconds. That's good, but still too coarse-grained; what about a page (like a big report) which typically takes 20 seconds -- and which users expect will take a long time?

For this and other situations, I'm proposing a new HTTP request header: max_timeout (in milliseconds). Once the timeout elapses, the browser treats the request as failed and informs the user accordingly. For users, this resolves ambiguity: currently, if a page loads more slowly than they're used to, they can't tell whether it will load in another second or two or has stalled forever. For servers, the timeout can be used to abort processing of requests that exceed the threshold. (If the user is no longer waiting for the response, there's no reason to keep executing the request.) Servers could also use this value to prioritize incoming requests: those that expect a subsecond response go to the top of the pile, and those that say "hey, no rush" can be delayed.
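
To make the server side a bit more concrete, here is a rough Python sketch of how a server might read the header and order its work by deadline. Only the header name max_timeout and its millisecond unit come from the proposal; the names parse_max_timeout and DeadlineQueue are made up for illustration, and the sketch assumes header names have already been lowercased into a plain dict. It is one possible approach, not a reference implementation.

```python
import heapq
import itertools
import time

HEADER = "max_timeout"  # header name from the proposal; value is in milliseconds


def parse_max_timeout(headers, default_ms=None):
    """Return the requested timeout in ms, or default_ms if absent or invalid."""
    raw = headers.get(HEADER)
    if raw is None:
        return default_ms
    try:
        value = int(raw)
    except ValueError:
        return default_ms
    return value if value > 0 else default_ms


class DeadlineQueue:
    """Hypothetical queue: earliest deadline first, expired requests dropped."""

    def __init__(self, clock=time.monotonic):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps arrival order
        self._clock = clock

    def push(self, request_id, headers):
        timeout_ms = parse_max_timeout(headers)
        if timeout_ms is None:
            # No header means "hey, no rush": no deadline, so it sorts last.
            deadline = float("inf")
        else:
            deadline = self._clock() + timeout_ms / 1000.0
        heapq.heappush(self._heap, (deadline, next(self._counter), request_id))

    def pop(self):
        """Return the next request id still worth serving, or None."""
        while self._heap:
            deadline, _, request_id = heapq.heappop(self._heap)
            if deadline > self._clock():
                return request_id
            # Deadline already passed: the client has given up, skip the work.
        return None
```

A request asking for 500 ms would come out of this queue ahead of a 20-second report, and a request whose deadline has already passed is never handed to a worker at all.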

How would the timeout be specified? We need to allow different requests to specify different timeouts. For the first request to a site, the timeout could be delivered via DNS (not saying this is easy to implement, but...): a response from a DNS server would contain the destination IP as well as the recommended timeout value to send in the request header. For subsequent pages, an attribute on an <a> tag or a parameter on an Ajax call would tell the browser how to construct the request header.
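
As a sketch of the link-driven half of that idea, the Python below pulls a per-link timeout out of some HTML and turns it into the request header. The attribute name data-max-timeout is purely my assumption (the proposal doesn't name one), as are LinkTimeoutParser and headers_for. The DNS half is left out, since there's no existing record type to lean on.

```python
from html.parser import HTMLParser


class LinkTimeoutParser(HTMLParser):
    """Collect (href, timeout_ms) pairs from <a> tags.

    Assumes a hypothetical data-max-timeout attribute holding milliseconds.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        raw = attr_map.get("data-max-timeout")
        # Ignore missing or non-numeric values rather than guessing.
        timeout_ms = int(raw) if raw and raw.isdigit() else None
        self.links.append((href, timeout_ms))


def headers_for(timeout_ms):
    """Build the extra request headers for a link's timeout, if it has one."""
    if timeout_ms is None:
        return {}
    return {"max_timeout": str(timeout_ms)}


parser = LinkTimeoutParser()
parser.feed(
    '<a href="/report" data-max-timeout="20000">Big report</a>'
    '<a href="/home">Home</a>'
)
for href, timeout_ms in parser.links:
    print(href, headers_for(timeout_ms))
```

Links without the attribute simply produce no header, so pages that never opt in behave exactly as they do today.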

If a critical mass of sites/browsers/servers implement this request header, users will come to trust that the Internet won't leave them hanging -- if a server is taking longer to respond than it's supposed to, we'll time out the request so they can go about their business. Or, we could offer a prompt: "Your request has been cancelled because the website did not respond in a timely manner. Would you like to try again with no time constraint? [Yes|No]" Either way, having the confidence of a time limit would discourage users from abandoning early, or hitting "refresh" because the site feels slow. To that end, browsers could even refuse to re-issue a request until the timeout was reached. (Hang on, we're working on it...)
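
The "refuse to re-issue" behavior could be as small as the following Python sketch, which stands in for what a browser might do internally. RefreshGuard and its methods are hypothetical names, and keying in-flight requests by URL alone is a simplifying assumption.

```python
import time


class RefreshGuard:
    """Hypothetical guard: block repeat requests to a URL until the first
    attempt either completes or reaches its max_timeout."""

    def __init__(self, clock=time.monotonic):
        self._deadlines = {}  # url -> time at which the pending request expires
        self._clock = clock

    def try_send(self, url, timeout_ms):
        """Return True if the request may go out, False if one is pending."""
        now = self._clock()
        deadline = self._deadlines.get(url)
        if deadline is not None and now < deadline:
            return False  # "Hang on, we're working on it..."
        self._deadlines[url] = now + timeout_ms / 1000.0
        return True

    def complete(self, url):
        """Call when a response arrives so the URL can be requested again."""
        self._deadlines.pop(url, None)
```

With this in place, mashing "refresh" during the wait does nothing, and the first refresh after the deadline goes through as a fresh request.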