HTTP response code nuttiness

The technique of using a 404 error handler to render a web page from some underlying source or data is widely used. We use it on the java-gnome website to render prettily marked-up versions of our text meta documents (NEWS, README, HACKING, etc). The document you’re asking for (“README.html”) doesn’t exist, but a text source (“README”) does, so the handler reads that file in, runs the markup processor, and serves the result. Yawn.

If you’re already in an error handler, then the default status code as far as the web server is concerned is, of course, HTTP 404. So you need to change that. In PHP, that’s as simple as:

<?php
    // A header string starting with "HTTP/" sets the status line.
    header("HTTP/1.1 200 OK");
?>

which PHP treats a bit magically: a header string starting with “HTTP/” sets the response status line rather than being sent as an ordinary header. Too easy. This worked fine on our test site and on the production site for, well, years.
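For illustration, a handler along these lines might look like the sketch below. It is not our actual code: the function names source_for() and render_text() are made up for this example, the "markup processor" is a trivial stand-in, and it assumes the web server hands the originally requested URI to the error handler in REQUEST_URI (as Apache does for an ErrorDocument).

```php
<?php
// Hypothetical 404 handler sketch. Function names, the docroot layout and
// the stand-in renderer are assumptions made for illustration only.

// Map a requested URI such as "/4.0/README.html" to the text source
// "<docroot>/4.0/README". Returns null if the request doesn't qualify.
function source_for($uri, $docroot)
{
    $path = parse_url($uri, PHP_URL_PATH);
    if (!is_string($path) || substr($path, -5) !== '.html') {
        return null;
    }
    // Refuse anything that tries to climb out of the document root.
    if (strpos($path, '..') !== false) {
        return null;
    }
    return rtrim($docroot, '/') . substr($path, 0, -5);
}

// Stand-in for the real markup processor: escape the text and wrap it.
function render_text($text)
{
    return "<pre>" . htmlspecialchars($text) . "</pre>\n";
}

// Only runs when invoked by the web server for an actual request.
if (isset($_SERVER['REQUEST_URI'])) {
    $source = source_for($_SERVER['REQUEST_URI'], $_SERVER['DOCUMENT_ROOT']);
    if ($source !== null && is_file($source)) {
        // We are inside the 404 handler, so override the default status.
        header("HTTP/1.1 200 OK");
        header("Content-Type: text/html; charset=UTF-8");
        echo render_text(file_get_contents($source));
    } else {
        // Nothing to render: this really is a missing page.
        header("HTTP/1.1 404 Not Found");
        echo "Not found\n";
    }
}
```

With this sketch, a request for /4.0/README.html would be answered from the file 4.0/README under the document root, while anything without a matching text source still gets a genuine 404.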

Then, at some point over the last few weeks, we ran into a weird problem: suddenly I was seeing garbage appearing at the beginning and end of documents rendered this way on our SourceForge website. Huh?

I still have no idea what is wrong, and won’t detail all the diagnostics it took to track down the problem, but I ended up isolating it to the header() call. If I skipped that, then the document would render and serve correctly. But that’s no good, because then you are serving documents with an HTTP 404 error code, which will inhibit search indexing and caching even if it doesn’t screw with your browser.

The workaround ended up being this:

<?php
    // Same call, but announcing HTTP/1.0 instead of HTTP/1.1.
    header("HTTP/1.0 200 OK");
?>

I have no idea why, but with whatever SourceForge has done to their webservers lately, this made it work fine. Go figure.

The real inanity, though, is what you see when you use wget to fetch the document and show the HTTP headers:

$ wget -x -S http://java-gnome.sourceforge.net/4.0/README.html
--2008-10-13 14:11:28--  http://java-gnome.sourceforge.net/4.0/README.html
Resolving java-gnome.sourceforge.net... 216.34.181.96
Connecting to java-gnome.sourceforge.net|216.34.181.96|:80... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Server: nginx/0.6.31
  Date: Mon, 13 Oct 2008 03:11:29 GMT
  Content-Type: text/html; charset=UTF-8
  Connection: close
  X-Powered-By: PHP/5.2.6
  Last-Modified: Sun, 12 Oct 2008 14:02:37 GMT
Length: unspecified [text/html]
Saving to: `java-gnome.sourceforge.net/4.0/README.html'

    [    <=>                                ] 16,052      22.3K/s   in 0.7s    

2008-10-13 14:11:30 (22.3 KB/s) - `java-gnome.sourceforge.net/4.0/README.html' saved [16052]
$

Their web server sent an HTTP/1.1 response anyway! Talk about adding insult to injury. What a waste of time.
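If you'd rather check this from PHP than from wget, something like the following sketch would do. The helper names parse_status_line() and fetch_status() are invented for this example; get_headers() is the stock PHP function, and its first array element is the status line of the first response (so if the URL redirects, this reports the redirect, not the final page).

```php
<?php
// Hypothetical helpers for inspecting the status line a server really sent.

// Split a status line such as "HTTP/1.1 200 OK" into its parts.
// Returns null if the line doesn't look like an HTTP status line.
function parse_status_line($line)
{
    $pattern = '#^HTTP/(\d+\.\d+)\s+(\d{3})(?:\s+(.*))?$#';
    if (!preg_match($pattern, trim($line), $m)) {
        return null;
    }
    return array(
        'version' => $m[1],
        'code'    => (int) $m[2],
        'reason'  => isset($m[3]) ? $m[3] : '',
    );
}

// Request $url and report the status line of the first response,
// or null if the request failed.
function fetch_status($url)
{
    $headers = @get_headers($url);
    if ($headers === false || !isset($headers[0])) {
        return null;
    }
    return parse_status_line($headers[0]);
}
```

Calling fetch_status() on the README.html URL above and looking at the 'version' element shows which protocol version the front-end server answered with, regardless of what the PHP script asked for.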

I’m sure this is just one of those transient conditions that’ll be gone by this time next week, but it sure as hell frustrated me. If you’re seeing something weird going on with PHP pages on one of your SourceForge sites, perhaps have a look and see if this workaround helps.

{shrug}

AfC