Package: www.debian.org
Version: 20040220
Severity: minor

www.debian.org appears to have a problem when an asterisk is present in the Accept-Language header of an incoming HTTP request.

When I do a HEAD /support without an Accept-Language header all is well:

  $ sed -e 's/^ //' -e 's/$/ /' <<==HERE |
  > HEAD /support HTTP/1.1
  > Host: www.debian.org
  > Connection: close
  >
  > ==HERE
  > nc www.debian.org 80
  HTTP/1.1 200 OK
  Date: Fri, 20 Feb 2004 04:47:06 GMT
  Server: Apache/1.3.26 (Unix) Debian GNU/Linux PHP/4.1.2 DAV/1.0.3
  Content-Location: support.en.html
  Vary: negotiate,accept-language
  TCN: choice
  Cache-Control: max-age=86400
  Expires: Sat, 21 Feb 2004 04:47:06 GMT
  Last-Modified: Tue, 17 Feb 2004 02:33:35 GMT
  ETag: "11301e9-3b67-40317d7f;403575e1"
  Accept-Ranges: bytes
  Content-Length: 15207
  Connection: close
  Content-Type: text/html
  Content-Language: en

But if I add Accept-Language: * I get a 404:

  $ sed -e 's/^ //' -e 's/$/ /' <<==HERE |
  > HEAD /support HTTP/1.1
  > Host: www.debian.org
  > Accept-Language: *
  > Connection: close
  >
  > ==HERE
  > nc www.debian.org 80
  HTTP/1.1 404 Not Found
  Date: Fri, 20 Feb 2004 04:47:44 GMT
  Server: Apache/1.3.26 (Unix) Debian GNU/Linux PHP/4.1.2 DAV/1.0.3
  Content-Location: support.nb.html
  Vary: negotiate,accept-language
  TCN: choice
  Connection: close
  Content-Type: text/html; charset=iso-8859-1

Then again if I add actual languages to the Accept-Language header all is well again:

  $ sed -e 's/^ //' -e 's/$/ /' <<==HERE |
  > HEAD /support HTTP/1.1
  > Host: www.debian.org
  > Accept-Language: sv-FI, i-navajo, en-US
  > Connection: close
  >
  > ==HERE
  > nc www.debian.org 80
  HTTP/1.1 200 OK
  Date: Fri, 20 Feb 2004 05:04:55 GMT
  Server: Apache/1.3.26 (Unix) Debian GNU/Linux PHP/4.1.2 DAV/1.0.3
  Content-Location: support.en-us.html
  Vary: negotiate,accept-language
  TCN: choice
  Cache-Control: max-age=86400
  Expires: Sat, 21 Feb 2004 05:04:55 GMT
  Last-Modified: Tue, 17 Feb 2004 02:33:35 GMT
  ETag: "11301e9-3b67-40317d7f;403575e1"
  Accept-Ranges: bytes
  Content-Length: 15207
  Connection: close
  Content-Type: text/html
  Content-Language: en-us

If at this point I add ", *" to the Accept-Language header, it will not break things.
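
For anyone who wants to repeat the requests above without retyping the here-document each time, the following is a minimal sketch rather than the exact commands used in this report. It builds the request with printf instead of sed, and the function names build_head_request and probe are hypothetical helpers introduced here for illustration. It assumes a POSIX shell plus nc and grep on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: emit a minimal HTTP/1.1 HEAD request with CRLF
# line endings. $1 = path, $2 = host, $3 = optional Accept-Language value.
build_head_request() {
    printf 'HEAD %s HTTP/1.1\r\n' "$1"
    printf 'Host: %s\r\n' "$2"
    # Only send Accept-Language when a value was supplied.
    if [ -n "${3:-}" ]; then
        printf 'Accept-Language: %s\r\n' "$3"
    fi
    # The blank line terminates the request headers.
    printf 'Connection: close\r\n\r\n'
}

# Hypothetical helper: send the request to www.debian.org and keep only
# the status line and Content-Location. Needs network access; it is
# defined here but not called.
probe() {
    build_head_request /support www.debian.org "${1:-}" |
        nc www.debian.org 80 |
        grep -E '^(HTTP|Content-Location)'
}
```

Calling probe with no argument, with '*', and with 'sv-FI, i-navajo, en-US, *' corresponds to the cases described above; what the server answers today may of course differ from the responses recorded in this report.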

(However, when I mistakenly used sv_FI and en_US as language tags, the asterisk produced an error, whereas if I sent the same request without the asterisk, I would get a 200 with what is apparently the Mother of All Variants of this page, presumably English:

  $ sed -e 's/^ //' -e 's/$/ /' <<==HERE |
  > HEAD /support HTTP/1.1
  > Host: www.debian.org
  > Accept-Language: sv_FI, i_navajo, en_US
  > Connection: close
  >
  > ==HERE
  > nc www.debian.org 80 |
  > egrep '^(HTTP|Content-Location)'
  HTTP/1.1 200 OK
  Content-Location: support.html

If I put back in the * as a catch-all, there's the 404 again:

  $ sed -e 's/^ //' -e 's/$/ /' <<==HERE |
  > HEAD /support HTTP/1.1
  > Host: www.debian.org
  > Accept-Language: sv_FI, i_navajo, en_US, *
  > Connection: close
  >
  > ==HERE
  > nc www.debian.org 80 |
  > egrep '^(HTTP|Content-Location)'
  HTTP/1.1 404 Not Found
  Content-Location: support.nb.html

You'll notice that I'm taking the liberty to grep just the interesting parts of the response here to keep this parenthesis shorter.)

The language tag "*" is explicitly allowed in RFC2616 section 14.4 to mean any other language not already listed. Granted, passing it in on its own is perhaps dubious.

Real-world case: The W3C link validator reports broken links (404s) to many of the important pages on www.debian.org. Here is a discussion: <http://thread.gmane.org/gmane.org.w3c.validator/3520>

/* era */

-- System Information
Debian Release: 3.0
Kernel Version: Linux there.afraid.org 2.2.20 #1 SMP Thu Nov 7 16:15:53 EET 2002 i586 unknown