On Wednesday 04 of May 2011 17:37:41 Ethan Grammatikidis wrote:
> On 4 May 2011, at 11:40 am, Balwinder S Dheeman wrote:
> > On 04/26/11 12:03, Ethan Grammatikidis wrote:
> >> On 24 Apr 2011, at 9:16 am, hiro wrote:
> >>> In http://plan9.bell-labs.com/robots.txt you will find:
> >>>
> >>> User-agent: *
> >>> Disallow: /
> >>
> >> *facepalm* I wondered if this was the case; didn't think to check.
> >> Anyone have any idea why this is there?
> >
> > Very simple, since the webmaster have already allowed some bots and
> > disallowed everyone else ;)
> >
> > You need to read/analyze the whole robots.txt indeed.
>
> Now I've read it I can't understand why Google can't find anything
> under /wiki. Even if it did, that robots.txt isn't all that pleasant,
> blindly disallowing everyone who isn't google or msn, more or less. O.o
I believe we need an ``Allow: /'' below the long list of `Disallows' in the
``User-agent: Googlebot'' / ``User-agent: msnbot'' section. Otherwise only
the final ``Disallow: /'' matches, and in effect every robot is cut off.

Or, better, just let any robots crawl the site. The web isn't only about
Google and MSN anymore ;-) (*cough* http://duckduckgo.com/ *cough*)

-- 
dexen deVries

[[[↓][→]]]

``In other news, STFU and hack.''
mahmud, in response to Erann Gat's ``How I lost my faith in Lisp''
http://news.ycombinator.com/item?id=2308816
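P.S. A quick sanity check of the rule-precedence point above, using Python's
urllib.robotparser (which applies rules first-match within the matching
User-agent group). The paths here are made up for illustration, not the
actual Bell Labs list:

```python
import urllib.robotparser

# Broken layout: specific Disallows followed by a blanket "Disallow: /"
# that cuts off everything not already disallowed.
broken = [
    "User-agent: Googlebot",
    "Disallow: /private/",
    "Disallow: /",
]

# Proposed fix: an "Allow: /" below the Disallow list, so everything
# not explicitly disallowed stays crawlable.
fixed = [
    "User-agent: Googlebot",
    "Disallow: /private/",
    "Allow: /",
]

rp_broken = urllib.robotparser.RobotFileParser()
rp_broken.parse(broken)
print(rp_broken.can_fetch("Googlebot", "/wiki/"))     # False: blanket rule wins

rp_fixed = urllib.robotparser.RobotFileParser()
rp_fixed.parse(fixed)
print(rp_fixed.can_fetch("Googlebot", "/wiki/"))      # True: Allow matches
print(rp_fixed.can_fetch("Googlebot", "/private/x"))  # False: specific rule first
```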