In article <[EMAIL PROTECTED]>, John Nagle <[EMAIL PROTECTED]> wrote:
> Nikita the Spider wrote:
> >
> > Hi John,
> > Are you sure you're not confusing your sites? The robots.txt file at
> > www.ibm.com contains the double-slashed path. The robots.txt file at
> > ibm.com is different and contains this, which would explain why you
> > think all URLs are denied:
> > User-agent: *
> > Disallow: /
>
> Ah, that's it. The problem is that "ibm.com" redirects to
> "http://www.ibm.com", but "ibm.com/robots.txt" does not
> redirect. For comparison, try "microsoft.com/robots.txt",
> which does redirect.

Strange thing for them to do, isn't it? Especially with two such
different robots.txt files.

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
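
For anyone who wants to reproduce the difference, here is a minimal sketch using only the standard library. It assumes Python 3 (urllib.robotparser). The helper names robots_url, allowed and final_url are made up for illustration, and SAMPLE_WWW is an invented stand-in, since the real www.ibm.com rules are not quoted in this thread. The point it illustrates is that robots.txt is looked up per host, so ibm.com and www.ibm.com can give different answers for what looks like the same site.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Build the robots.txt URL for the exact host named in page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def allowed(robots_text, page_url, agent="*"):
    """Apply already-fetched robots.txt text to page_url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, page_url)


def final_url(url):
    """Return the URL actually served after any redirects (needs network)."""
    with urlopen(url, timeout=10) as response:
        return response.geturl()


# Rules quoted in this thread for the bare ibm.com host.
BARE_RULES = "User-agent: *\nDisallow: /\n"

# Hypothetical stand-in for the www.ibm.com file (not the real contents).
SAMPLE_WWW = "User-agent: *\nDisallow: /private/\n"

if __name__ == "__main__":
    print(robots_url("http://ibm.com/us/en/"))
    print(allowed(BARE_RULES, "http://ibm.com/index.html"))
    print(allowed(SAMPLE_WWW, "http://www.ibm.com/index.html"))
```

Calling final_url() on a host's front page and on its robots_url() would show whether both redirect to the same place. The servers may well behave differently today than they did when this thread was written, so no output is shown for that part.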