Python's "robots.txt" file parser may be misinterpreting a special case. Given a robots.txt file like this:
    User-agent: *
    Disallow: //
    Disallow: /account/registration
    Disallow: /account/mypro
    Disallow: /account/myint
    ...

the Python library "robotparser.RobotFileParser()" considers all pages of the
site to be disallowed. Apparently "Disallow: //" is being interpreted as
"Disallow: /". Even the home page of the site is locked out. This may be
incorrect.

This is the robots.txt file for "http://ibm.com". Some IBM operating systems
recognize filenames starting with "//" as a special case, like a network root,
so they may be trying to handle some problem like that.

The spec for robots.txt, at http://www.robotstxt.org/wc/norobots.html, says:

    "Disallow: The value of this field specifies a partial URL that is not to
    be visited. This can be a full path, or a partial path; any URL that
    starts with this value will not be retrieved."

That suggests that "Disallow: //" should only disallow paths beginning with
"//", not the whole site.
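For anyone who wants to poke at this, here is a minimal sketch that feeds the
relevant rules straight to the parser instead of fetching
http://ibm.com/robots.txt over the network. The module is "robotparser" in
Python 2 (as above) and "urllib.robotparser" in Python 3, and the exact
results may vary by version:

    # Minimal reproduction sketch; behaviour may differ across Python versions.
    try:
        from urllib.robotparser import RobotFileParser   # Python 3 name
    except ImportError:
        from robotparser import RobotFileParser          # Python 2 name, as in the post

    rp = RobotFileParser()
    # Feed the relevant rules directly so the example is self-contained.
    rp.parse([
        "User-agent: *",
        "Disallow: //",
        "Disallow: /account/registration",
    ])

    # Per the spec, "Disallow: //" should only block paths starting with "//",
    # so the bare home page ought to be allowed.
    print(rp.can_fetch("*", "http://ibm.com/"))            # poster reports False here
    print(rp.can_fetch("*", "http://ibm.com//something"))  # False under either reading

If can_fetch() comes back False for the bare home page here, the parser is
treating "Disallow: //" as if it were "Disallow: /".

John Nagle
SiteTruth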