New submission from larsfuse <l...@cl.no>:
The standard (http://www.robotstxt.org/robotstxt.html) says:

> To allow all robots complete access:
> User-agent: *
> Disallow:
> (or just create an empty "/robots.txt" file, or don't use one at all)

Here I give Python an empty file:

$ curl http://10.223.68.186/robots.txt
$

Code:

import sys
import robotparser

robotsurl = sys.argv[1]  # robots.txt URL, passed on the command line
rp = robotparser.RobotFileParser()
print(robotsurl)
rp.set_url(robotsurl)
rp.read()
print("fetch /", rp.can_fetch(useragent="*", url="/"))
print("fetch /admin", rp.can_fetch(useragent="*", url="/admin"))

Result:

$ ./test.py http://10.223.68.186/robots.txt
('fetch /', False)
('fetch /admin', False)

So robotparser reads the empty robots.txt file as "all denied" and treats the whole site as blocked, instead of allowing complete access.

----------
components: Library (Lib)
messages: 331595
nosy: larsfuse
priority: normal
severity: normal
status: open
title: robotparser reads empty robots.txt file as "all denied"
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35457>
_______________________________________
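A minimal sketch (assuming Python 2.7's robotparser module, as in the report) that feeds an empty robots.txt body straight to parse() instead of fetching it with read(), to separate the parsing step from the HTTP fetch:

    import robotparser

    # Feed an empty robots.txt body directly to the parser,
    # bypassing the HTTP fetch that read() performs.
    rp = robotparser.RobotFileParser()
    rp.parse("".splitlines())  # empty file -> no rules at all

    # Per http://www.robotstxt.org/robotstxt.html an empty file means
    # complete access, so both calls should return True.
    print("fetch /", rp.can_fetch("*", "/"))
    print("fetch /admin", rp.can_fetch("*", "/admin"))

If this offline version answers True while the read()-based script above answers False, the problem lies in how read() handles the empty HTTP response rather than in the parsing of the (empty) rule set.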