New submission from Nikolay Bogoychev:

The robotparser module doesn't support two quite important optional parameters from the robots.txt file: Crawl-delay and Request-rate. I have implemented them as follows. (The parser is initialized in the usual way:

rp = robotparser.RobotFileParser()
rp.set_url(...)
rp.read()
)

crawl_delay(useragent) - Returns the crawl delay in seconds. If none is specified, or it does not apply to the given user agent, returns -1.

request_rate(useragent) - Returns the request rate as a list in the form [requests, seconds]. If none is specified, or it does not apply to the given user agent, returns -1.
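For illustration, here is a minimal usage sketch with the patch applied. The example.com URL and the "*" user agent are placeholders, and it assumes the fetched robots.txt carries the nonstandard directives shown in the comments:

import time
import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("http://www.example.com/robots.txt")  # placeholder URL
rp.read()

# Assumes robots.txt contains directives such as:
#   Crawl-delay: 5
#   Request-rate: 3/20     (at most 3 requests per 20 seconds)

delay = rp.crawl_delay("*")
if delay != -1:
    # Honour the crawl delay between successive requests.
    time.sleep(delay)

rate = rp.request_rate("*")
if rate != -1:
    requests, seconds = rate
    # Pace the crawler to at most `requests` requests per `seconds` seconds.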
----------
components: Library (Lib)
files: robotparser.patch
keywords: patch
messages: 171711
nosy: XapaJIaMnu
priority: normal
severity: normal
status: open
title: robotparser doesn't support request rate and crawl delay parameters
type: enhancement
versions: Python 2.7
Added file: http://bugs.python.org/file27373/robotparser.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16099>
_______________________________________