New submission from Nikolay Bogoychev:

Robotparser doesn't support two fairly important optional parameters from the 
robots.txt file. I have implemented them in the following way:
(Robotparser should be initialized in the usual way:
rp = robotparser.RobotFileParser()
rp.set_url(..)
rp.read()
)
crawl_delay(useragent) - Returns the time, in seconds, that you need to wait 
between crawls.
If no delay is specified, or it doesn't apply to this user agent, returns -1.
request_rate(useragent) - Returns a list of the form [requests, seconds].
If no rate is specified, or it doesn't apply to this user agent, returns -1.
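For context, a minimal sketch of how this API looks in practice. Note this sketch uses Python 3's urllib.robotparser, where equivalent methods were eventually added (returning None rather than -1 when no rule applies); the attached patch targets Python 2.7's robotparser module, so names and return values there may differ:

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content with the two optional parameters in question.
robots_txt = """\
User-agent: *
Crawl-delay: 10
Request-rate: 3/20
Disallow: /private/
"""

rp = RobotFileParser()
# parse() accepts the file's lines directly, so no network fetch is needed;
# normally you would use rp.set_url(...) followed by rp.read().
rp.parse(robots_txt.splitlines())

delay = rp.crawl_delay("*")     # seconds to wait between requests
rate = rp.request_rate("*")     # named tuple: (requests, seconds)
print(delay, rate.requests, rate.seconds)
```

Here `Request-rate: 3/20` means at most 3 requests per 20 seconds, which is why the proposed return value is a pair rather than a single number.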

----------
components: Library (Lib)
files: robotparser.patch
keywords: patch
messages: 171711
nosy: XapaJIaMnu
priority: normal
severity: normal
status: open
title: robotparser doesn't support request rate and crawl delay parameters
type: enhancement
versions: Python 2.7
Added file: http://bugs.python.org/file27373/robotparser.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16099>
_______________________________________