[issue6325] robotparser doesn't handle URL's with query strings

2010-07-28 Thread Senthil Kumaran
Senthil Kumaran added the comment: I modified the patch slightly (so that it takes care of path, query, params and fragments). Fixed in r83209, r83210 and r83211. I also think that we need to move the robotparser to allow regexes in the allow and disallow patterns. (Shall open an issue in the

[issue6325] robotparser doesn't handle URL's with query strings

2010-07-26 Thread Michael Stephens
Michael Stephens added the comment: Supplied patch matches rules with query params. -- keywords: +patch nosy: +mikejs Added file: http://bugs.python.org/file18218/6325.diff

[issue6325] robotparser doesn't handle URL's with query strings

2010-07-10 Thread Mark Lawrence
Changes by Mark Lawrence: -- assignee: -> orsenthil nosy: +orsenthil versions: +Python 3.1, Python 3.2 -Python 2.4, Python 2.5, Python 2.6

[issue6325] robotparser doesn't handle URL's with query strings

2009-06-22 Thread Brian Slesinsky
New submission from Brian Slesinsky: If a robots.txt file contains a rule of the form:

    Disallow: /some/path?name=value

This pattern will never match a URL passed to can_fetch(), as far as I can tell. It's arguable whether this is a bug. The 1994 robots.txt protocol is silent on whether to t
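
For readers who want to check the behaviour described in the report, here is a minimal sketch using the Python 3 module urllib.robotparser. The robots.txt content, the host example.com and the agent name "mybot" are placeholders chosen for illustration and do not come from the report. On an interpreter that includes the fix mentioned in this thread, the rule with a query string should match; on an unpatched version the first check would presumably report the URL as fetchable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, modeled on the rule quoted in the report.
ROBOTS_TXT = """\
User-agent: *
Disallow: /some/path?name=value
"""

parser = RobotFileParser()
# parse() accepts an iterable of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

# example.com and "mybot" are placeholders, not values from the report.
query_url_ok = parser.can_fetch("mybot", "http://example.com/some/path?name=value")
other_url_ok = parser.can_fetch("mybot", "http://example.com/other/page")

print("query-string URL fetchable:", query_url_ok)
print("unrelated URL fetchable:", other_url_ok)
```

Feeding the rules through parse() rather than read() keeps the check self-contained, which makes it easy to compare the result across interpreter versions.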