[issue13281] robotparser.RobotFileParser ignores rules preceeded by a blank line

2011-10-30 Thread Senthil Kumaran
Senthil Kumaran added the comment: I agree with your interpretation of the RFC. The parsing rules make no provision for blank lines "within" the records. However, I find that allowing them does no harm either. I checked that with a robots.txt parser (Google webmaster tools)

[issue13281] robotparser.RobotFileParser ignores rules preceeded by a blank line

2011-10-29 Thread Terry J. Reedy
Terry J. Reedy added the comment: Sorry, the visual linebreak depends on font size. It *is* the comma that caused the problem. You missed my question about the current test suite. Senthil, you are the listed expert for urllib, which includes robotparser. Any opinions on what to do?

[issue13281] robotparser.RobotFileParser ignores rules preceeded by a blank line

2011-10-29 Thread Petri Lehtinen
Petri Lehtinen added the comment:
> Because of the line break, clicking that link gives "Server error 404".
I don't see a line break, but the comma after the link seems to break it. Sorry.
> The way I read the grammar, 'records' (which start with an agent line) cannot have blank lines and

[issue13281] robotparser.RobotFileParser ignores rules preceeded by a blank line

2011-10-28 Thread Terry J. Reedy
Terry J. Reedy added the comment: Because of the line break, clicking that link gives "Server error 404". http://www.robotstxt.org/norobots-rfc.txt works (so please pay attention to formatting). The main page is http://www.robotstxt.org/robotstxt.html The way I read the grammar, 'records' (whi

[issue13281] robotparser.RobotFileParser ignores rules preceeded by a blank line

2011-10-27 Thread Ezio Melotti
Changes by Ezio Melotti: nosy: +ezio.melotti; stage: patch review -> test needed

[issue13281] robotparser.RobotFileParser ignores rules preceeded by a blank line

2011-10-27 Thread Petri Lehtinen
Petri Lehtinen added the comment: Blank lines are allowed according to the specification at http://www.robotstxt.org/norobots-rfc.txt, section 3.3 Formal Syntax. The issue also seems to exist on 3.2 and 3.3. -- components: +Library (Lib); keywords: +needs review; stage: -> patch review

[issue13281] robotparser.RobotFileParser ignores rules preceeded by a blank line

2011-10-27 Thread Petri Lehtinen
Changes by Petri Lehtinen: nosy: +petri.lehtinen

[issue13281] robotparser.RobotFileParser ignores rules preceeded by a blank line

2011-10-27 Thread Brian Bernstein
New submission from Brian Bernstein: When attempting to parse a robots.txt file which has a blank line between allow/disallow rules, all rules after the blank line are ignored. If a blank line occurs between the user-agent and its rules, all of the rules for that user-agent are ignored. I am
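
For readers who want to try the reported behaviour, here is a minimal sketch of one possible workaround: drop blank lines before handing the text to the parser. The helper name parse_ignoring_blank_lines, the sample robots.txt text, and the example.com URL are all made up for illustration; only RobotFileParser, parse() and can_fetch() come from the standard library. The result printed for the unmodified input may differ between Python versions, so no expected output is given for it.

```python
from urllib.robotparser import RobotFileParser


def parse_ignoring_blank_lines(text):
    """Hypothetical helper: parse robots.txt text with blank lines removed."""
    parser = RobotFileParser()
    # Keep only lines that contain something other than whitespace, so a
    # stray blank line can no longer end a record early.
    lines = [line for line in text.splitlines() if line.strip()]
    parser.parse(lines)
    return parser


# Made-up robots.txt with a blank line between the user-agent and its rule.
ROBOTS_TXT = "User-agent: *\n\nDisallow: /private\n"

# Parse the text exactly as written, for comparison.
raw = RobotFileParser()
raw.parse(ROBOTS_TXT.splitlines())

# Parse the same text with the blank line stripped first.
cleaned = parse_ignoring_blank_lines(ROBOTS_TXT)

url = "http://example.com/private/page"
print("as written:     ", raw.can_fetch("*", url))
print("blanks stripped:", cleaned.can_fetch("*", url))
```

Stripping every blank line also removes the separators between records. As far as I can tell the standard-library parser still starts a new record when a User-agent line follows a rule line, so this should be harmless, but treat it as a workaround rather than a fix.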