New submission from Oudin <ou...@crans.org>:

When processing an ill-formed robots.txt file (like https://tiny.tobast.fr/robots-file ), the RobotFileParser.parse method does not populate the entries or the default_entry attributes.
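As a rough illustration of this behaviour, the sketch below feeds parse() a rule line that has no preceding User-agent line. The sample input is hypothetical (it is not the content of the linked file), and entries and default_entry are undocumented attributes of the CPython implementation, so this reflects an assumption about current behaviour rather than a guaranteed API.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical ill-formed input: a rule with no preceding User-agent line.
# This is NOT the content of the file linked in the report.
ill_formed = ["Disallow: /private"]

parser = RobotFileParser()
parser.parse(ill_formed)  # returns normally; no exception is raised

# Undocumented CPython attributes: nothing from the input was recorded.
print(parser.entries)        # []
print(parser.default_entry)  # None
```

The rule line is silently dropped, so the caller gets no signal that the file was invalid.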
In my opinion, the method should raise an exception when the robots.txt file contains no valid User-agent entry, or when it contains an invalid one. Otherwise, the only way to detect the problem is to check whether default_entry is None, which is not covered in the documentation (https://docs.python.org/dev/library/urllib.robotparser.html).

Depending on your opinion on this, I can implement what is necessary and create a PR on GitHub.

----------
components: Library (Lib)
messages: 312711
nosy: Guinness
priority: normal
severity: normal
status: open
title: RobotFileParser.parse() should raise an exception when the robots.txt file is invalid
type: behavior
versions: Python 3.6, Python 3.7, Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32936>
_______________________________________