Ezio Melotti <ezio.melo...@gmail.com> added the comment:

I think <x><y z=""o"" /></x> should be parsed as <x><y z="" /></x>, and the o"" should be ignored. Similarly, <x><y z="""" /></x> should be parsed as <x><y z="" /></x>, and the last two "" should be ignored. This is what Firefox seems to do.
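A minimal sketch for checking what the parser reports for these inputs, assuming Python 3's html.parser module. AttrCollector and parse_start_tags are hypothetical names used only for illustration; they are not part of the attached patch. No expected output is shown for the two malformed inputs, since that depends on the Python version and on whether this issue has been fixed in it.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records (tag, attrs) for every start tag seen."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <y ... />, because the
        # default handle_startendtag() delegates to handle_starttag().
        self.start_tags.append((tag, attrs))


def parse_start_tags(markup):
    """Feed markup to a fresh parser and return the recorded start tags."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.start_tags


if __name__ == "__main__":
    samples = (
        '<x><y z="" /></x>',     # well-formed reference
        '<x><y z=""o"" /></x>',  # first example from the report
        '<x><y z="""" /></x>',   # second example from the report
    )
    for markup in samples:
        print(markup, "->", parse_start_tags(markup))
```

If the parser behaved as proposed above, all three inputs would report the same attribute list for y as the well-formed reference does.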
Currently the parser doesn't seem to handle extraneous data in the start tag too well, because the locatestarttagend_tolerant regex looks for (more or less) well-formed attributes.

Attached a patch for test_htmlparser with the two examples provided by Kevin.

----------
keywords: +patch
nosy: +ezio.melotti
stage:  -> needs patch
Added file: http://bugs.python.org/file23579/issue12629.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12629>
_______________________________________