[issue670664] HTMLParser.py - more robust SCRIPT tag parsing

2011-07-26 Thread Matt Basta
Matt Basta added the comment: The number of problems produced by this bug can be greatly reduced by adding a relatively small check to the parser. Currently,

[issue670664] HTMLParser.py - more robust SCRIPT tag parsing

2011-07-27 Thread Matt Basta
Matt Basta added the comment:
> So I think the example is invalid (should escape the <), and that HTMLParser is not buggy.
On the other hand, the HTML5 spec clearly dictates otherwise: http://www.w3.org/TR/html5/syntax.html#cdata-rcdata-restrictions The text in raw text and RCDATA
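
The behaviour under discussion can be checked with a short script. This is only a minimal sketch, assuming a current Python 3 where the module lives at html.parser; the TagLogger class and the sample script body are hypothetical names made up for illustration, not code from the issue or its patch. It records which start tags the parser reports when a SCRIPT element contains an unescaped "<" and markup inside a string literal.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper: records start tags and text data seen by the parser."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.data = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_data(self, data):
        # Script content may arrive in one or several chunks, so collect them all.
        self.data.append(data)


# Sample script body (an assumption for this sketch): an unescaped "<"
# plus markup inside a JavaScript string literal.
SCRIPT_BODY = 'if (a < b) { document.write("<b>x</b>"); }'

parser = TagLogger()
parser.feed("<script>" + SCRIPT_BODY + "</script>")
parser.close()

script_text = "".join(parser.data)
print(parser.start_tags)
print(script_text)
```

If the parser treats SCRIPT content as raw text, only the script start tag is reported and the joined data equals the original script body.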

[issue670664] HTMLParser.py - more robust SCRIPT tag parsing

2011-07-27 Thread Matt Basta
Matt Basta added the comment:
> Yes, but we don't claim to support HTML5 yet.
There's also no claim in the docs or the source that HTMLParser specifically adheres to HTML4, either. Ideally, the parser should strive for parity with the functionality of major web browsers, as th

[issue670664] HTMLParser.py - more robust SCRIPT tag parsing

2011-08-01 Thread Matt Basta
Matt Basta added the comment: Seeing as everyone seems pretty satisfied with the 2.7 version, I'd be happy to put together a patch for 3 as well. To confirm, though, this fix is NOT going behind the strict parameter, correct?