Matt Basta <bastaw...@gmail.com> added the comment:

> So I think the example is invalid (should escape the <), and that HTMLParser
> is not buggy.

On the other hand, the HTML5 spec clearly dictates otherwise:

http://www.w3.org/TR/html5/syntax.html#cdata-rcdata-restrictions

    The text in raw text and RCDATA elements must not contain any
    occurrences of the string "</" (U+003C LESS-THAN SIGN, U+002F SOLIDUS)
    followed by characters that case-insensitively match the tag name of
    the element followed by one of U+0009 CHARACTER TABULATION (tab),
    U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
    (CR), U+0020 SPACE, U+003E GREATER-THAN SIGN (>), or U+002F SOLIDUS (/).

Additionally, no browser (except perhaps in quirks mode) currently obeys the
HTML4 variant of the rule. This is due in large part to the need to include
strings such as "</scr" + "ipt>" within a script tag itself.

This behavior can be observed firsthand by loading this snippet in a browser:

    <script><span></span>This should not be visible.</script>

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue670664>
_______________________________________
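
As a rough illustration of the HTML5 rule quoted above, the end of a raw text element could be located along these lines. This is only a sketch: raw_text_end is a hypothetical helper written for this example, not part of HTMLParser, and it assumes the element body is already available as a plain string.

```python
import re


def raw_text_end(html, tag, start=0):
    """Return the index where the end tag of a raw text element begins.

    Hypothetical helper, not part of HTMLParser. Returns -1 when no
    terminating sequence is found.
    """
    # "</" + tag name (case-insensitive) + one of: tab, LF, FF, CR,
    # space, ">" or "/", as listed in the HTML5 restriction.
    pattern = re.compile(
        r"</" + re.escape(tag) + r"[\t\n\f\r />]", re.IGNORECASE
    )
    match = pattern.search(html, start)
    return match.start() if match else -1


snippet = "<script><span></span>This should not be visible.</script>"
body_start = snippet.index(">") + 1  # skip past the opening <script>
end = raw_text_end(snippet, "script", body_start)

# "</span>" does not end the element; only "</script>" does.
print(snippet[body_start:end])
```

Under this reading, "</span>" inside the script body is treated as ordinary text, and a split string such as "</scr" + "ipt>" never forms a terminating sequence.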