Ezio Melotti <ezio.melo...@gmail.com> added the comment:

The attached patch changes the regex to allow non-ASCII letters in attribute values (using \w with the re.UNICODE flag instead of [a-zA-Z0-9_]).
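
To illustrate the difference between the two character classes, here is a minimal sketch. The patterns and the unquoted_title helper are simplified and hypothetical; they are not the expressions from the patch or from HTMLParser itself. The sketch assumes Python 3, where str patterns already match Unicode by default, so re.UNICODE is spelled out only to mirror the description above.

```python
import re

# Hypothetical, simplified patterns for an unquoted attribute value.
# These are NOT the actual expressions used by the patch or by HTMLParser.
ASCII_VALUE = re.compile(r'title=([a-zA-Z0-9_]+)')
UNICODE_VALUE = re.compile(r'title=(\w+)', re.UNICODE)


def unquoted_title(pattern, tag):
    """Return the unquoted title value captured by pattern, or None."""
    match = pattern.search(tag)
    return match.group(1) if match else None


tag = '<a title=テスト>'

# The ASCII-only class cannot match the first character of the value,
# so the search finds nothing.
ascii_result = unquoted_title(ASCII_VALUE, tag)

# \w in Unicode mode treats the Katakana letters as word characters.
unicode_result = unquoted_title(UNICODE_VALUE, tag)

print(ascii_result, unicode_result)
```

With the ASCII-only class the value is not recognized at all, while \w accepts the non-ASCII letters.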
Using [^>\s] (or even [^> ]) might be OK too, since that's what browsers seem to use (e.g. Firefox and Chrome show "テ<ス☃ト -d-fg" as the title of '<a href="" title=テ<ス☃ト -d-fg href="">foo</a>', including the non-ASCII spaces in the middle).

----------
keywords: +patch
nosy: +belopolsky
Added file: http://bugs.python.org/file21406/issue7311.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7311>
_______________________________________