Ezio Melotti <ezio.melo...@gmail.com> added the comment:

The attached patch changes the regex to allow non-ASCII letters in attribute 
values (using \w with the re.UNICODE flag instead of [a-zA-Z0-9_]).
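
For illustration only, here is a minimal sketch of the difference between the 
two character classes. The pattern names and the sample value are made up for 
this example; they are not the actual regexes from the patch.

```python
import re

# Simplified stand-ins for the attribute-value character class;
# these are NOT the actual patterns from the patch.
ASCII_VALUE = re.compile(r'[a-zA-Z0-9_]+')
UNICODE_VALUE = re.compile(r'\w+', re.UNICODE)

value = u'caf\u00e9'  # "café", an assumed sample attribute value

# The ASCII-only class stops before the accented letter.
ascii_part = ASCII_VALUE.match(value).group()

# \w with re.UNICODE also accepts non-ASCII letters, so it covers the whole value.
unicode_part = UNICODE_VALUE.match(value).group()

print(repr(ascii_part), repr(unicode_part))
```

On Python 3, \w already matches Unicode word characters for str patterns, so 
the flag is redundant there but harmless.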

Using [^>\s] (or even [^> ]) might be OK too, since that's what browsers seem 
to use (e.g. Firefox and Chrome show "テ<ス＀☃ト   -d-fg" as title of '<a href="" 
title=テ<ス＀☃ト   -d-fg href="">foo</a>', including the non-ascii spaces in the 
middle).
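
A small sketch of how the two alternatives might behave, assuming Python 3 str 
patterns and assuming U+3000 (ideographic space) as the non-ASCII space; the 
sample value is shortened and is not the exact string from the test above.

```python
import re

# Assumed sample: U+3000 stands in for a non-ASCII space inside the value.
value = 'テ<ス\u3000ト -d'

# On Python 3 str patterns, \s also matches Unicode whitespace,
# so this class stops at the ideographic space.
stops_at_any_space = re.match(r'[^>\s]+', value).group()

# Excluding only '>' and the ASCII space keeps the ideographic space
# and stops at the first plain space instead.
stops_at_ascii_space = re.match(r'[^> ]+', value).group()

print(repr(stops_at_any_space), repr(stops_at_ascii_space))
```

If that assumption holds, [^> ] would be the closer match to the browser 
behaviour described above, since it keeps non-ASCII spaces inside the value.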

----------
keywords: +patch
nosy: +belopolsky
Added file: http://bugs.python.org/file21406/issue7311.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7311>
_______________________________________