Hi,
There's a regression in both buster and stretch in the last update of lxml when
running under Python 2:
>>> import lxml.html.clean
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/lxml/html/clean.py", line 73, in <module>
    r'</?[a-zA-Z]+|\son[a-zA-Z]+\s*=', re.ASCII).search
AttributeError: 'module' object has no attribute 'ASCII'
>>>
The fix is upstream commit [1].
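For reference, re.ASCII only exists in Python 3, which is why the import fails
under Python 2.7. A minimal sketch of one way to keep such a module-level
pattern importable on both versions (an illustration only, not necessarily what
[1] does; the names ASCII_FLAG and find_tag_or_handler are hypothetical):

```python
import re

# re.ASCII was added in Python 3. Under Python 2, \s and character classes
# are already ASCII-only unless re.UNICODE or re.LOCALE is passed, so
# falling back to "no flag" (0) keeps the same matching behaviour there.
ASCII_FLAG = getattr(re, "ASCII", 0)  # hypothetical helper name

# Same pattern as in the traceback above, compiled with the portable flag.
find_tag_or_handler = re.compile(
    r'</?[a-zA-Z]+|\son[a-zA-Z]+\s*=', ASCII_FLAG).search
```

With this, the compile call no longer touches an attribute that is missing on
Python 2, so the module can be imported under either interpreter.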
I recently added support for running the test suite to the lxml package (see
#976148). Enabling the test suite exposes this bug (tested in stretch; buster
should behave similarly):
python2.7 test.py -vv
Traceback (most recent call last):
  File "test.py", line 625, in <module>
    exitcode = main(sys.argv)
  File "test.py", line 562, in main
    test_cases = get_test_cases(test_files, cfg, cov=cov)
  File "test.py", line 268, in get_test_cases
    module = import_module(file, cfg, cov=cov)
  File "test.py", line 209, in import_module
    mod = __import__(modname)
  File "/build/lxml-3.7.1/src/lxml/html/tests/test_clean.py", line 6, in <module>
    from lxml.html.clean import Cleaner, clean_html
  File "/build/lxml-3.7.1/src/lxml/html/clean.py", line 73, in <module>
    r'</?[a-zA-Z]+|\son[a-zA-Z]+\s*=', re.ASCII).search
AttributeError: 'module' object has no attribute 'ASCII'
With the patch applied, the tests run, although some of the clean tests fail,
probably because the last update didn't backport the test suite changes (which
was not a problem at the time, as the tests weren't being run).
Roberto, my changes for stretch are in [3]. Would you like to take a look at
this and finish it (probably backporting the test changes from [2]) or should I?
Moritz, if you want I can look at buster too.
Cheers,
Emilio
[1] https://github.com/lxml/lxml/commit/4cb57362deb23bca0f70f41ab1efa13390fcdbb1
[2] https://github.com/lxml/lxml/commit/a105ab8dc262ec6735977c25c13f0bdfcdec72a7
[3] https://people.debian.org/~pochu/lxml_3.7.1-1+deb9u3.dsc