Re: python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

Stefan Behnel Sat, 18 Aug 2012 11:00:14 -0700

Dmitry Arsentiev, 15.08.2012 14:49:
> Has anybody already meet the problem like this? -
> AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
> 
> When I run scrapy, I get
> 
>   File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py",
> line 14, in <module>
>     libxml2.HTML_PARSE_NOERROR + \
> AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
> 
> 
> When I run
>  python -c 'import libxml2; libxml2.HTML_PARSE_RECOVER'
> 
> I get
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
> AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
> 
> How can I cure it?
> 
> Python 2.7
> libxml2-python 2.6.9
> 2.6.11-gentoo-r6


That version of libxml2 is way too old and doesn't support parsing
real-world HTML. IIRC, that started with 2.6.21 and got improved a bit
after that.

Get a 2.8.0 installation, as someone pointed out already.

Stefan


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

Reply via email to