tags 434094 + upstream moreinfo
thanks
> The data processed should be http://www.cguelich.de/
Hi. I've taken over this package and am cleaning up its bugs.
I've tried all the URLs listed on this page, with the attached test
script, and none of them trigger this bug. I'm assuming it's due to
incorrect encoding detection, but it would help to have a test case.
I do see HTMLParseErrors, but those are due to the poor quality of
HTMLParser, which BeautifulSoup 3.1 uses. Versions 3.0 and 3.2 don't have
these issues.
It looks like other people have run into similar issues:
http://groups.google.com/group/beautifulsoup/search?q=concatenate+NoneType
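
To help narrow down the encoding-detection theory independently of BeautifulSoup, here is a minimal sketch that reports which charset a page declares and whether the raw bytes actually decode with it. It uses only the standard library. The helper names (declared_charset, decodes_cleanly) are hypothetical and are not part of BeautifulSoup, and the regular expressions are a rough approximation of charset sniffing, not what BeautifulSoup does internally.

```python
import re

# Hypothetical helpers for triage; not part of BeautifulSoup.
# Rough pattern for a charset named inside a <meta ...> tag in the raw bytes.
META_CHARSET = re.compile(br'<meta[^>]+charset=["\']?([A-Za-z0-9_.:-]+)', re.I)


def declared_charset(content_type, body):
    """Return the charset from the Content-Type header, else from a
    <meta> tag near the start of the body, else None."""
    match = re.search(r'charset=["\']?([A-Za-z0-9_.:-]+)',
                      content_type or '', re.I)
    if match:
        return match.group(1).lower()
    # Only look at the first few KB, where a declaration would normally sit.
    match = META_CHARSET.search(body[:4096])
    if match:
        return match.group(1).decode('ascii').lower()
    return None


def decodes_cleanly(body, charset):
    """True if body decodes with charset; False if decoding fails or the
    charset is missing or unknown to Python."""
    if charset is None:
        return False
    try:
        body.decode(charset)
    except (UnicodeDecodeError, LookupError):
        return False
    return True
```

A page where declared_charset() returns None, or where decodes_cleanly() returns False for the declared charset, would be a good candidate to save to disk and attach as a test case.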
SR
--
Stefano Rivera
http://tumbleweed.org.za/
H: +27 21 465 6908 C: +27 72 419 8559 UCT: x3127
#!/usr/bin/env python
# Fetch each URL from the bug report and parse it with BeautifulSoup,
# printing a traceback for any page that fails.
import traceback
import urllib2

import BeautifulSoup

for url in ('http://www.vupp.cz/czvupp/', 'http://www.cguelich.de/',
            'http://www.singular-tech.com/', 'http://www.presse-citron.net/',
            'http://blogs.bnet.com/business-books/?p=327',
            'http://peaceclub.de/2007/11/26/z-grabstein/'):
    print url
    try:
        data = urllib2.urlopen(url).read()
        BeautifulSoup.BeautifulSoup(data)
    except Exception:
        # Report the failure and move on to the next URL.
        traceback.print_exc()