elementtidy, \0 chars and parsing from a string

Steven Bethard Tue, 09 May 2006 13:55:43 -0700

So I see that elementtidy doesn't like strings with \0 characters in them:

 >>> import urllib
 >>> from elementtidy import TidyHTMLTreeBuilder
 >>> url = 'http://news.bbc.co.uk/1/hi/world/europe/492215.stm'
 >>> url_file = urllib.urlopen(url)
 >>> tree = TidyHTMLTreeBuilder.parse(url_file)
Traceback (most recent call last):
   ...
   File "...elementtidy\TidyHTMLTreeBuilder.py", line 90, in close
     stdout, stderr = _elementtidy.fixup(*args)
TypeError: fixup() argument 1 must be string without null bytes, not str


The obvious solution would be to call str.replace('\0', '') on the file's 
text, but I'm not sure how to ask elementtidy to parse from a string 
instead of a file-like object.  Do I need to wrap it in a StringIO, or 
is there a better way?
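
For reference, the wrapping approach would look roughly like the sketch below. It only assumes what the session above shows, namely that TidyHTMLTreeBuilder.parse() accepts a file-like object. The helper name strip_nulls and the sample bytes are made up for illustration, and the elementtidy call is left commented out so the snippet runs without the library. It uses io.BytesIO; on older Python 2 releases StringIO.StringIO fills the same role.

```python
import io

def strip_nulls(file_obj):
    """Read file_obj fully and return a new file-like object with the
    NUL bytes removed (hypothetical helper, not part of elementtidy)."""
    data = file_obj.read()
    return io.BytesIO(data.replace(b'\0', b''))

# Stand-in for the downloaded page; the real input would be url_file.
raw = io.BytesIO(b'<html><body>Hello\0 world</body></html>')
cleaned = strip_nulls(raw)
cleaned_bytes = cleaned.getvalue()

# With elementtidy installed, the cleaned object could then be parsed
# the same way as the original file object:
# tree = TidyHTMLTreeBuilder.parse(cleaned)
```

This reads the whole page into memory before parsing, which should be fine for a single HTML document.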

STeVe
-- 
http://mail.python.org/mailman/listinfo/python-list
