I've searched Google for this but, amazingly, I seem to be the first person 
ever to need it!  Everyone else seems to need the reverse.  Actually, 
I did find some people who complained about this and rolled their own 
solutions, but I refuse to believe that Python doesn't have a built-in 
solution to what must be a very common problem.
    So, how do I convert HTML to plaintext?  Something like this:


<div>This&nbsp;is&nbsp;a&nbsp;string.</div>


    ...into:


This is a string.


    Actually, the ideal would be a function that takes an HTML string and 
converts it into the text that the HTML would render as.  For instance, 
converting:


<div>This &amp;    that
or the other thing.</div>


    ...into:


This & that or the other thing.


    ...since HTML seems to collapse any run of whitespace, of whatever kind, 
into a single space (a bizarre design choice if I've ever seen one).
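
    To pin down the behaviour described above, here is a rough sketch. It 
is only an illustration built on the standard library's html.parser module. 
The names html_to_text and _TextExtractor are made up for this example and 
are not existing functions. It assumes Python 3 and simple markup: it 
drops tags, decodes entities, and collapses whitespace, but it does not 
skip the contents of <script> or <style> elements.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects character data and ignores the tags."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode references such as
        # &amp; and &nbsp; before the text reaches handle_data().
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Hypothetical function: return the text of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.chunks)
    # str.split() with no arguments splits on any run of whitespace,
    # including the non-breaking space that &nbsp; decodes to.
    return " ".join(text.split())


print(html_to_text("<div>This&nbsp;is&nbsp;a&nbsp;string.</div>"))
print(html_to_text("<div>This &amp;    that\nor the other thing.</div>"))
```

    Note that this turns non-breaking spaces into ordinary spaces, which 
matches the first example but is not what a browser does when rendering.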
    Surely, Python can already do this, right?
    Thank you... 


-- 
http://mail.python.org/mailman/listinfo/python-list
