Re: Clean "Durty" strings

Marc 'BlackJack' Rintsch Mon, 02 Apr 2007 10:51:04 -0700

In <[EMAIL PROTECTED]>, irstas wrote:

> I'd like to see how this transformation can be done with
> BeautifulSoup. Well, the last two regexps can be replaced with this:
> 
> unicode(BeautifulStoneSoup(s,convertEntities=BeautifulStoneSoup.HTML_ENTITIES).contents[0])


Completely without regular expressions:

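from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3.x

def main():
    # `source` is the HTML string to clean; join every text node, then
    # collapse each run of whitespace into a single space.
    soup = BeautifulSoup(source, convertEntities=BeautifulSoup.HTML_ENTITIES)
    print ' '.join(''.join(soup(text=True)).split())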
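
For comparison, here is a rough sketch of the same idea using only the standard library of Python 3, in case BeautifulSoup is not at hand. This swaps in html.parser.HTMLParser for BeautifulSoup, so it is not the code from this thread; the names TextCollector and clean are made up for illustration, and the parser may treat badly broken markup differently.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers text nodes and ignores all tags."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before
        # the text reaches handle_data().
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text between tags.
        self.parts.append(data)


def clean(source):
    """Return the text of `source` with whitespace runs collapsed."""
    parser = TextCollector()
    parser.feed(source)
    parser.close()  # flush any text still buffered by the parser
    # Same final step as above: join the text nodes, split on any
    # whitespace, and rejoin with single spaces.
    return ' '.join(''.join(parser.parts).split())
```

Calling clean('<p>Hello &amp;   <b>world</b>\n</p>') should give 'Hello & world'.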

Ciao,
        Marc 'BlackJack' Rintsch
-- 
http://mail.python.org/mailman/listinfo/python-list