2007/7/12, Andre Engels <[EMAIL PROTECTED]>:

I forgot to include

import urllib2, re

at the top, before the function definition:

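> def textonly(url):
>    # Fetch the HTML source at url and return it with all tags stripped
>    f = urllib2.urlopen(url)
>    text = f.read()
>    # Match one tag: '<', any run of non-bracket characters, '>'
>    r = re.compile(r'<[^<>]*>')
>    newtext = r.sub('', text)
>    # Repeat until nothing changes, so nested brackets like '<<b>>' go too
>    while newtext != text:
>       text = newtext
>       newtext = r.sub('', text)
>    return text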
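
To try the tag-stripping step without a network connection, here is a minimal sketch that applies the same regular expression to a string already in memory. The names strip_tags and TAG are hypothetical and do not appear in the original function; the fetching part via urllib2 is left out on purpose.

```python
import re

# Same pattern as in textonly(): '<', any run of non-bracket characters, '>'
TAG = re.compile(r'<[^<>]*>')

def strip_tags(text):
    # Hypothetical helper: substitute repeatedly until the text stops changing
    newtext = TAG.sub('', text)
    while newtext != text:
        text = newtext
        newtext = TAG.sub('', text)
    return text

print(strip_tags('<p>Hello <b>world</b></p>'))  # prints: Hello world
```

The loop matters for input such as 'a <<b>> c': the first pass removes only the inner '<b>', leaving '<>', which the second pass then removes. Note that this approach strips tags only; the contents of script and style elements, and HTML entities, are left in the result.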


-- 
Andre Engels, [EMAIL PROTECTED]
ICQ: 6260644  --  Skype: a_engels
-- 
http://mail.python.org/mailman/listinfo/python-list
