html 2 plain text

robin Sun, 28 May 2006 11:21:39 -0700

hi,
i remember seeing a simple python function that would take raw html
and output the content (body?) of the page as plain text (no <..> tags
etc.).
i have been looking at htmllib and HTMLParser, but this all seems too
complicated for what i'm looking for. i just need the main text in the
body of some arbitrary webpage so i can do some natural-language
processing on it...
thanks for pointing me to some helpful resources!


robin
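
For illustration, here is a minimal sketch of the kind of function asked about above. It is not the function robin remembers. It assumes Python 3, where the standard-library parser lives in html.parser. The names TextExtractor and html_to_text, and the SKIP and BLOCK tag sets, are hypothetical choices made for this example. The sketch drops the contents of <script>, <style>, and <title>, treats a handful of block-level tags as word breaks, and collapses whitespace.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes from an HTML document (illustrative helper)."""

    # Assumption: content inside these tags is never wanted as page text.
    SKIP = {"script", "style", "title"}
    # Assumption: a small, incomplete set of tags that should separate words.
    BLOCK = {"p", "div", "br", "li", "tr", "td", "th",
             "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &amp; etc. for us
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            if self.skip_depth > 0:
                self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append(" ")

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

Calling html_to_text(raw_html) on a page's source returns one whitespace-normalized string that can be passed on to a tokenizer. The sketch does not try to separate the "main" article text from navigation or footers, and badly malformed markup (for example an unclosed <script>) can swallow the text that follows it.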

-- 
http://mail.python.org/mailman/listinfo/python-list
