"Tempo" <[EMAIL PROTECTED]> writes:

> In my last post I received some advice to use urllib.read() to get a
> whole html page as a string, which will then allow me to use
> BeautifulSoup to do what I want with the string. But when I was
> researching the 'urllib' module I couldn't find anything about its
> sub-section '.read()' ? Is that the right module to get a html page
> into a string? Or am I completely missing something here? I'll take
> this as the more likely of the two cases. Thanks for any and all help.
Here's a short example of how this all works:

    #!/usr/bin/env python
    import urllib2
    from BeautifulSoup import BeautifulSoup

    response = urllib2.urlopen('http://www.cnn.com')
    soup = BeautifulSoup(response)
    print soup.prettify()

It's not a particularly useful example, unless, of course, you wish to
prettify cnn's html, but it should get you to the point where
BeautifulSoup's documentation starts to make sense.

Jason
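
On the original question: read() is not a function of the urllib module itself. It is a method of the file-like response object that urlopen() returns, which is why it does not show up in the module's list of functions. For readers on Python 3, where urllib2 was merged into urllib.request, a minimal sketch of "get a page as a string" might look like the following. The helper name fetch_html and its default_encoding parameter are made up for illustration and are not part of any library.

```python
from urllib.request import urlopen


def fetch_html(url, default_encoding="utf-8"):
    """Return the body found at `url` as a str.

    fetch_html is a hypothetical helper name, not a library function.
    """
    with urlopen(url) as response:
        # read() lives on the response object, not on the module.
        raw = response.read()  # bytes on Python 3
        # Use the charset from the Content-Type header when there is one,
        # otherwise fall back to the (assumed) default encoding.
        charset = response.headers.get_content_charset() or default_encoding
    return raw.decode(charset, errors="replace")


# Example call (needs network access):
#     html = fetch_html("http://www.cnn.com")
```

The resulting string can then be handed to BeautifulSoup. With the current bs4 package that would be along the lines of BeautifulSoup(html, "html.parser"), assuming bs4 is installed.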