Victor Subervi wrote:
> On Wed, Jan 6, 2010 at 1:27 PM, Tim Chase <python.l...@tim.thechases.com> wrote:
>> But if you're using it on HTML form text, regexps are usually the wrong
>> tool, and you should be using an HTML parser (such as BeautifulSoup) that
>> knows how to handle odd text and escapings better and more robustly than
>> regexps will.
>
> I have an automatically generated HTML form from which I need to extract
> data to the script which this form calls (to which the information is sent).
> I believe BeautifulSoup is geared to scraping pages that exist permanently
> on the web. By the time BeautifulSoup was called, this page would be gone.

BeautifulSoup takes string data fed to it, and builds a structure
that can be neatly navigated. That string data can come from a
web page, from a disk, or even a serial port, a
random-character-generator, or just from HTML that's built up in
memory and never sees a network or a disk. It's worth reading
its documentation[1] and trying its examples to get familiar with it.
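
As a rough sketch of the point above, here is BeautifulSoup parsing HTML
that only ever exists as a string in memory. It assumes the current bs4
package (BeautifulSoup 4) rather than the release available when this was
written, and the form markup, field names, and action URL are all made up
for illustration:

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 (the bs4 package)

# Hypothetical form markup, built as a plain string in memory.
# It never touches a network or a disk.
html = """
<form action="/cgi-bin/process.py" method="post">
  <input type="text" name="customer" value="Tom &amp; Jerry">
  <input type="hidden" name="order_id" value="42">
</form>
"""

# Parse the string with the standard library's parser backend.
soup = BeautifulSoup(html, "html.parser")

# Collect each input's name/value pair; the parser unescapes entities
# such as &amp; so no hand-written regexp is needed for that.
fields = {tag["name"]: tag.get("value", "") for tag in soup.find_all("input")}
print(fields)
```

The same call works whether the string came from a file, a socket, or a
template you rendered a moment ago.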
-tkc
[1] http://www.crummy.com/software/BeautifulSoup/documentation.html