--- Steven Bethard <[EMAIL PROTECTED]> wrote:

> Rob Wolfe wrote:
> > Steve Howell wrote:
> >
> >> I suggested earlier that maybe we post multiple
> >> solutions. That makes me a little nervous, to the
> >> extent that it shows that the Python community has a
> >> hard time coming to consensus on tools sometimes.
> >
> > We agree that BeautifulSoup is the best for parsing HTML. :)
> >
> >> This is not a completely unfair knock on Python,
> >> although I think the reason multiple solutions tend to
> >> emerge for this type of thing is precisely due to the
> >> simplicity and power of the language itself.
> >>
> >> So I don't know. What about trying to agree on an XML
> >> parsing example instead?
> >>
> >> Thoughts?
> >
> > I vote for example with ElementTree (without xpath)
> > with a mention of using ElementSoup for invalid HTML.
>
> Sounds good to me. Maybe something like::
>
>     import xml.etree.ElementTree as etree
>     dinner_recipe = '''
>     <ingredients>
>     <ing><amt><qty>24</qty><unit>slices</unit></amt><item>baguette</item></ing>
>     <ing><amt><qty>2+</qty><unit>tbsp</unit></amt><item>olive_oil</item></ing>
>     <ing><amt><qty>1</qty><unit>cup</unit></amt><item>tomatoes</item></ing>
>     <ing><amt><qty>1-2</qty><unit>tbsp</unit></amt><item>garlic</item></ing>
>     <ing><amt><qty>1/2</qty><unit>cup</unit></amt><item>Parmesan</item></ing>
>     <ing><amt><qty>1</qty><unit>jar</unit></amt><item>pesto</item></ing>
>     </ingredients>'''
>     pantry = set(['olive oil', 'pesto'])
>     tree = etree.fromstring(dinner_recipe)
>     for item_elem in tree.getiterator('item'):
>         if item_elem.text not in pantry:
>             print item_elem.text
>
> Though I wouldn't know where to put the ElementSoup
> link in this one...
>

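For anyone running the quoted example on Python 3, here is a minimal sketch of the same idea, not a definitive version. Assumptions on my part: print is called as a function; Element.iter() stands in for getiterator(), which was removed in Python 3.9; the recipe is trimmed to three ingredients to keep it short; the pantry entry is spelled 'olive_oil' so that it matches the <item> text in the XML (the quoted code has 'olive oil', which would never match); and the helper name missing_items is made up for illustration.

```python
import xml.etree.ElementTree as etree

dinner_recipe = '''
<ingredients>
<ing><amt><qty>24</qty><unit>slices</unit></amt><item>baguette</item></ing>
<ing><amt><qty>2+</qty><unit>tbsp</unit></amt><item>olive_oil</item></ing>
<ing><amt><qty>1</qty><unit>jar</unit></amt><item>pesto</item></ing>
</ingredients>'''

# Assumption: spelled with an underscore so it matches the <item> text above.
pantry = {'olive_oil', 'pesto'}


def missing_items(recipe_xml, pantry):
    """Return the <item> texts in recipe_xml that are not in pantry.

    Hypothetical helper, named only for this sketch.
    """
    tree = etree.fromstring(recipe_xml)
    # Element.iter('item') visits every <item> element in document order.
    return [item.text for item in tree.iter('item')
            if item.text not in pantry]


for name in missing_items(dinner_recipe, pantry):
    print(name)
```

With the trimmed recipe and pantry above, the only line printed is baguette.
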
Whatever makes the most sense, please post it. Sorry for not responding earlier.

--
http://mail.python.org/mailman/listinfo/python-list