Bruno Desthuilliers wrote:
> [EMAIL PROTECTED] wrote:
>> On Jul 14, 12:47 pm, Nikola Skoric <[EMAIL PROTECTED]> wrote:
>>> I'm using sgmllib.SGMLParser to parse HTML. I have successfully parsed
>>> start tags by implementing a start_something method. But now I have to
>>> fetch the string between the start tag and the end tag too. I have been
>>> reading through the SGMLParser documentation, but just can't figure
>>> that out... can somebody help? :-)
>>>
>>> --
>>> "Now the storm has passed over me
>>> I'm left to drift on a dead calm sea
>>> And watch her forever through the cracks in the beams
>>> Nailed across the doorways of the bedrooms of my dreams"
>>
>> Oi! Try Beautiful Soup instead. That seems to be the de facto HTML
>> parser for Python:
>
> Nope. It's the de facto parser for HTML-like tag soup !-)
Very true. As long as you're dealing with something that looks pretty much
like HTML, I actually don't think you can beat lxml.html (and it's still
getting better every day).

Stefan
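
On the original question: with the callback-style parsers, the text between a start tag and its end tag is delivered through handle_data(), which sgmllib.SGMLParser provides as well. Below is a minimal sketch of the idea, assuming Python 3, where sgmllib no longer exists, so it uses the standard library's html.parser.HTMLParser instead. The names TagTextParser and text_inside are made up for this illustration and do not come from any library.

```python
from html.parser import HTMLParser


class TagTextParser(HTMLParser):
    """Collect the text found between <tag> and its matching </tag>.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, tag):
        super().__init__()
        self.tag = tag        # tag name to watch for (lowercase)
        self.depth = 0        # > 0 while we are inside the watched tag
        self.buffer = []      # text pieces seen inside the current element
        self.results = []     # one string per completed outermost element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                self.buffer = []   # start collecting for a new element
            self.depth += 1

    def handle_data(self, data):
        # Called with the text between tags; keep it only while inside.
        if self.depth > 0:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.buffer))


def text_inside(html, tag):
    """Return the text content of every outermost <tag> element in html."""
    parser = TagTextParser(tag)
    parser.feed(html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    print(text_inside("<p>Hello <b>world</b></p><p>again</p>", "p"))
```

The same pattern should carry over to sgmllib on Python 2: set a flag in start_something(), append to a buffer in handle_data(), and read the buffer in end_something().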