Re: getting text inside the HTML tag

Stefan Behnel Mon, 16 Jul 2007 01:11:10 -0700

Bruno Desthuilliers wrote:
> [EMAIL PROTECTED] wrote:
>> On Jul 14, 12:47 pm, Nikola Skoric <[EMAIL PROTECTED]> wrote:
>>> I'm using sgmllib.SGMLParser to parse HTML. I have successfully parsed
>>> start tags by implementing a start_something method. But now I have to
>>> fetch the string between the start tag and the end tag too. I have been
>>> reading through the SGMLParser documentation, but just can't figure that
>>> out... can somebody help? :-)
>>>
>>> -- 
>>> "Now the storm has passed over me
>>> I'm left to drift on a dead calm sea
>>> And watch her forever through the cracks in the beams
>>> Nailed across the doorways of the bedrooms of my dreams"
>>
>> Oi! Try Beautiful Soup instead. That seems to be the de facto HTML
>> parser for Python:
> 
> Nope. It's the de facto parser for HTML-like tag soup !-)


Very true. As long as you're dealing with something that looks pretty much
like HTML, I actually don't think you can beat lxml.html (and it's still
getting better every day).
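
For the original question (collecting the text between a start tag and its
end tag with a callback-style parser), here is a minimal sketch. It assumes
Python 3, where sgmllib no longer exists, so it uses the standard library's
html.parser.HTMLParser, which has the same style of callbacks. The names
TagTextCollector and text_inside are made up for illustration and are not
part of any library.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect the text found between <tag> and </tag> (hypothetical helper)."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # > 0 while we are inside the target tag
        self.buffer = []    # text pieces seen inside the current target tag
        self.results = []   # one string per completed outermost target tag

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                self.buffer = []
            self.depth += 1

    def handle_data(self, data):
        # Called for text nodes; keep them only while inside the target tag.
        if self.depth:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.buffer))


def text_inside(html, tag):
    """Return the text content of every outermost <tag> element in html."""
    parser = TagTextCollector(tag)
    parser.feed(html)
    parser.close()
    return parser.results
```

Calling text_inside("<p>Hello <b>world</b></p>", "p") collects the text of the
paragraph, including text from nested tags. The idea is the same under
sgmllib: set a flag in the start handler, accumulate in handle_data, and
flush in the end handler.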

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list
