Re: Ask how to use HTMLParser

h0uk Thu, 07 Jan 2010 20:47:40 -0800

On 8 Jan, 08:44, Water Lin <water...@ymail.invalid> wrote:
> I am new to Python, and I want to parse an HTML page. I
> tried to use HTMLParser. Here is my sample code:
> ----------------------
> from HTMLParser import HTMLParser
> from urllib2 import urlopen
>
> class MyParser(HTMLParser):
>     title = ""
>     is_title = ""
>     def __init__(self, url):
>         HTMLParser.__init__(self)
>         req = urlopen(url)
>         self.feed(req.read())
>
>     def handle_starttag(self, tag, attrs):
>         # attrs is a list of (name, value) pairs and may be empty
>         if tag == 'div' and dict(attrs).get('class') == 'articleTitle':
>             print "Found link => %s" % dict(attrs)['class']
>             self.is_title = 1
>
>     def handle_data(self, data):
>         if self.is_title:
>             print "here"
>             self.title = data
>             print self.title
>             self.is_title = 0
> -----------------------
>
> For the tag
> -------
> <div class="articleTitle">open article title</div>
> -------
>
> I use my code to parse it. I can locate the div tag, but I don't know how
> to get the tag's text, which is "open article title" in my example.
>
> How can I get the HTML content? What's wrong with my handle_data function?
>
> Thanks
>
> Water Lin
>
> --
> Water Lin's notes and pencils:http://en.waterlin.org
> Email: water...@ymail.com


I just want to say that your code works well.
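
For reference, here is a minimal sketch of the same idea, assuming Python 3, where the module is html.parser rather than HTMLParser. The class name TitleParser and its attributes are made up for illustration and are not from the original post. It looks up the class attribute by name instead of by position, and collects text until the closing tag, since handle_data can be called more than once for a single element.

```python
from html.parser import HTMLParser  # Python 3 name of the Python 2 HTMLParser module


class TitleParser(HTMLParser):
    """Collect the text inside <div class="articleTitle"> elements."""

    def __init__(self):
        super().__init__()
        self.titles = []        # finished title strings
        self._in_title = False  # True while inside the target div
        self._chunks = []       # text pieces seen inside the current div

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs and may be empty
        if tag == "div" and dict(attrs).get("class") == "articleTitle":
            self._in_title = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self._in_title:
            self.titles.append("".join(self._chunks).strip())
            self._in_title = False


parser = TitleParser()
parser.feed('<div class="articleTitle">open article title</div>')
print(parser.titles)  # ['open article title']
```

This sketch does not handle a div nested inside the title div; the first closing div tag ends the title.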
