"Johnny Lee" <[EMAIL PROTECTED]> writes:
> Fredrik Lundh wrote:
[...]
> With the HTMLParser, there is another problem (taking my code as an example):
>
> import urllib
> import formatter
> import htmllib
> parser = htmllib.HTMLParser(formatter.NullFormatter())
> parser.feed(urllib.urlopen(baseUrl).read())
> parser.clos
"Fredrik Lundh" <[EMAIL PROTECTED]> writes:
[...]
> or, if you're going to parse HTML pages from many different sources, a
> real parser:
>
> from HTMLParser import HTMLParser
>
> class MyHTMLParser(HTMLParser):
>
>     def handle_starttag(self, tag, attrs):
>         if tag == "
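The quoted excerpt is cut off at the tag comparison, so what it tests for is not recoverable here. As a rough sketch of the same approach, the example below assumes the goal is to collect the `href` value of every `<a>` start tag. It uses `html.parser`, the Python 3 location of the `HTMLParser` class that the quoted code imports from the Python 2 `HTMLParser` module. The class name `LinkCollector`, the `links` attribute, and the sample markup are hypothetical.

```python
from html.parser import HTMLParser  # Python 3 location of the HTMLParser class


class LinkCollector(HTMLParser):
    """Hypothetical subclass: records the href of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attrs is a list of
        # (name, value) pairs, where value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# Made-up sample markup, including an uppercase tag and an unquoted attribute.
sample = ('<p><a href="http://a.example/">one</a> '
          '<A HREF="http://b.example/" id=x>two</A></p>')

collector = LinkCollector()
collector.feed(sample)
collector.close()
print(collector.links)
```

Because the parser normalizes case and copes with unquoted attributes, this kind of subclass tends to be more forgiving than a hand-written pattern when pages come from many different sources.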
Fredrik Lundh wrote:
> ".*" gives the longest possible match (you can think of it as searching
> backwards from the right end). if you want to search for "everything until
> a given character", searching for "[^x]*x" is often a better choice than
> ".*x".
>
> in this case, I suggest using some
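The difference between ".*x" and "[^x]*x" can be seen in a small sketch. The sample string and the `href="..."` attribute layout below are assumptions made for illustration only; the original post's markup is not fully recoverable from the excerpt.

```python
import re

# Hypothetical sample text with two quoted URLs on one line.
text = ('<a href="http://a.example/" class="x">one</a> '
        '<a href="http://b.example/">two</a>')

# Greedy: ".*" runs to the end of the line and backtracks to the LAST
# double quote, so both links are swallowed into a single match.
greedy = re.findall(r'href="(.*)"', text)

# Negated character class: '[^"]*' cannot cross a double quote, so each
# match stops at the first closing quote after href=".
bounded = re.findall(r'href="([^"]*)"', text)

print(len(greedy))
print(bounded)
```

The negated class also avoids backtracking, since the pattern can never run past the delimiter it is looking for.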
Johnny Lee wrote:
> I've run into a problem matching a regular expression in Python. I hope
> one of you can help me. Here are the details:
>
> I have many tags like this:
> xxxhttp://xxx.xxx.xxx"; xxx>xxx
> xx
> xxxhttp://xxx.xxx.xxx"; xxx>xxx
> .
> And I want to find
Hi,
I've run into a problem matching a regular expression in Python. I hope
one of you can help me. Here are the details:
I have many tags like this:
xxxhttp://xxx.xxx.xxx"; xxx>xxx
xx
xxxhttp://xxx.xxx.xxx"; xxx>xxx
.
And I want to find all the "http://xxx.xxx.