Re: Help on regular expression match

2005-09-24 Thread John J. Lee
"Johnny Lee" <[EMAIL PROTECTED]> writes:
> Fredrik Lundh wrote:
[...]
> To the HTMLParser, there is another problem (take my code for example):
>
> import urllib
> import formatter
> parser = htmllib.HTMLParser(formatter.NullFormatter())
> parser.feed(urllib.urlopen(baseUrl).read())
> parser.clos

Re: Help on regular expression match

2005-09-24 Thread John J. Lee
"Fredrik Lundh" <[EMAIL PROTECTED]> writes:
[...]
> or, if you're going to parse HTML pages from many different sources, a
> real parser:
>
> from HTMLParser import HTMLParser
>
> class MyHTMLParser(HTMLParser):
>
>     def handle_starttag(self, tag, attrs):
>         if tag == "
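
The quoted parser snippet is cut off in this listing, so here is a minimal sketch of the same idea rather than the original code. It assumes Python 3, where the module is `html.parser` instead of `HTMLParser`. The class name `LinkCollector`, the sample markup, and the `http://` filter are all made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags (hypothetical helper, not from the thread)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and value.startswith("http://"):
                    self.links.append(value)


# Made-up sample input; the real pages in the thread are not shown.
sample = (
    '<p><a href="http://a.example/1" class="x">one</a> '
    '<a href="wrong">bad</a> '
    '<a href="http://b.example/2">two</a></p>'
)

collector = LinkCollector()
collector.feed(sample)
collector.close()
print(collector.links)
```

Because the parser hands over attributes already split into name and value, the extra attributes after `href` do not matter here.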

Re: Help on regular expression match

2005-09-23 Thread Johnny Lee
Fredrik Lundh wrote:
> ".*" gives the longest possible match (you can think of it as searching
> backwards from the right end). if you want to search for "everything until a
> given character", searching for "[^x]*x" is often a better choice than ".*x".
>
> in this case, I suggest using some
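
To make the quoted point concrete, here is a small sketch comparing the greedy ".*" form with the "[^x]*x" form, using a double quote as the stop character. The sample text and the two patterns are assumptions for illustration, not the patterns from the original post.

```python
import re

# Made-up sample line with two links; the real input is not shown in the thread.
text = (
    '<a href="http://a.example/1" class="x">one</a> '
    '<a href="http://b.example/2">two</a>'
)

# Greedy: ".*" runs to the end of the line, then backs up to the LAST quote,
# so a single match swallows everything between the first and last link.
greedy = re.findall(r'href="(.*)"', text)

# Negated class: "[^"]*" cannot cross a quote, so each match stops at the
# first closing quote after href=".
bounded = re.findall(r'href="([^"]*)"', text)

print(len(greedy), greedy[0][:25])
print(bounded)
```

The bounded pattern returns one entry per link, while the greedy one returns a single over-long match.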

Re: Help on regular expression match

2005-09-22 Thread Fredrik Lundh
Johnny Lee wrote:
> I've run into a problem matching a regular expression in Python. I hope
> one of you can help me. Here are the details:
>
> I have many tags like this:
> xxxhttp://xxx.xxx.xxx"; xxx>xxx
> xx
> xxxhttp://xxx.xxx.xxx"; xxx>xxx
> .
> And I want to find

Help on regular expression match

2005-09-22 Thread Johnny Lee
Hi,
I've run into a problem matching a regular expression in Python. I hope one of you can help me. Here are the details:
I have many tags like this:
xxxhttp://xxx.xxx.xxx"; xxx>xxx
xx
xxxhttp://xxx.xxx.xxx"; xxx>xxx
.
And I want to find all the "http://xxx.xxx.