Hi,
   I've run into a problem matching a regular expression in Python. I
hope one of you can help me. Here are the details:

   I have many tags like this:
      xxx<a href="http://xxx.xxx.xxx" xxx>xxx
      xxx<a href="wap://xxx.xxx.xxx" xxx>xxx
      xxx<a href="http://xxx.xxx.xxx" xxx>xxx
      .....
   I want to extract all the "http://xxx.xxx.xxx" URLs, so I do it
like this:
      httpPat = re.compile("(<a )(href=\")(http://.*)(\")")
      result = httpPat.findall(data)
   I print the results like this to check them:
      for i in result:
         print i[2]
   Surprisingly, I get some output like this:
      http://xxx.xxx.xxx">xxx</a>xxx
   It comes from this kind of source:
      <a href="http://xxx.xxx.xxx">xxx</a>xxx"
   Some of the results are correct, though. How can I get all the
matches as clean as "http://xxx.xxx.xxx"? Thanks for your help.
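
   In case it helps, here is a minimal script that reproduces what I
see. It is only a sketch: the sample `data` string and the `.example`
host names are made up for illustration and are not my real input. The
pattern is the same one as above.

```python
import re

# Made-up sample input (an assumption, not the real data):
# line 1 has no further quote after the URL, line 2 is a wap:// link,
# line 3 has a stray quote at the end of the line.
data = (
    'xxx<a href="http://yyy.example" xxx>xxx\n'
    'xxx<a href="wap://zzz.example" xxx>xxx\n'
    '<a href="http://xxx.example">xxx</a>xxx"\n'
)

# Same pattern as in my message above.
httpPat = re.compile("(<a )(href=\")(http://.*)(\")")
result = httpPat.findall(data)

# findall returns one tuple per match; index 2 is the third group.
urls = [m[2] for m in result]
for u in urls:
    print(u)
# prints:
#   http://yyy.example
#   http://xxx.example">xxx</a>xxx
```

The first line comes out clean, the wap:// line is skipped as expected,
and the third line runs on up to the last quote on that line.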


Regards,
Johnny

