Re: regular expression, help

MRAB Tue, 27 Jan 2009 09:40:11 -0800

Vincent Davis wrote:

I think there are two parts to this question, and I am sure there is a lot I am missing. I am hoping an example will help me. I have an HTML doc that I am trying to use regular expressions to get a value out of.
Here is an example of the line:
<td colspan='2'>Parcel ID: 39-034-15-009 </td>
I want to get the number "39-034-15-009" after "Parcel ID:". The number will be different each time but always in the same format. I think I can match "Parcel ID:" but am not sure how to get the number after it. "Parcel ID:" only occurs once in the document.
Is this how I need to start?
pid = re.compile('Parcel ID: ')

Basically I am completely lost and am not finding examples I find helpful.
I am getting the HTML using myurl=urllib.urlopen(). Can I use an RE like this: thenum=pid.match(myurl)
I think the two key things I need to know are:
1, how do I get the text after a match?
2, when I use myurl=urllib.urlopen(http://.......), can I use myurl as the string in an RE, thenum=pid.match(myurl)?

Something like:

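import re
import urllib

# The parentheses capture the hyphen-separated digits after "Parcel ID: "
pid = re.compile(r'Parcel ID: (\d+(?:-\d+)*)')
myurl = urllib.urlopen(url)  # url is the page address (Python 2 urllib API)
text = myurl.read()  # the regex needs the page text, not the file-like object
myurl.close()
# search() scans the whole text; group(1) is the captured number
thenum = pid.search(text).group(1)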
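
To see both answers without fetching a page, here is a minimal sketch that runs the same pattern against the sample line from the question. It is written for Python 3, and the names text, m and parcel_id are only illustrative. It shows why search() is used rather than match(), and how the capture group returns the text after "Parcel ID: ".

```python
import re

# Sample line copied from the question; in real use this would be the page text.
text = "<td colspan='2'>Parcel ID: 39-034-15-009 </td>"

pid = re.compile(r'Parcel ID: (\d+(?:-\d+)*)')

# match() only tries at the very start of the string, and the line
# starts with "<td", so it finds nothing here.
at_start = pid.match(text)

# search() scans the whole string. group(1) is the part in parentheses.
m = pid.search(text)
parcel_id = m.group(1) if m else None  # guard against a page with no match

print(at_start)   # None
print(parcel_id)  # 39-034-15-009
```

The `if m else None` guard is an addition: search() returns None when the pattern is absent, and calling .group(1) on None raises AttributeError.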

Although BeautifulSoup is the preferred solution.
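
BeautifulSoup is a third-party package, so the sketch below does not use it. As a stand-in it uses Python 3's standard-library html.parser to show the same idea: parse the HTML first, then look only at the text inside the table cells. The class CellText and the function find_parcel_id are hypothetical names made up for this example, and the sketch assumes the <td> cells are not nested.

```python
import re
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the text content of every <td> cell (assumes no nested cells)."""

    def __init__(self):
        super().__init__()
        self.in_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_td = True
            self.cells.append("")  # start a new cell

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False

    def handle_data(self, data):
        if self.in_td:
            self.cells[-1] += data  # text may arrive in several pieces


def find_parcel_id(html):
    """Return the number after 'Parcel ID:' in the first matching cell, or None."""
    parser = CellText()
    parser.feed(html)
    parser.close()
    for cell in parser.cells:
        m = re.search(r'Parcel ID:\s*(\d+(?:-\d+)*)', cell)
        if m:
            return m.group(1)
    return None


sample = "<table><tr><td colspan='2'>Parcel ID: 39-034-15-009 </td></tr></table>"
print(find_parcel_id(sample))  # 39-034-15-009
```

Working on the cell text rather than the raw markup means the match does not depend on how the tag or its attributes are written.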
--
http://mail.python.org/mailman/listinfo/python-list
