I'm trying to scrape a section of HTML and don't see why my regexp stopped working this week. The relevant two-line sample from

http://www.srh.noaa.gov/data/forecasts/SCZ020.php

is:


<td><b>Barometer</b>:</td>
 <td align="right" nowrap>30.22&quot; (1023.1 mb)</td>


(notice the space before <td align="right" nowrap>)


My Perl segment (that was working until this week) is:


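sub barometer {
    local $_ = shift;    # page HTML to search
    # expects a linefeed and exactly one whitespace character between the two cells
    m{<td><b>Barometer</b>:</td>\n\s<td align="right" nowrap>(.*?)&quot;} || die "No barometer data";
    return $1;           # pressure reading, e.g. 30.22
}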


(the match line is one long line, but it's wrapping here in email.)

Now, this match *will* find the pressure if I modify it to:
m{<td align="right" nowrap>(.*?)&quot;} || die "No barometer data";

However, the other weather quantities that I also want to grab have a similar HTML structure but lack the nowrap option, so I need to be able to match across both lines of HTML. Until now, the \n\s matched the linefeed and the single space character as expected, and I can't see any change in the structure that would break the search pattern.
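As a possible fallback, here is a minimal sketch of a whitespace-tolerant version. It assumes the only thing that changed is the whitespace between the two cells (for example a \r\n line ending, or a different number of spaces), which I have not confirmed. The name barometer_loose and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical variant of barometer(): \s* accepts any run of whitespace
# (including \r\n, or none at all) between the two cells, and
# (?:\s[^>]*)? makes the attributes on the second <td> optional,
# so cells without nowrap would match as well.
sub barometer_loose {
    my ($html) = @_;
    $html =~ m{<td><b>Barometer</b>:</td>\s*<td(?:\s[^>]*)?>(.*?)&quot;}
        or die "No barometer data";
    return $1;
}

# Sample from the page, with an ASSUMED CRLF line ending between the cells.
my $sample = qq{<td><b>Barometer</b>:</td>\r\n}
           . qq{ <td align="right" nowrap>30.22&quot; (1023.1 mb)</td>};
print barometer_loose($sample), "\n";
```

The trade-off is that this pattern no longer checks the exact layout, so it would keep matching if the spacing changes again.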

Is there something obvious in the HTML structure that I've missed here? I appreciate any advice you might have.
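One way I could check this myself is to make the whitespace between the two cells visible. The sketch below is only an illustration: show_gap is a hypothetical helper, and the sample string uses an assumed CRLF ending rather than the real bytes from the page.

```perl
use strict;
use warnings;

# Hypothetical helper: capture whatever whitespace sits between the
# Barometer label cell and the next <td>, then print each non-printable
# character (and spaces) as a \xNN escape so it can be inspected.
sub show_gap {
    my ($html) = @_;
    my ($gap) = $html =~ m{<td><b>Barometer</b>:</td>(\s*)<td}
        or return;    # label cell not found at all
    (my $visible = $gap) =~ s/([^\x21-\x7e])/sprintf('\\x%02x', ord $1)/ge;
    return $visible;
}

# ASSUMED sample: carriage return, linefeed, one space between the cells.
my $sample = qq{<td><b>Barometer</b>:</td>\r\n <td align="right" nowrap>30.22&quot;</td>};
print show_gap($sample), "\n";
```

If the result for the live page is anything other than a linefeed followed by one space (\x0a\x20), that would explain why \n\s no longer matches.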

Thank you,
Clint

--
Clint <[EMAIL PROTECTED]>

