On 10/17/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> <td align="right" class="textfield" colspan="2">
>           <span class="sequence-star">
> <span class="titolo-hotel">
> 1  per 1 guest
> <br/>
> <br/>
> Start date Monday, March 12 2007
> <br/>
> End date Tuesday, March 13 2007
> <br/>
>
> My HTML page looks like the one given above, and my RE to
> capture the dates is given below. Kindly suggest the
> best way to do that.
snip

Given that this is HTML, you are in deep trouble.  There is no
guarantee that all of the elements of your match will be on the
same line, or that the page won't use HTML entities instead of the
characters you are expecting, or that any number of other things
won't go wrong when trying to parse HTML by hand.  Not only that, but
there appears to be no semantic markup to help you.  Your best bet is
to go for an eighty percent solution (it will fail on a large portion
of possible cases, but should work for the common ones).  "Start date"
and "End date" appear to be static, which means you can use them to
anchor your search.  The start date pattern should therefore follow
pseudocode that looks like this

the constant "Start date"
space
any weekday
comma
space
start capture 1
any month
space
one or two digits FIXME: can match invalid dates
space
four digits FIXME: does not work prior to the year 1000 or after year 9999
end capture 1

which can be implemented like this

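# join returns a single string, so store it in a scalar
# ("..." stands for the names left out here)
my $weekdays = join '|', qw<Sunday Monday ... Saturday>;
my $months = join '|', qw<January February ... December>;

# \d{1,2} matches one or two digits, as in the pseudocode
/Start date (?:$weekdays), ((?:$months) \d{1,2} \d{4})/;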
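
For completeness, here is a rough, self-contained sketch of the same
idea with the day and month lists written out in full.  The sample
text is modeled on the quoted page, and extract_date is a helper name
I made up for illustration; neither comes from your real code.  It
makes the same assumptions as the pattern above (label, weekday and
date all on one line, single spaces, no HTML entities), so treat it
as the eighty percent solution and nothing more.

```perl
use strict;
use warnings;

# Alternations built from the full English day and month names.
my $weekdays = join '|', qw<Sunday Monday Tuesday Wednesday Thursday Friday Saturday>;
my $months   = join '|', qw<January February March April May June July
                            August September October November December>;

# Sample input modeled on the quoted page; real pages may differ.
my $html = <<'END_HTML';
1  per 1 guest
<br/>
<br/>
Start date Monday, March 12 2007
<br/>
End date Tuesday, March 13 2007
<br/>
END_HTML

# extract_date is a hypothetical helper: it returns the date that follows
# the given label ("Start date" or "End date"), or undef when nothing matches.
sub extract_date {
    my ($label, $text) = @_;
    return $text =~ /\Q$label\E (?:$weekdays), ((?:$months) \d{1,2} \d{4})/
        ? $1
        : undef;
}

my $start = extract_date('Start date', $html);
my $end   = extract_date('End date',   $html);

print "start: $start\n" if defined $start;
print "end:   $end\n"   if defined $end;
```

Passing the label in as an argument lets one pattern serve both
"Start date" and "End date"; \Q...\E keeps the label from being
treated as regex syntax.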
