On Dec 11, 2013, at 7:34 AM, punit jain <contactpunitj...@gmail.com> wrote:

> Hi,
> 
> I have a requirement where I need to capture phone numbers from different 
> strings.
> 
> The strings could be :-
> 
> 
> 1. COMP TEL NO 919369721113  for computer science
> 
> 2. For Best Discount reach 092108493, from 6-9
> 
> 3. Your booking Confirmed, 9210833321
> 
> 4. price for free consultation call92504060
> 
> 5. price for free consultation call92504060number
> 
> I created a regex as below :-
> 
> #!/usr/bin/perl
> 
> my $line= shift @ARGV;
> 
> if($line =~ 
> /(?:(?:\D+|\s+)(?:(91\d{10}|0\d{10}|[7-9]\d{9}|0\d{11})|(?:(?:ph|cal)(\d+))))|(?:(?:(91\d{10}|0\d{10}|[7-9]\d{9}|0\d{11})|(?:(?:ph|cal)(\d+)))(?:\D+|\s+))/)
>  {
> print "one = $1";
> 
> 
> 
> }
> 
> It works fine for 1, 2 and 3 and prints the number; however, for 4 and 5 I get 
> the number in $2 rather than $1, though I have the pipe operator to check it.
> 
> Any clue how to fix this ?

Your first step is to rewrite the regular expression with the /x modifier 
(extended syntax), which lets you add whitespace and line breaks:
 
        if($line =~ 
                m{ 
                  (?:
                        (?: \D+ | \s+ )
                        (?:
                          ( 
                                91\d{10} | 
                                0\d{10} |
                                [7-9]\d{9} |
                                0\d{11}
                          ) |
                          (?:
                                (?:
                                  ph |
                                  cal
                                )
                                (\d+)
                          )
                        )
                  ) |
                  (?: 
                        (?:
                          ( 91\d{10} |
                                0\d{10} |
                                [7-9]\d{9} |
                                0\d{11}) |
                          (?: 
                                (?:
                                  ph | 
                                  cal
                                ) 
                                (\d+)
                          )
                        )
                        (?:
                          \D+ |
                          \s+
                        )
                  ) 
                }x 
        ) {

Then maybe you will have some hope of figuring out why it doesn’t work (I 
certainly can’t). 
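
One thing the expanded layout does make visible: the pattern contains four 
capturing groups, and Perl numbers them by the position of the opening 
parenthesis across the whole pattern, not per alternative. The digits after 
ph/cal are therefore captured into $2 or $4 and can never land in $1. The 
sketch below shows this on a cut-down pattern of my own (not the original 
regex), together with the branch-reset construct (?|...) from Perl 5.10, 
which restarts the numbering in each alternative. The "call?" spelling and 
the variable names are assumptions made for the illustration.

```perl
use strict;
use warnings;
use 5.010;    # needed for the branch-reset group (?|...) and for //

my $text = "call92504060";

# Plain alternation: groups are numbered left to right over the whole
# pattern, so the second alternative's digits are group 2.
my ( $plain_1, $plain_2 ) = $text =~ /(\d{10})|call?(\d+)/;

# Branch reset: each alternative starts numbering again at 1.
my ($reset_1) = $text =~ /(?|(\d{10})|call?(\d+))/;

print "plain: \$1=", $plain_1 // 'undef', " \$2=", $plain_2 // 'undef', "\n";
print "reset: \$1=", $reset_1 // 'undef', "\n";
```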

I suggest you break it up into a series of if/elsif/else tests:

  if( $line =~ /(91\d{10}|0\d{10}|[7-9]\d{9}|0\d{11})/ ) {
    $number = $1;
  }elsif( $line =~ /(?:ph|cal)(\d+)/ ) {
    $number = $1;
  }elsif( … ) {
  }else{
    print "No match for $line";
  }

You don’t need to do it all in one regex. Debugging each of those smaller 
regexes will be easier than debugging the whole thing.
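
As a rough, runnable sketch of that approach applied to the five sample 
strings from the question: extract_number is a hypothetical helper name, and 
"call?" is my assumption so that the keyword branch accepts both "cal" and 
"call" (the samples spell it "call"). Note that sample 2 contains only nine 
digits, so none of the four number shapes matches it and it falls through to 
the no-match case.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the first
# phone-like number found in $line, or undef when nothing matches.
sub extract_number {
    my ($line) = @_;

    # Full-length numbers: the same four shapes as the original regex.
    if ( $line =~ /(91\d{10}|0\d{10}|[7-9]\d{9}|0\d{11})/ ) {
        return $1;
    }
    # Digits glued to a keyword; "call?" is an assumption, see above.
    elsif ( $line =~ /(?:ph|call?)(\d+)/ ) {
        return $1;
    }
    return;    # no match
}

my @samples = (
    "COMP TEL NO 919369721113  for computer science",
    "For Best Discount reach 092108493, from 6-9",
    "Your booking Confirmed, 9210833321",
    "price for free consultation call92504060",
    "price for free consultation call92504060number",
);

for my $line (@samples) {
    my $number = extract_number($line);
    if ( defined $number ) {
        print "number = $number\n";
    }
    else {
        print "No match for $line\n";
    }
}
```

Each branch can now be tested and adjusted on its own, and every branch 
delivers its result in $1.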


