On 5/21/10 Fri May 21, 2010 8:42 AM, "Akhthar Parvez K" <akht...@sysadminguide.com> scribbled:
> On Friday 21 May 2010, Akhthar Parvez K wrote:
> I am stuck with regex again, this time I really need to *fix* it:
>
> Code:
>
> my @data = ( 'Twinkle twinkle little star
> How I wonder what you are
> Up above the world so high
> Like (a) diamond in the sky.
> 123
> Twinkle twinkle little star
> How I wonder what you are');
> my $rx1 = qr{ little(\D*wonder) }imx;
> my $rx2 = qr{ high(.*like) }imx;
> my @regex = ($rx1, $rx2);
> my $regx = join ("|", @regex);
> print "regx: $regx\n";
> my @matches = map { tr/\n//d; /($regx)/g } @data;
> print 'array: ', Dumper \...@matches;
>
> Output:
> regx: (?ix-sm: little(\D*wonder) )|(?ix-sm: high(.*like) )
> array: $VAR1 = [
> 'little starHow I wonder',
> ' starHow I wonder',
> undef,
> 'highLike',
> undef,
> 'Like',
> 'little starHow I wonder',
> ' starHow I wonder',
> undef
> ];
>
> I'm expecting a result like this:
> regx: (?ix-sm: little(\D*wonder) )|(?ix-sm: high(.*like) )
> array: $VAR1 = [
> 'little starHow I wonder',
> ' starHow I wonder',
> 'highLike',
> 'Like',
> 'little starHow I wonder',
> ' starHow I wonder',
> ];
>
> I would like to know why these undefs are appearing in between and how can I
> get rid of them. I am sure it's due to the way how I am concatenating the
> regex (with join) as it works fine if I put just one regex, but how can this
> be fixed when I'm using mutiple regex?

You are getting undefs because you have alternation (|) between two
sub-patterns and capturing parentheses in each sub-pattern. You also have
nested parentheses, with a capturing pair of parentheses around the whole.
Your regular expression is effectively this:

    /(little(\D*wonder)|high(.*like))/

with 3 sets of parentheses. In list context with /g, Perl returns what was
matched by each set of () for each match. You have 3 sets of () and 3
matches, so you get 9 returned values. Because of the alternation, only one
sub-pattern matches at a time, and the pair of () in the sub-pattern that did
not match returns undef.

Can you explain what it is you are trying to do?
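To make the counting above concrete, here is a minimal sketch. It assumes the poem has already been joined onto one line (as the tr/\n//d in the question does) and writes the combined pattern out by hand instead of building it with join; the names $text, @raw and @matches are only for illustration. It shows the nine raw values and one simple way to drop the undefs, by filtering with grep.

```perl
use strict;
use warnings;

# The poem from the question, already joined onto one line.
my $text = 'Twinkle twinkle little starHow I wonder what you are'
         . 'Up above the world so highLike (a) diamond in the sky.'
         . '123'
         . 'Twinkle twinkle little starHow I wonder what you are';

# Same group structure as the joined pattern: one outer group plus one
# inner group in each alternative, so three capture groups in total.
my $regx = qr{ ( little(\D*wonder) | high(.*like) ) }ix;

# In list context, m//g returns every capture group for every match,
# including the group of the alternative that did not take part (undef).
my @raw = $text =~ /$regx/g;

# Dropping the undefined slots leaves only the groups that matched.
my @matches = grep { defined } @raw;

print scalar(@raw), " raw values, ", scalar(@matches), " after filtering\n";
print "'$_'\n" for @matches;
```

This prints "9 raw values, 6 after filtering", followed by the six strings from the expected result. Filtering is safe here only because neither inner group can match while staying undefined; a group that legitimately matches an empty string is still defined and is kept.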
You are probably better off not trying to do it in one go with a regular
expression. Find your marker strings (maybe with index), extract your data
(with substr), and do whatever processing you need. If you really want to use
a regular expression, consider using look-ahead tests and a while loop, and
don't expect to get everything you need in a single pass through your string.
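A rough sketch of the while-loop idea, under the same assumption that the text is already on one line. It does not use look-ahead; it handles one match per iteration and picks whichever inner group took part. The names $text and @found are hypothetical, and the whole match is recovered with substr and the built-in @- and @+ offset arrays.

```perl
use strict;
use warnings;

# The poem from the question, already joined onto one line.
my $text = 'Twinkle twinkle little starHow I wonder what you are'
         . 'Up above the world so highLike (a) diamond in the sky.'
         . '123'
         . 'Twinkle twinkle little starHow I wonder what you are';

my @found;

# In scalar context, m//g finds one match per iteration and remembers
# where it stopped, so each match can be handled on its own.
while ($text =~ / little(\D*wonder) | high(.*like) /gix) {
    # $-[0] and $+[0] are the start and end offsets of the whole match.
    my $whole = substr($text, $-[0], $+[0] - $-[0]);

    # Only one alternative took part, so only one of $1 and $2 is defined.
    my $inner = defined $1 ? $1 : $2;

    push @found, $whole, $inner;
}

print "'$_'\n" for @found;
```

Because each match is processed separately, no undef ever reaches @found, and the loop body is a natural place for any per-match processing.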