Brano Gerzo wrote:
> hello all!
>
> I'd like to ask for help with this regexp. I want to match
> these examples:
>
> word word
> 3 word word
> 3 word word en
> 3 word word en,pt
> 3 word word en,pt 1cd
>
> ok, here is the regexp I wrote:
> ^\s*(\d{1,2}\s+)?([\w\s\+:]+)
> (sq|hy|ay|bs|bg|hr|cs|da|nl|en|et|fi|fr|de|gr|he|hu|zh|it|ja|kk|lv|
> pl|pt|pb|ro|ru|sr|sk|sl|es|sv|th|tr|uk|al|\s*,\s*)*\s*(?:(\d)cd)?$
>
> problem is, I can't get "en,pt" together, "en" is matched to $2.
> Can anyone help me with this, please?


I don't understand what you are trying to match with "[\w\s\+:]+". It
matches any run of characters from the character class containing
[[:word:]], [[:space:]], a plus and a colon, so "a b :c" would match.
Because it is greedy, it also takes the "en" that follows the words,
which is why "en" ends up in $2.
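
To see that behaviour in isolation, here is a small sketch that tests only the character class, outside the full pattern. The variable names ($class, $m1, $m2) are mine, chosen for illustration.

```perl
#!/usr/bin/perl
use strict ;
use warnings ;

# The character class from the original pattern, on its own.
my $class = qr/[\w\s\+:]+/ ;

# It accepts word characters, whitespace, '+' and ':' in any order,
# so the whole of "a b :c" is matched.
my ($m1) = "a b :c" =~ /^($class)$/ ;
print "whole string matched: '$m1'\n" ;   # prints: whole string matched: 'a b :c'

# It is greedy, so it also takes the trailing language code and stops
# only at the comma, which is not in the class.
my ($m2) = "word word en,pt" =~ /^($class)/ ;
print "captured: '$m2'\n" ;               # prints: captured: 'word word en'
```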


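#!/usr/bin/perl
  use warnings ;
  use strict ;

  # Small helpers that build the regex source from named pieces.
  sub sp       { '[[:blank:]]+' }   # one or more spaces or tabs
  sub capture  { "(@_)" }           # capturing group
  sub optional { "(?:@_)?" }        # optional non-capturing group

  sub REnumber { '\d+' }
  sub REword   { '\w+' }
  # Two-letter language codes, grouped by first letter.
  sub RElang   { '
(?:
a[ly]|b[gs]|cs|d[ae]|e[nst]|
f[ir]|gr|h[eruy]|it|ja|kk|lv|nl|
p[blt]|r[ou]|s[klqrv]|t[hr]|uk|zh)
' }

  sub REwordlist { REword . sp . REword }               # exactly two words
  sub RElanglist { RElang . optional( ',' . RElang ) }  # one or two codes

  # Captures: 1 = number, 2 = the two words, 3 = languages, 4 = cd count.
  my $re = optional(capture(REnumber).sp)
         . capture(REwordlist)
         . optional(sp.capture(RElanglist))
         . optional(sp.capture('\d+').'cd') ;

  print "re/$re/\n\n\n" ;

  # /x lets the whitespace and newlines in the pieces be ignored.
  my $qr = qr/ $re /x ;

  while ( <DATA> )
  {
    # Groups that did not take part in the match are undef.
    no warnings 'uninitialized' ;
    print "\n" ;
    print ;
    /$qr/ and print "($1) ($2) ($3) ($4)\n" ;
  }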

__DATA__
word word
3 word word
3 word word en
3 word word en,pt
3 word word en,pt 1cd
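
The script above accepts at most two language codes and is not anchored, so a line with an unknown code still matches with $3 left empty. If you need any number of codes and want such lines rejected, a possible variant looks like this. It is only a sketch under those assumptions: the sub name parse_line and the sample lines are made up for illustration.

```perl
#!/usr/bin/perl
use strict ;
use warnings ;

# Same language codes as above, grouped by first letter.
my $lang = qr/(?: a[ly]|b[gs]|cs|d[ae]|e[nst]|f[ir]|gr|h[eruy]|it|ja|kk|lv|nl
               |p[blt]|r[ou]|s[klqrv]|t[hr]|uk|zh )/x ;

my $qr = qr/
    ^ [[:blank:]]*
    (?: (\d+) [[:blank:]]+ )?                      # optional leading number
    ( \w+ [[:blank:]]+ \w+ )                       # the two words
    (?: [[:blank:]]+ ( $lang (?: , $lang )* ) )?   # any number of codes
    (?: [[:blank:]]+ (\d+) cd )?                   # optional cd count
    [[:blank:]]* $
/x ;

# Hypothetical helper: returns the four captures, or an empty list.
sub parse_line {
    my ($line) = @_ ;
    return $line =~ $qr ;
}

for my $line ( "3 word word en,pt,de 2cd", "word word", "3 word word xx" ) {
    if ( my @f = parse_line($line) ) {
        print join( " ", map { defined $_ ? "($_)" : "()" } @f ), "\n" ;
    }
    else {
        print "no match: $line\n" ;
    }
}
```

The three sample lines print "(3) (word word) (en,pt,de) (2)", "() (word word) () ()" and "no match: 3 word word xx".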

-- 
Affijn, Ruud

"Gewoon is een tijger."


