(forwarding to linux-il)

On Sun, Apr 28, 2013 at 1:56 PM, Meir Guttman <m...@guttman.co.il> wrote:
> Dear Gabor and Ido,
>
> This post to the Il Linux mailing list bounced and wasn't posted since I am
> not a member of the list. Could one of you please post it so it shows up and
> is distributed?
>
> BTW Ido, from your original post I saw that you want to find lines with
> (exactly?) three Hebrew characters, so I modified it and it is now:
>
> #!/usr/bin/env perl -w
> #
>
> use v5.14;
> use utf8;
>
> my $text = 'שלוabv';
>
> if ($text =~ /^[\p{HEBREW}]{3}/) {
>     say "yes";
> } else {
>     say "no";
> }
>
> Regards,
> Meir
>
> -----Original Message-----
> From: Gabor Szabo [mailto:szab...@gmail.com]
> Sent: Friday, April 26, 2013 09:25
> To: linux-il
> Cc: Ori Idan; Meir Guttman
> Subject: Re: Finding if a line contains Hebrew characters in perl
>
>>On Thu, Apr 25, 2013 at 6:05 PM, ik <ido...@gmail.com> wrote:
>>> try this
>>>
>>> #!/usr/bin/env perl -w
>>> #
>>>
>>> use v5.14;
>>> use utf8;
>>>
>>> my $text = 'שלוabv';
>>>
>>> if ($text =~ /^[\x{5D0}-\x{5ea}]{3}/) {
>>>     say "yes";
>>> } else {
>>>     say "no";
>>> }
>>
>>I'd probably use \p{IsHebrew} or \p{InHebrew} instead of the hex codes.
>>Check here: http://perldoc.perl.org/perluniprops.html to learn way more than
>>you'd probably want to :)
>>
>>I also CC-ed Meir Guttman who is *the* Perl Unicode expert.
>>He might have something more correct to suggest.
>>
>>Gabor
>>
>
> Well, first, I am by no means a "Unicode Expert", let alone *the* expert. All
> I have is some experience.
>
> Anyway, I did use \p{HEBREW} instead of the "\x{}" range and it returned "yes".
> Please note, just {HEBREW} and ALL-CAPS!
> Here it is:
>
> #!/usr/bin/env perl -w
> #
>
> use v5.14;
> use utf8;
>
> my $text = 'שלוabv';
>
> if ($text =~ /^[\p{HEBREW}]/) {
>     say "yes";
> } else {
>     say "no";
> }
>
> I also used "if ($text =~ /^[ש]/) {...}", simply entering the Hebrew letter
> "Shin" directly, and it printed "yes" too, signifying that 'ש' is the first
> letter. (My editor, as well as MS Outlook, show, from left to right, first
> 'ו', then 'ל', then 'ש' and then "abv".)
>
> I also tried to use the official Unicode name for 'ש' - \p{HEBREW LETTER SHIN}
> see http://www.unicode.org/charts/PDF/U0590.pdf , and evidently it isn't
> defined. I got a compile time error: "Can't find Unicode property definition
> "HEBREW LETTER SHIN" at...". A bit disappointing!
>
> Try it out!
>
> Meir
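
Two points from Meir's message can be sketched in Perl. This is only a sketch under stated assumptions, not code anyone in the thread posted: the helper names starts_with_exactly_three_hebrew and starts_with_shin are made up for illustration. As far as I know, \p{...} looks up Unicode properties (such as the Hebrew script) and not individual character names, which would explain the "Can't find Unicode property definition" error; a character is referred to by its official name with \N{...} instead. Also, /^[\p{HEBREW}]{3}/ still matches when a fourth Hebrew character follows, so the sketch adds a negative lookahead for the "exactly three" reading. Property names are, to my understanding, matched case-insensitively, so \p{Hebrew} is used here in place of \p{HEBREW}.

```perl
use strict;
use warnings;
use utf8;                 # this source file contains UTF-8 string literals
use charnames ':full';    # makes \N{OFFICIAL NAME} available on older perls too

# Hypothetical helper: true if the line starts with exactly three
# Hebrew-script characters; a fourth Hebrew character makes it fail.
sub starts_with_exactly_three_hebrew {
    my ($line) = @_;
    return $line =~ /^\p{Hebrew}{3}(?!\p{Hebrew})/ ? 1 : 0;
}

# Hypothetical helper: true if the line starts with the letter Shin,
# referenced by its Unicode character name via \N{...}, not \p{...}.
sub starts_with_shin {
    my ($line) = @_;
    return $line =~ /^\N{HEBREW LETTER SHIN}/ ? 1 : 0;
}

# Print the sample strings as UTF-8 rather than raw wide characters.
binmode(STDOUT, ':encoding(UTF-8)');

for my $text ('שלוabv', 'שלום', 'abv') {
    printf "%s: three=%d shin=%d\n",
        $text,
        starts_with_exactly_three_hebrew($text),
        starts_with_shin($text);
}
```

The original sample string 'שלוabv' passes both checks. The four-letter word 'שלום' starts with Shin but fails the "exactly three" check, and 'abv' fails both.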
_______________________________________________
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il