Gunnar Hjalmarsson wrote:
Stanisław T. Findeisen wrote:
Hi, how do I write regular expressions that match against Unicode (e.g.,
UTF-8) strings?
For instance, in my regexp:
qr/^([.<>@ \w])*$/
Decode the UTF-8 encoded strings before applying the regex to them.
$ perl -MEncode -le '
$utf8_encoded = "smörgåsbord";
$s = decode "UTF-8", $utf8_encoded;
print "Match" if $s =~ /^\w+$/;
'
Match
$
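To see why the decode step matters, here is a minimal sketch that runs the same regex against the raw bytes and against the decoded string. The bytes are written with \x escapes so the result does not depend on how the script file is saved; the variable names are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 bytes of "smörgåsbord", written with escapes so that the
# encoding of this file does not influence the result.
my $bytes = "sm\xC3\xB6rg\xC3\xA5sbord";

# decode() turns the byte string into a string of Unicode characters.
my $chars = decode('UTF-8', $bytes);

# \w sees each high byte of the undecoded string as a non-word character,
# but sees "ö" and "å" as word characters in the decoded string.
my $bytes_match = $bytes =~ /^\w+$/ ? 1 : 0;
my $chars_match = $chars =~ /^\w+$/ ? 1 : 0;

printf "bytes: length %d, match %d\n", length($bytes), $bytes_match;
printf "chars: length %d, match %d\n", length($chars), $chars_match;
```

The undecoded string is 13 bytes long and does not match; the decoded string is 11 characters long and does.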
Thanks, decode helped with this. But can I ask one more question?
What assumptions does Perl make about the encoding of the input file
(i.e., the program/script file)?
Are string literals in Perl in fact byte arrays? Is what you type
what you get?
STF
=======================================================================
http://eisenbits.homelinux.net/~stf/
OpenPGP: DFD9 0146 3794 9CF6 17EA D63F DBF5 8AA8 3B31 FE8A
=======================================================================
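A rough sketch of the behaviour asked about above, assuming the script file itself is saved as UTF-8 (the variable names are hypothetical). Without the utf8 pragma, Perl takes the bytes of a string literal as they appear in the file; with "use utf8" in scope, it decodes the literal into characters.

```perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8.

# No "use utf8" in scope: the literal holds the two bytes 0xC3 0xB6.
my $as_bytes = "ö";

my $as_chars;
{
    use utf8;            # lexically scoped: applies only inside this block
    $as_chars = "ö";     # decoded to the single character U+00F6
}

printf "without use utf8: length %d\n", length($as_bytes);
printf "with use utf8:    length %d\n", length($as_chars);
```

So "what you type is what you get" holds at the byte level by default, and the utf8 pragma is what tells Perl that the source text is UTF-8.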