On 4/16/20 7:41 AM, Dominik Csapak wrote:
> wanted to chime in here (since i believe the original comment
> about unicode literals came from me)
>
> On 4/15/20 5:02 PM, Thomas Lamprecht wrote:
>> albeit I'm not too sure if the unicode stuff is really true, I tried:
>> perl -we '"Ⅲ" =~ /\d/ or die "no match"'
>>
>> where "Ⅲ" is the roman numeral three (U+2162) and that does not match,
>> but it could be wrong encoding or whatever...
>
> if you use utf symbols in the source, it only works if you
> have 'use utf8;'
>
> also the roman numeral three is not considered (afaik) as
> part of the 'digits' group (only those that also match \p{Digit} [0])
> but '৪' for example (bengali digit four (U+09EA)) is
>
> if the source comes from outside, you have to have a 'perl string'
> with the correct code point,
> which happens e.g. when using decode_utf8 on a string with
> such a character
>
> so this example matches:
>
> ---8<---
> use Encode;
> my $val = decode_utf8(shift);
> if ($val =~ /(\d+)/) {
>     print "found $1\n";
> }
> --->8----
>
> when calling with
> perl foo.pl ৪
>
> so to avoid confusion (and probably errors), i proposed always using [0-9]
> instead of \d
>
> 0: https://perldoc.perl.org/perlrecharclass.html
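For reference, here is a minimal sketch that reproduces the behaviour described above without needing 'use utf8;' or a command-line argument. The helper name digit_matches is made up for illustration, and the /a modifier (Perl 5.14 or newer) was not part of the original suggestion; it is only shown as a possible alternative to writing [0-9].

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Report which digit patterns match the whole string.
# (digit_matches is a hypothetical helper, not an existing API.)
sub digit_matches {
    my ($str) = @_;
    my %result = (
        unicode => ($str =~ /^\d+$/    ? 1 : 0),    # \d with default rules
        ascii   => ($str =~ /^[0-9]+$/ ? 1 : 0),    # explicit ASCII range
        a_flag  => ($str =~ /^\d+$/a   ? 1 : 0),    # \d restricted to ASCII by /a
    );
    return \%result;
}

# Bengali digit four (U+09EA) as raw UTF-8 bytes, the way it arrives
# from the command line or a file before any decoding.
my $raw_bytes = "\xE0\xA7\xAA";

my @cases = (
    [ 'raw bytes',       $raw_bytes ],
    [ 'decoded U+09EA',  decode_utf8($raw_bytes) ],
    [ 'roman U+2162',    "\x{2162}" ],    # Roman numeral three, not a decimal digit
    [ 'ascii 42',        '42' ],
);

for my $case (@cases) {
    my ($label, $str) = @$case;
    my $m = digit_matches($str);
    printf "%-15s \\d=%d  [0-9]=%d  \\d/a=%d\n",
        $label, $m->{unicode}, $m->{ascii}, $m->{a_flag};
}
```

Only the decoded string and the plain ASCII one should match \d, and only the ASCII one should match [0-9] or \d with /a. So the extra matches only show up once the input has been decoded into a Perl character string.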
Thanks for clearing that up. I mostly poked at it because Dominic came to me
asking what to do: you told him to use [0-9], and Fabian, not knowing about
your suggestion, told him to change it back again.

The question is, does matching other digits really matter at all? It
naturally depends on what happens with the parsed data, but most of the
time it should not really matter, right?

_______________________________________________
pve-devel mailing list
pve-devel@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel