Branch: refs/heads/blead
Home: https://github.com/Perl/perl5
Commit: a091427ee8a210221dc488b4d263632edc72e29c
https://github.com/Perl/perl5/commit/a091427ee8a210221dc488b4d263632edc72e29c
Author: Karl Williamson <[email protected]>
Date: 2024-09-02 (Mon, 02 Sep 2024)
Changed paths:
M regexec.c
M t/re/script_run.t
Log Message:
-----------
Fix \d script run with unusual Unicode data layout
This fixes GH #22535
Unicode guarantees that \d code points occur in groups of 10 consecutive
ones, with the lowest having a numeric value of 0 and the highest having
a value of 9.
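The guarantee above can be checked against a Unicode database. The following is a minimal sketch, not the code in regexec.c: it assumes Python's `unicodedata` module as the data source and treats `\d` as General_Category `Nd`. The function name `decimal_digit_group_starts` is hypothetical.

```python
import sys
import unicodedata


def decimal_digit_group_starts():
    """Return the first code point of every group of 10 decimal digits."""
    starts = []
    cp = 0
    while cp <= sys.maxunicode:
        if unicodedata.category(chr(cp)) == "Nd":
            # The guarantee: the 10 code points starting here carry the
            # numeric values 0 through 9, in order.
            values = [unicodedata.decimal(chr(cp + i), None) for i in range(10)]
            assert values == list(range(10)), hex(cp)
            starts.append(cp)
            cp += 10  # skip the rest of this group
        else:
            cp += 1
    return starts


group_starts = decimal_digit_group_starts()
```

Every entry in `group_starts` is the zero of its group, so `0x30` (ASCII `0`) and `0x660` (ARABIC-INDIC DIGIT ZERO) both appear in it.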
A script run in a regular expression pattern matches only characters in
a single script. Further, if more than a single digit is matched, all
must come from the same group of 10 consecutive code points.
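The digit rule can be sketched on its own, leaving out the single-script check. This is an illustration under assumptions, not Perl's implementation: it relies on Python's `unicodedata.decimal`, and `digits_from_one_group` is a hypothetical helper. It uses the layout guarantee to find a digit's group by subtracting its numeric value from its code point, which yields the group's zero.

```python
import unicodedata


def digits_from_one_group(text):
    """True if all decimal digits in text share one group of 10 code points."""
    zeros = set()
    for ch in text:
        value = unicodedata.decimal(ch, None)
        if value is not None:
            # code point minus numeric value = code point of the group's zero
            zeros.add(ord(ch) - value)
    return len(zeros) <= 1


ascii_only = digits_from_one_group("2024")
mixed = digits_from_one_group("12\u0663")  # ASCII digits plus ARABIC-INDIC DIGIT THREE
```

Here `ascii_only` is true and `mixed` is false, because the third digit belongs to the group that starts at U+0660 rather than U+0030.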
The 'Common' script has many such groups, not just 0-9. Perl's
implementation assumed that all groups were isolated from each other in
the Unicode ordering of code points. This is true in all but one case,
where 5 groups adjoin each other. This commit changes the implementation
to account for this possibility.
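To illustrate why adjoining groups matter, the sketch below contrasts a range-based view of digit groups with one derived from numeric values. It is an assumption-laden illustration, not the regexec.c change: the helper names are hypothetical, and the example assumes the adjoining groups are the mathematical digits at U+1D7CE..U+1D7FF (bold, double-struck, sans-serif, sans-serif bold, monospace), which the commit message does not name.

```python
import unicodedata


def contiguous_nd_range(ch):
    """Naive view: the unbroken stretch of Nd code points around ch."""
    lo = hi = ord(ch)
    while lo > 0 and unicodedata.category(chr(lo - 1)) == "Nd":
        lo -= 1
    while unicodedata.category(chr(hi + 1)) == "Nd":
        hi += 1
    return lo, hi


def group_zero(ch):
    """Code point of the zero of ch's group, or None if ch is not a decimal digit."""
    value = unicodedata.decimal(ch, None)
    return None if value is None else ord(ch) - value


def same_digit_group(a, b):
    """True if a and b are decimal digits from the same group of 10."""
    za, zb = group_zero(a), group_zero(b)
    return za is not None and za == zb


bold_zero = "\U0001D7CE"           # MATHEMATICAL BOLD DIGIT ZERO
double_struck_zero = "\U0001D7D8"  # MATHEMATICAL DOUBLE-STRUCK DIGIT ZERO

lo, hi = contiguous_nd_range(bold_zero)
span = hi - lo + 1
```

For an isolated group such as ASCII 0-9, the contiguous range and the group coincide. For `bold_zero`, `span` covers 50 code points, so treating the range as one group would wrongly accept `bold_zero` and `double_struck_zero` together, while `same_digit_group` rejects the pair.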