On 05/21/2012 03:02 PM, Linda Walsh wrote:
>> the cat was out of the bag.  POSIX 2001 had to continue to allow
>> existing implementations, by stating that range expressions in anything
>> but the C locale are explicitly undefined.
>
> ---------------------
>
> Explicitly undefined?  Or locale dependent?

POSIX explicitly undefined ranges for all but the C locale.  _Other
standards_, such as Unicode, are free to add range requirements on top
of what POSIX requires, but alas, Unicode collation order does NOT
currently specify anything about regular expression or glob range
matching, so it is out of scope for Unicode to say what [A-Z] expands to.

>
> I.e. Unicode does specify ordering, so if your locale is set
> to UTF-8 character encoding, then it is explicitly defined.  This would
> seem to be in conflict with unicode -- and any implementation claiming
> to be unicode compatible MUST use unicode ordering when the local character
> set is defined to be Unicode.

Unicode may specify collation ordering, but it does NOT specify regular
expression range ordering.

>
> This doesn't conflict with Posix, as Posix doesn't define an order
> for such -- but a different standard, (Unicode) does specify a
> standard.  So for those using UTF-8, shouldn't that have made the
> order randomization 'moot'?

Wishing doesn't make it so.  The fact is that regular expression ranges
are currently unspecified in all but the C locale; the RRI project is
attempting to make it sane across all locales within the scope of GNU
programs, but it takes time to write and approve the patches necessary
to get to that point.

-- 
Eric Blake   ebl...@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
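
A toy sketch of why it matters which ordering a range like [A-Z] uses.
This is only an illustration under stated assumptions: the interleaved
ordering "aAbB...zZ" is a hypothetical stand-in for a dictionary-style
collation, not the actual order of any real locale, and the helper names
(c_locale_key, toy_collation_key, expand_range) are made up for this
example.  It does not call any real locale or regex machinery.

```python
import string

# C-locale model: characters compare by code point, so 'A'..'Z' is contiguous.
def c_locale_key(ch):
    return ord(ch)

# Hypothetical dictionary-style ordering (an assumption, not a real locale):
# a A b B c C ... z Z
TOY_ORDER = "".join(
    lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase)
)

def toy_collation_key(ch):
    return TOY_ORDER.index(ch)

def expand_range(start, end, key, alphabet=string.ascii_letters):
    """Return every character of `alphabet` that sorts between start and end
    (inclusive) under the ordering given by `key`, in that ordering."""
    members = [c for c in alphabet if key(start) <= key(c) <= key(end)]
    return "".join(sorted(members, key=key))

c_result = expand_range("A", "Z", c_locale_key)
toy_result = expand_range("A", "Z", toy_collation_key)

print("code point order:", c_result)
print("toy collation   :", toy_result)
```

Under the code point model the range contains only the 26 uppercase
letters.  Under the toy interleaved ordering the same endpoints also
pull in 'b' through 'z' (but not 'a'), which is the kind of surprise
that makes ranges outside the C locale hard to rely on.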