Bruno Haible wrote:
>> With my proposal, distros/people that use --with-included-regex would
>> get understandable semantics + no equivalence classes
>> ...
>> locale behavior of regex are irremediably
>> broken. For example, when you have a collation element, you can match
>> it using ranges (e.g. [d-i] matches "ch" in Czech; "ch" collates after
>> "h"), and even apply negation (e.g. [^c-h] matches "ch" too). However
>> there is no way to anchor your match to the beginning of the collation
>> element. So "chci" matches both /[c-h]+ci/ and /[^c-h]+ci/. It is
>> beyond repair, and [=e=] is the only part that can be salvaged.
>
> So, Jim and you appear to agree that equivalence classes [=e=] are a
> reasonable feature outside LC_ALL=C.
>
> What would it take to let distros/people use --with-included-regex and
> get understandable semantics for ranges + working equivalence classes?
>
> I would prefer that to your proposal, because it cannot be seen as a
> regression by people who care about equivalence classes.
>
> Can that be done through gnulib code?

A glibc-independent solution would be great. Then GNU tr's equivalence classes could finally become useful even on non-glibc systems.