Hi Paul.

> Thanks for continuing to bird-dog this.

It's either "tenacity" or "stubbornness". :-)

> > I do think that gawk's code is the correct thing to be doing for RRI.
>
> I agree, and installed the second patch enclosed below to
> implement this.

Cool! Hurray! One more bit that comes into sync.

> This patch also includes some documentation
> changes -- if you have a bit of time to review them I'd
> appreciate it.

It looks ok, but it doesn't really say anything about RRI - grep does
RRI in all locales now, which falls under the umbrella of POSIXy
implementation-defined behavior, but is just fine. That should be
explained.

> Also, I notice that there are a few "#ifdef GREP"s in dfa.c
> Do you happen to know why they're needed?

No idea. They all seem to be related to case_fold. I had not really
noticed them, and they must be working fine for me since I don't
define GREP. What happens if you compile them in and run the grep
test suite?

> > Additionally, I recommend that grep's configure check for good RRI
> > support in the system regex routines and switch to the included ones
> > if the system ones don't support it.
>
> Unfortunately that'd break support for equivalence classes
> and multibyte collation symbols on GNU/Linux platforms, so
> it may be a bridge too far.

Gawk has lived without these so far. :-)

> Until we get glibc fixed, I
> think it's OK to live with the situation where [a-z]
> ordinarily has the rational range interpretation, and this
> breaks down only for complicated matches where the DFA
> doesn't suffice; at least it'll work in the usual case.

At least document it somewhere.

Thanks!

Arnold