bug#16481: dfa.c and Rational Range Interpretation

Aharon Robbins Sat, 18 Jan 2014 11:40:28 -0800

Hi Paul.

> Thanks for continuing to bird-dog this.


It's either "tenacity" or "stubbornness". :-)

> > I do think that gawk's code is the correct thing to be doing for RRI.
>
> I agree, and installed the second patch enclosed below to
> implement this.

Cool!  Hurray!  One more bit that comes into sync.

> This patch also includes some documentation
> changes -- if you have a bit of time to review them I'd
> appreciate it.

It looks ok, but it doesn't really say anything about RRI - grep
does RRI in all locales now, which falls under the umbrella
of POSIXy implementation-defined behavior, but is just fine.
That should be explained.
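
For anyone not familiar with the term, here is a rough sketch of the difference between RRI and a collation-based reading of [a-z]. The collation order and the function names are made up for illustration; this is not grep's or gawk's code, and no real locale is claimed to sort exactly this way.

```python
import string

# Hypothetical dictionary-style collation order: aAbBcC...zZ.
# Assumption for illustration only; real locales define their own order.
COLLATION = "".join(c + c.upper() for c in string.ascii_lowercase)


def in_range_rri(lo, hi, ch):
    """Rational range interpretation: compare raw character codes."""
    return ord(lo) <= ord(ch) <= ord(hi)


def in_range_collation(lo, hi, ch, order=COLLATION):
    """Collation-based interpretation: compare positions in a sort order."""
    if ch not in order:
        return False
    return order.index(lo) <= order.index(ch) <= order.index(hi)


if __name__ == "__main__":
    for ch in "bBZ":
        print(ch, in_range_rri("a", "z", ch), in_range_collation("a", "z", ch))
```

With RRI, [a-z] covers only the lowercase ASCII letters. Under the made-up collation order it would also take in most uppercase letters, which is the kind of surprise RRI avoids.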

> Also, I notice that there are a few "#ifdef GREP"s in dfa.c
> Do you happen to know why they're needed?

No idea.  They all seem to be related to case_fold.  I had
not really noticed them, and they must be working fine for me
since I don't define GREP.

What happens if you compile them in and run the grep test suite?

> > Additionally, I recommend that grep's configure check for good RRI
> > support in the system regex routines and switch to the included ones
> > if the system ones don't support it.
>
> Unfortunately that'd break support for equivalence classes
> and multibyte collation symbols on GNU/Linux platforms, so
> it may be a bridge too far.

Gawk has lived without these so far. :-)

> Until we get glibc fixed, I
> think it's OK to live with the situation where [a-z]
> ordinarily has the rational range interpretation, and this
> breaks down only for complicated matches where the DFA
> doesn't suffice; at least it'll work in the usual case.

At least document it somewhere.

Thanks!

Arnold
