Hi Paul.

> Date: Thu, 27 Feb 2014 13:24:53 -0800
> From: Paul Eggert <egg...@cs.ucla.edu>
> Organization: UCLA Computer Science Department
> To: Aharon Robbins <arn...@skeeve.com>, 16...@debbugs.gnu.org
> Subject: Re: bug#16895: [PATCH] grep: fix multiple bugs with bracket
>  expressions

OK - I tried out that patch (+ the two successors) in gawk and it works
fine, even causing a test that failed to now succeed (since it falls back
to regex). I've merged and pushed the change.

I definitely owe you some beer for this one. :-)

> On 02/27/2014 12:31 PM, Aharon Robbins wrote:
> > What a mouthful! Is all that really necessary?
>
> You should have seen it before I trimmed it down; it listed every POSIX
> character. I dunno, maybe it could be trimmed, but I was worried about
> oddball character sets like the unibyte JIS character set that's like
> ASCII but substitutes Yen-sign for '\', and a couple of other
> substitutions like that. I figured better safe than sorry. No big deal
> of course.

Is that done at compile time in those locales, or at run time? What you've
put in is a compile time check. I ask out of total ignorance and am
wondering how it works.

> >> >- /* build character class. */
> >> >+ /* Build character class. POSIX allows character
> >> >+ classes to match multicharacter collating elements,
> >> >+ but the regex code does not support that, so do not
> >> >+ worry about that possibility. */
> >
> > I thought GLIBC did support them?
>
> Source code says no. That is, [[:alpha:]] never matches a
> multicharacter collating sequence. [[=a=]] might do so, but [[:alpha:]]
> doesn't. (Unless I'm reading the source code wrong, which is possible.
> It's not documented either way, as far as I know.)

Ah. I misunderstood the context. GLIBC does support [[=a=]] and [[.ch.]],
though, right?

Thanks!

Arnold
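
As an aside on the compile-time versus run-time question above, here is a minimal sketch in C of how the two kinds of check differ. It is not the code from the patch under discussion: the names `native_ascii_like` and `locale_is_c_or_posix` are hypothetical, and the list of characters is deliberately much shorter than the "mouthful" mentioned above. The sketch assumes only standard C behavior: character constants are evaluated by the compiler in the execution character set, while `setlocale (LC_ALL, NULL)` queries the locale name while the program runs.

```c
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Compile-time part (hypothetical, abbreviated): the compiler folds
   these comparisons into a constant.  The value depends only on the
   character set the program was built for, never on the locale that
   is active when it runs.  On a build where '\\' is not code 92
   (for example a Yen-sign substitution), this would be 0.  */
enum
{
  native_ascii_like = ('0' == 48 && 'A' == 65 && 'a' == 97
                       && '\\' == 92 && '~' == 126)
};

/* Run-time part (hypothetical): ask the C library for the name of the
   current locale and compare it with the two portable names.  A null
   locale argument makes setlocale query without changing anything.  */
static bool
locale_is_c_or_posix (void)
{
  const char *name = setlocale (LC_ALL, NULL);
  return (name != NULL
          && (strcmp (name, "C") == 0 || strcmp (name, "POSIX") == 0));
}
```

A caller could combine the two as `native_ascii_like && locale_is_c_or_posix ()`. The first operand is fixed when the program is built; the second can change after a call such as `setlocale (LC_ALL, "")`.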