On Sat, Jun 15, 2019 at 01:19:33AM +0200, Ævar Arnfjörð Bjarmason wrote:

> ...small correction, we currently hard-rely on kwset() for any pattern
> containing a \0 for "git-grep" (these can only be supplied via the -f
> <pattern-from-file> option), this means that any pattern containing a \0
> is implicitly fixed, unless kwset() doesn't like it (-i and non-ASCII),
> what a mess.
> 
> Since we hard depend on REG_STARTEND since 2f8952250a ("regex: add
> regexec_buf() that can work on a non NUL-terminated string", 2016-09-21)
> we should just fix that while we're at it. It's a backwards-incompatible
> change, but I doubt anyone is relying on our undocumented behavior of
> implicitly considering grep patterns with \0 in them always fixed.

REG_STARTEND only helps with NULs in the haystack, though. I don't think there's a
way to have a NUL in the pattern with regcomp(), since it takes a
NUL-terminated string.
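
To illustrate the asymmetry, here is a rough sketch (not git's actual
code; the helper names buf_matches() and search_buf() are made up for
this example, and it assumes a libc whose regexec() supports the
non-POSIX REG_STARTEND flag, such as glibc or the BSDs). The haystack is
passed with an explicit length, but the pattern still goes through
regcomp() as a plain C string:

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if the compiled regex matches anywhere
 * in buf[0..len), 0 otherwise. REG_STARTEND makes regexec() take the
 * search range from m[0] instead of stopping at the first NUL.
 */
static int buf_matches(const regex_t *re, const char *buf, size_t len)
{
	regmatch_t m[1];

	assert(re && buf);
	m[0].rm_so = 0;
	m[0].rm_eo = (regoff_t)len;
	return regexec(re, buf, 1, m, REG_STARTEND) == 0;
}

/*
 * Hypothetical helper: compile "pattern" and search a length-delimited
 * buffer. Returns -1 if the pattern does not compile. Note that
 * regcomp() has no length argument, so the pattern ends at its first
 * NUL byte no matter what follows it.
 */
static int search_buf(const char *pattern, const char *buf, size_t len)
{
	regex_t re;
	int ret;

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	ret = buf_matches(&re, buf, len);
	regfree(&re);
	return ret;
}
```

With a 7-byte haystack of "foo\0bar", search_buf("bar", ...) finds the
match past the NUL, whereas a pattern written as "ba\0zzz" is compiled
as just "ba" and so matches too, which is exactly the problem.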

I do agree with you that treating it like a fixed string is somewhat
insane. We're probably better off to die.

In general, your plan to get rid of kwset sounds like a good path. It
would be a slight regression for somebody who is truly feeding a
fixed-string pattern with a NUL in it, on a system without pcre. Right
now that works (via kwset), but if we start feeding fixed strings to
regcomp() then obviously it won't. I guess we could go back to using
memmem() as a fallback, which is what it looks like we used before
9eceddeec6 (Use kwset in grep, 2011-08-21).
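
Something along these lines is what I have in mind for the fallback. It
is only a sketch, not the pre-9eceddeec6 code; find_fixed() is a
hypothetical name, and it assumes a platform that provides memmem()
(a GNU/BSD extension, hence the _GNU_SOURCE define for glibc). Both
needle and haystack carry explicit lengths, so an embedded NUL is just
another byte:

```c
#define _GNU_SOURCE /* glibc declares memmem() only with this set */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the byte offset of the first occurrence
 * of needle[0..needle_len) in hay[0..hay_len), or -1 if it does not
 * occur. No byte value is special, including NUL.
 */
static long find_fixed(const char *hay, size_t hay_len,
		       const char *needle, size_t needle_len)
{
	const char *hit;

	/* an empty needle would trivially match at offset 0 */
	assert(hay && needle && needle_len > 0);
	hit = memmem(hay, hay_len, needle, needle_len);
	return hit ? (long)(hit - hay) : -1;
}
```

For example, searching the 13-byte buffer "one\0two\0three" for the
3-byte needle "o\0t" should report the "o" at the end of "two".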

Seems like a code path that would get exercised approximately never,
though.

-Peff
