sed and gawk use a fastmap with regex, but grep does not. With a fastmap, I expect grep to speed up for patterns that are matched by regex.
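For readers unfamiliar with the fastmap, here is a minimal sketch of how a caller of the GNU regex API can attach one. It is not taken from grep: the helper names compile_with_fastmap and first_match are hypothetical, the choice of RE_SYNTAX_POSIX_BASIC is an assumption made for illustration, and the code assumes glibc's <regex.h> with _GNU_SOURCE defined. The fastmap is a table with one entry per byte value that regex fills in, so that the search can skip start positions whose first byte cannot begin a match.

```c
#define _GNU_SOURCE   /* expose the GNU regex API in glibc's <regex.h> */
#include <assert.h>
#include <limits.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from grep): compile PATTERN into BUF with a
   fastmap attached.  Return 0 on success, -1 on failure.  */
static int
compile_with_fastmap (struct re_pattern_buffer *buf, char const *pattern)
{
  assert (buf && pattern);

  /* The GNU API expects an empty buffer: no compiled code, no translate
     table.  */
  memset (buf, 0, sizeof *buf);

  /* One entry per possible byte value; regex fills the table in.  */
  buf->fastmap = malloc (UCHAR_MAX + 1);
  if (!buf->fastmap)
    return -1;

  /* Assumption for this sketch: POSIX basic syntax, so that \( \) group
     and \1 is a back-reference.  re_set_syntax changes global state.  */
  re_set_syntax (RE_SYNTAX_POSIX_BASIC);

  /* re_compile_pattern returns NULL on success, else an error message.  */
  if (re_compile_pattern (pattern, strlen (pattern), buf))
    {
      free (buf->fastmap);
      buf->fastmap = NULL;
      return -1;
    }
  return 0;
}

/* Hypothetical helper: return the offset of the first match in TEXT,
   or -1 if there is no match or the search fails.  */
static int
first_match (struct re_pattern_buffer *buf, char const *text)
{
  int len = strlen (text);
  int r = re_search (buf, text, len, 0, len, NULL);
  return r < 0 ? -1 : r;
}
```

A caller would run compile_with_fastmap once per pattern, call first_match for each line, and release the buffer with regfree, which also frees the fastmap.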
before:

$ time -p env LC_ALL=ja_JP.eucjp src/grep '\([a-b]\)\1' k
real 7.83
user 7.62
sys 0.07

after:

$ time -p env LC_ALL=ja_JP.eucjp src/grep '\([a-b]\)\1' k
real 0.46
user 0.38
sys 0.07

However, when grep uses fastmap, it fails the case-fold-titlecase test. This means that grep's current behavior differs from that of sed and gawk, since they use fastmap, although the failure seems to be a bug in regex.
From 1337006597a7d7e14993af14e57d47d6b483fb0d Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <nori...@kcn.ne.jp>
Date: Sun, 17 Jul 2016 01:25:18 +0900
Subject: [PATCH] grep: use fastmap in regex

* src/dfasearch.c (GEAcompile): Use fastmap in regex.
---
 src/dfasearch.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/src/dfasearch.c b/src/dfasearch.c
index 8052ef0..e5223e5 100644
--- a/src/dfasearch.c
+++ b/src/dfasearch.c
@@ -154,6 +154,9 @@ GEAcompile (char const *pattern, size_t size, reg_syntax_t syntax_bits)
       patterns = xnrealloc (patterns, pcount + 1, sizeof *patterns);
       patterns[pcount] = patterns0;
 
+      patterns[pcount].regexbuf.fastmap
+        = xmalloc ((UCHAR_MAX + 1) * sizeof (char));
+
       char const *err = re_compile_pattern (p, len,
                                             &(patterns[pcount].regexbuf));
       if (err)
-- 
1.7.1