On Thu, Sep 18, 2014 at 1:33 AM, Santiago Ruano Rincón <santi...@debian.org> wrote:
> On 17/09/14 at 23:00, Paul Eggert wrote:
>> I've installed all the patches mentioned so far.
>>
>
> I've successfully built the latest commit
> (f6de00f6cec3831b8f334de7dbd1b59115627457), but I don't see any
> performance boost. Rather the opposite.
>
> Comparing with Debian's grep 2.20-3, which includes your first patch to solve
> this -P issue, 0001-grep-P-invalid-utf8-non-matching.patch:
>
> grep -P asdf /usr/bin/* 12,42s user 0,12s system 99% cpu 12,545 total
> src/grep -P asdf /usr/bin/* 14,37s user 0,12s system 99% cpu 14,492 total
>
> Note that basic grep also slows down:
>
> grep asdf /usr/bin/* 0,22s user 0,16s system 99% cpu 0,382 total
> src/grep asdf /usr/bin/* 1,26s user 0,12s system 99% cpu 1,384 total

Thank you for running timing comparisons.

Once I verified that I had no large, sparse files in my grep working
directory, I ran a similar test there (du -sh . reports 176M,
du --app -sh . reports 139M).

The following shows a performance regression when searching files
like those in my grep working directory. The new grep
(v2.20-46-gf6de00f) takes 2.5x longer than 2.20.14. This is with a
hot cache (best of several runs) on an Intel(R) Xeon(R) CPU E5-2660,
compiled with gcc-5.x:

  $ diff -u <(env time grep -r asdf . 2>&1) <(PATH=src:$PATH env time grep -r asdf . 2>&1)
  --- /proc/self/fd/11 2014-09-18 12:07:43.169721947 -0700
  +++ /proc/self/fd/12 2014-09-18 12:07:43.169721947 -0700
  @@ -1,3 +1,3 @@
   ./src/grep.c: printf 'asdfqwerzxcv\rASDF\tZXCV\n'
  -0.08user 0.10system 0:00.18elapsed 100%CPU (0avgtext+0avgdata 6256maxresident)k
  -0inputs+0outputs (0major+670minor)pagefaults 0swaps
  +0.40user 0.11system 0:00.51elapsed 99%CPU (0avgtext+0avgdata 5328maxresident)k
  +0inputs+0outputs (0major+634minor)pagefaults 0swaps

It looks like most of the difference is the result of commit
cd36abd46c5e0768606979ea75a51732062f5624, "grep: treat a file as
binary if its prefix contains encoding errors", with its new,
locale-sensitive "is_binary" test. I saw the above timing difference
even with LC_ALL=C, so one quick fix would be to skip the use of
mbrlen when possible.
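
To make the "skip mbrlen when possible" idea concrete, here is a
minimal sketch. It is not the code from that commit:
buf_looks_binary is a hypothetical name. The sketch assumes that in
a single-byte locale (MB_CUR_MAX == 1) it is acceptable to fall back
to the plain NUL-byte test and never look for encoding errors.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not grep's actual is_binary code: report
   whether the first SIZE bytes of BUF look like binary data.  */
bool
buf_looks_binary (char const *buf, size_t size)
{
  /* A NUL byte marks the buffer as binary in any locale.  */
  if (memchr (buf, '\0', size))
    return true;

  /* Assumption: in a single-byte locale we do not look for encoding
     errors at all, so the per-character mbrlen loop is skipped.  */
  if (MB_CUR_MAX == 1)
    return false;

  mbstate_t mbs;
  memset (&mbs, 0, sizeof mbs);
  char const *p = buf;
  char const *lim = buf + size;
  while (p < lim)
    {
      size_t n = mbrlen (p, (size_t) (lim - p), &mbs);
      if (n == (size_t) -1)
        return true;    /* invalid multibyte sequence */
      if (n == (size_t) -2)
        break;          /* incomplete character at end of prefix */
      p += n ? n : 1;   /* n == 0 cannot occur here (no NUL bytes) */
    }
  return false;
}
```

With LC_ALL=C the function returns right after the memchr call, so
the cost in that case should be about that of the old NUL-only test.
Whether skipping the check there is the right behavior is a separate
question.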