On Tue, Dec 8, 2020 at 8:05 AM Kyle Evans <kev...@freebsd.org> wrote: > > Author: kevans > Date: Tue Dec 8 14:05:25 2020 > New Revision: 368439 > URL: https://svnweb.freebsd.org/changeset/base/368439 > > Log: > src.opts.mk: switch to bsdgrep as /usr/bin/grep > > [.. snip ...] > > I have some WIP to make bsdgrep faster, but do not consider it a blocker > when compared to the pros of switching now (aforementioned bugs, licensing). > > [.. snip ...]
I was asked to collect some stats from that patch to speed up bsdgrep; while the patch isn't ready yet, I decided to do a (really really) rough comparison between gnugrep/bsdgrep as well to follow-up on the speed aspect and perhaps provide a baseline. You can view the results of those comparisons (user time(1) output), which felt 'representative enough' of the difference, here: https://people.freebsd.org/~kevans/stable/grep-stats.txt Some notes, to help with interpretation: - This hardware is not great - All runs were doing a recursive grep from the root of a non-active base/head checkout, -I was not specified, in search of instances of the same pattern (but actually literal) - ${grep}-non == ${grep} -r 'closefrom' . - ${grep}-n == ${grep} -nr 'closefrom' . - ${grep}-c8 == ${grep} -rC8 'closefrom' . The sampling was low enough quality that we can probably just discard all of this, but I found the final two comparisons (gnugrep vs. gnugrep -n vs. gnugrep -C8 and bsdgrep vs. bsdgrep -n vs. bsdgrep -C8) interesting enough that I decided to share this despite the quality. Here are the key points that I find interesting: gnugrep sees a pretty significant difference from the baseline to either of the other two modes. This was expected to some extent- both -n and -C8 will imply some level of line tracking when you're taking the chunked search approach, as you need to count lines even in chunks that don't have any matches for -n and you might even need to do the same for -C8. I think the much smaller difference between the gnugrep baseline and -C8 indicates that they probably don't take the simple/slow approach of counting all newlines to determine that you have 8 and where the 8th prior started, but instead wait for a match then start backtracking. The surprising part about the bsdgrep comparison was that there is significant slowdown when we're checking context. There is almost certainly room for improvement there. Thanks, Kyle Evans _______________________________________________ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"