On Wed, Aug 18, 2010, Dimitry Andric wrote:
> On 2010-08-18 22:48, m...@freebsd.org wrote:
> >> - Refactor file reading code to use pure syscalls and an internal buffer
> >>   instead of stdio. This gives BSD grep a very big performance boost,
> >>   its speed is now almost comparable to GNU grep.
> >
> > I didn't read all of the details in the profiling mails in the thread,
> > but does this mean that work on stdio would give a performance boost
> > to many apps? Or is there something specific about how grep(1) is
> > using its input that makes it a horse of a different color?
>
> Originally, it was reading files 1 character at a time, using fgetc(3),
> the locking version even. This is usually not the fastest way to read
> a large file with stdio. :)
>
> If grep did not have to support .gz or .bz2 files, we could just have
> plugged in stdio's fgetln(3). I tried this approach first on some
> non-compressed files, and it performed much better than fgetc'ing.
>
> The reading code that was now committed, is basically the same algorithm
> as fgetln() uses internally, but it can handle gzip and bzip2 input too.

The gzip limitations you refer to could perhaps be worked around with
a simple application of funopen(3). IIRC, the overhead inherent in
using fgetln(3) or getline(3) on reasonably long lines is very small;
if it's not, we should look at ways to improve stdio. There's still a
locking operation and memcpy() that can't really be avoided with stdio,
though. With getline(), you'd be able to delete most of file.c, but it
would never be quite as fast.
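
For illustration only, here is a minimal sketch of what a getline(3)
based read loop might look like. It is not the code in grep's file.c;
the function name count_matching_lines() is hypothetical, and the
fixed-string match via strstr(3) stands in for the real matcher. It
still pays for the stdio locking and the copy into the line buffer
mentioned above.

```c
#define _POSIX_C_SOURCE 200809L	/* expose getline(3) on glibc */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not taken from grep): count the lines in fp
 * that contain the fixed string needle.  Returns -1 on a read error.
 */
static long
count_matching_lines(FILE *fp, const char *needle)
{
	char *line = NULL;	/* getline() allocates and grows this */
	size_t cap = 0;		/* current capacity of line */
	ssize_t len;
	long matches = 0;

	assert(fp != NULL && needle != NULL);

	/* One stdio call per line instead of one fgetc() per byte. */
	while ((len = getline(&line, &cap, fp)) != -1) {
		/* strstr() stops at an embedded NUL; good enough here. */
		if (strstr(line, needle) != NULL)
			matches++;
	}
	free(line);
	return (ferror(fp) ? -1 : matches);
}
```

A caller would do something like count_matching_lines(stdin, "foo").
Reading compressed input this way would need a FILE * that decompresses
on the fly, which is where funopen(3) would come in.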