bug#32073: Improvements in Grep

2018-07-06 Thread Sergiu Hlihor
Hello, I'm using grep on Ubuntu Server 14.04 (grep version 2.16). While grepping large files I've noticed that grep is painfully slow. The bottleneck seems to be the read block size, which is extremely small (it looks like 64KB). For large files residing on big HDD RAID arrays, this request barely re
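
To make the read-size concern above concrete, here is a minimal sketch, not grep's actual code, that counts how many read requests it takes to consume the same file with a small and a large block size. The helper names `count_read_requests` and `make_sample_file` are hypothetical, and the 1 MiB sample file and the two block sizes are assumptions chosen for illustration. It counts requests only; it does not measure disk throughput.

```python
import os
import tempfile


def count_read_requests(path, block_size):
    """Read a whole file with os.read() calls of at most block_size bytes.

    Returns (total_bytes, number_of_non_empty_reads).
    """
    total = 0
    requests = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:  # an empty result means end of file
                break
            total += len(chunk)
            requests += 1
    finally:
        os.close(fd)
    return total, requests


def make_sample_file(size):
    """Create a temporary file holding `size` zero bytes; return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * size)
        return f.name


if __name__ == "__main__":
    path = make_sample_file(1024 * 1024)  # assumed sample size: 1 MiB
    try:
        for block_size in (64 * 1024, 1024 * 1024):
            total, requests = count_read_requests(path, block_size)
            print(f"{block_size:>8}-byte blocks: {requests} reads, {total} bytes")
    finally:
        os.remove(path)
```

Fewer, larger requests give the storage layer more to work with per call, which is the effect the report describes for large RAID arrays; whether it helps on a given machine also depends on the OS read-ahead settings.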

bug#32073: Improvements in Grep

2018-07-06 Thread Sergiu Hlihor
an average CPU load of 2-3% of the given machine under such low reads, therefore it can do much more if reading is optimized. On 7 July 2018 at 02:33, Jim Meyering wrote: > On Fri, Jul 6, 2018 at 9:26 AM, Sergiu Hlihor wrote: > > Hello, > > I'm using grep over Ubuntu Server 1

bug#32073: Improvements in Grep (Bug#32073)

2020-01-01 Thread Sergiu Hlihor
ink we should follow Coreutils' lead[0] and increase > > grep's initial buffer size from 32KiB, probably to 128KiB. > > I see that Jim later installed a patch increasing it to 96 KiB. > > Whatever number is chosen, it's "wrong" for some configuration. And I >

bug#32073: Improvements in Grep (Bug#32073)

2020-01-01 Thread Sergiu Hlihor
ds on the st_blksize member in struct stat. That will > typically be something like 4K - not nearly enough, given your description > below. > > Arnold > > Sergiu Hlihor wrote: > > > This topic is getting more and more frustrating. If you rely on OS, then > > you are at t
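
The `st_blksize` member of `struct stat` mentioned in the snippet above can be inspected directly. The sketch below is an illustration only: `preferred_io_block_size` and `choose_buffer_size` are hypothetical helpers, and the 128 KiB floor is an assumed policy for demonstration, not what grep or any other tool in the thread implements. `st_blksize` is available on POSIX systems.

```python
import os
import tempfile


def preferred_io_block_size(path):
    """Return the st_blksize value the OS reports for path (POSIX only)."""
    return os.stat(path).st_blksize


def choose_buffer_size(path, minimum=128 * 1024):
    """Hypothetical policy: use st_blksize, but never go below `minimum`."""
    return max(preferred_io_block_size(path), minimum)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        print("st_blksize:", preferred_io_block_size(f.name))
        print("chosen buffer size:", choose_buffer_size(f.name))
```

When the reported value is small, as the reply suggests it typically is, the floor dominates and the chosen buffer size is effectively a fixed default.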

bug#32073: Improvements in Grep (Bug#32073)

2020-01-01 Thread Sergiu Hlihor
is stupid. On Wed, 1 Jan 2020 at 20:42, Paul Eggert wrote: > On 1/1/20 1:15 AM, Sergiu Hlihor wrote: > > If you rely on OS, then > > you are at the mercy of whatever read ahead configuration you have. > > Right, and whatever changes you make to the OS and its read-ahead >

bug#32073: Improvements in Grep (Bug#32073)

2020-01-01 Thread Sergiu Hlihor
On Thu, 2 Jan 2020 at 01:51, Jim Meyering wrote: > On Wed, Jan 1, 2020 at 12:04 PM Sergiu Hlihor wrote: > > Paul, I have to correct you. On a production server you usually have a > mix of applications, often including databases. For databases, having a > read ahead means

bug#32073: Improvements in Grep (Bug#32073)

2020-01-01 Thread Sergiu Hlihor
now your usage pattern. I already had to fine-tune the database because of it. On Wed, 1 Jan 2020 at 21:24, wrote: > Hi. > > Sergiu Hlihor wrote: > > > Arnold, there is no need to write user code; it is already done in > > benchmarks. One of the standard benchmarks when testing HDDs

bug#32073: Improvements in Grep (Bug#32073)

2020-01-02 Thread Sergiu Hlihor
developers forget about it. But as I said, a large default is very likely enough. On Thu, 2 Jan 2020 at 08:20, wrote: > Hi. > > Sergiu Hlihor wrote: > > > Hi Arnold, > > If AWKBUFSIZE translates to disk IO request size then it is already what > > is needed. However i