On Tue, 10 Jun 2025, Nathan Bossart wrote:
> So, fseeko() starts winning around 4096 bytes. On macOS, the differences
> aren't quite as dramatic, but 4096 bytes is the break-even point there,
> too. I imagine there's a buffer around that size somewhere...
>
> This doesn't fully explain the results you are seeing, but it does seem to
> validate the idea. I'm curious if you see further improvement with even
> lower thresholds (e.g., 8KB, 16KB, 32KB).

By the way, I might have set the threshold to 1MB in my program, but
lowering it won't show a difference in my test case, since the lseek()s I
was noticing before the patch were mostly 8-16KB forward. I'm not sure what
the defining factor for that is. Maybe the compression algorithm, or how
wide the table is?
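
For reference, here is a minimal sketch of the idea being discussed: skip
short distances by reading and discarding, and only call fseeko() once the
distance reaches a threshold. This is not the code from the patch. The
function name skip_forward(), the SKIP_READ_THRESHOLD constant and the
4096-byte scratch buffer are all hypothetical, and the 4096 default is just
the break-even point quoted above, not a tuned value.

```c
#define _POSIX_C_SOURCE 200809L	/* for fseeko() and off_t */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical cutoff: only the break-even point quoted in this thread. */
#define SKIP_READ_THRESHOLD 4096

/*
 * Advance fp by nbytes from the current position.
 *
 * Distances below "threshold" are consumed with fread() into a scratch
 * buffer; anything at or above it uses fseeko(). Returns 0 on success,
 * -1 on a seek error or if the file ends before nbytes were skipped.
 */
static int
skip_forward(FILE *fp, off_t nbytes, off_t threshold)
{
	char		buf[4096];

	assert(nbytes >= 0);

	if (nbytes >= threshold)
		return fseeko(fp, nbytes, SEEK_CUR);

	while (nbytes > 0)
	{
		size_t		want = nbytes < (off_t) sizeof(buf) ? (size_t) nbytes : sizeof(buf);
		size_t		got = fread(buf, 1, want, fp);

		if (got == 0)
			return -1;			/* EOF or read error before we got there */
		nbytes -= (off_t) got;
	}
	return 0;
}
```

With a threshold of 4096, the 8-16KB skips I mentioned would all take the
fseeko() branch; they would only go through the read path with a threshold
above 16KB, e.g. skip_forward(fp, 12 * 1024, 32 * 1024).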
Dimitris