On May 11, 2025, at 6:50 AM, Klaus Kusche <klaus.kus...@computerix.info> wrote:
>
> I regularly backup hundreds of thousands of very small files with tar.
> Currently, this results in many very small sequential read requests.
Are these small read requests occurring because the files are small? Or is
tar deliberately making small read requests? A few possible experiments:

* Use a larger request size when reading file data. 64k or 128k, perhaps?
* Try mmap-ing the input files, either relying on the kernel’s read-ahead
  logic or having a background thread that reads a single byte every 4k or
  so to prompt page-ins ahead of the main thread.
* Compare how `star` performs; it uses a very different buffering
  architecture which may uncover other possibilities.

> (my tar reads and archives up to 2 GB/s when the input files
> are GB-sized, including on-the-fly compression).

This suggests the real issue may be opening the files rather than reading
them. That is, you may be seeing small read requests from the filesystem
code (reading directory pages and stat-ing files) rather than from reading
the file contents. That’s a very different problem.

Cheers,
Tim