https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030
--- Comment #25 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
(In reply to Jerry DeLisle from comment #24)
> On a different Ryzen machine:
>
> $ ./run.sh
>    1024   3.2604169845581055
>    2048   2.7804551124572754
>    4096   2.6416599750518799
>    8192   2.5986809730529785
>   16384   2.5525100231170654
>   32768   2.5145640373229980
>   65536   9.2993371486663818
>  131072   9.0313489437103271

Oops. That increase at 65536 might be an L1 cache effect.

Note: we are measuring only the transfer speed to cache here. Transfer
to an actual hard disk will be much slower. It is still relevant,
though, especially for the usual cycle of repeatedly calculating and
writing data: the OS can then sync the data to disk at its leisure
while the next calculation is running.

So, what would be a good strategy to select a block size?