On Tue, Nov 5, 2024 at 01:12, Michael Paquier <mich...@paquier.xyz> wrote:

> On Tue, Nov 05, 2024 at 04:23:34PM +1300, David Rowley wrote:
> > I tried your optimisation in the attached allzeros.c and here are my
> > results:
> >
> > # My version
> > $ gcc allzeros.c -O2 -o allzeros && for i in {1..3}; do ./allzeros; done
> > char: done in 1543600 nanoseconds
> > size_t: done in 196300 nanoseconds (7.86347 times faster than char)
> >
> > # Ranier's optimization
> > $ gcc allzeros.c -O2 -D RANIERS_OPTIMIZATION -o allzeros && for i in
> > size_t: done in 531700 nanoseconds (3.6545 times faster than char)
> > char: done in 1957200 nanoseconds
>
> I am not seeing numbers as good as yours, but the winner is clear as
> well here:

Thanks for testing.

> $ gcc allzeros.c -O2 -o allzeros && for i in {1..3}; do ./allzeros; done
> char: done in 6578995 nanoseconds
> size_t: done in 829916 nanoseconds (7.9273 times faster than char)
> char: done in 6581465 nanoseconds
> size_t: done in 829948 nanoseconds (7.92997 times faster than char)
> char: done in 6585748 nanoseconds
> size_t: done in 834929 nanoseconds (7.88779 times faster than char)
>
> $ gcc allzeros.c -O2 -D RANIERS_OPTIMIZATION -o allzeros && for i in {1..3}; do ./allzeros; done
> char: done in 6591803 nanoseconds
> size_t: done in 1236102 nanoseconds (5.33273 times faster than char)
> char: done in 6606219 nanoseconds
> size_t: done in 1235979 nanoseconds (5.34493 times faster than char)
> char: done in 6594413 nanoseconds
> size_t: done in 1238770 nanoseconds (5.32336 times faster than char)
>
> I'm surprised to see that assigning aligned_end at these two different
> locations has this much effect once the compiler optimizes the
> surroundings, but well.

I think that's a plus point for the benefit of not touching the memory
if it's not explicitly necessary.

best regards,
Ranier Vilela
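
The attached allzeros.c is not reproduced in this message. For readers following along without it, here is a minimal sketch of the two approaches the "char" and "size_t" timings presumably compare: a byte-at-a-time zero check and a word-at-a-time one. The function names allzeros_char and allzeros_size_t are hypothetical, and computing aligned_end after the leading-byte loop is only one possible placement; this sketch does not try to reproduce either of the two variants benchmarked above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Baseline: check one byte per iteration (hypothetical name). */
static bool
allzeros_char(const void *ptr, size_t len)
{
    const unsigned char *p = (const unsigned char *) ptr;
    const unsigned char *end = p + len;

    assert(ptr != NULL);

    for (; p < end; p++)
    {
        if (*p != 0)
            return false;
    }
    return true;
}

/* Word-at-a-time variant: check sizeof(size_t) bytes per iteration
 * over the aligned middle of the buffer (hypothetical name). */
static bool
allzeros_size_t(const void *ptr, size_t len)
{
    const unsigned char *p = (const unsigned char *) ptr;
    const unsigned char *end = p + len;
    const unsigned char *aligned_end;

    assert(ptr != NULL);

    /* Leading bytes, up to the first size_t-aligned address. */
    while (p < end && ((uintptr_t) p % sizeof(size_t)) != 0)
    {
        if (*p++ != 0)
            return false;
    }

    /* Round "end" down to a size_t boundary. It is computed only here,
     * after the leading bytes have already been checked. */
    aligned_end = end - ((uintptr_t) end % sizeof(size_t));

    /* Whole words. memcpy keeps the load free of strict-aliasing
     * problems; compilers typically reduce it to a single load. */
    for (; p < aligned_end; p += sizeof(size_t))
    {
        size_t word;

        memcpy(&word, p, sizeof(word));
        if (word != 0)
            return false;
    }

    /* Trailing bytes that do not fill a whole word. */
    for (; p < end; p++)
    {
        if (*p != 0)
            return false;
    }
    return true;
}
```

Both functions return true only when every one of the len bytes is zero, so they can be swapped freely in a timing loop such as the one the benchmark output suggests.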