Stuart Henderson <stu.li...@spacehopper.org> wrote: > On 2021-11-08, Otto Moerbeek <o...@drijf.net> wrote: > > On Sun, Nov 07, 2021 at 08:13:36PM -0600, Luke Small wrote: > > > >> https://bugs.llvm.org/show_bug.cgi?id=50026 > >> > >> I reported it to the llvm people. it is two slightly different quicksort > >> algorithms which perform radically differently. The one which you could > >> assume would take more time, performs MUCH better. > >> > >> I made a custom quicksort algorithm which outperforms qsort by A LOT for > >> sorting an array of around 300 randomly created unsigned characters, which > >> is what I use it for. > >> > >> if you read the report a guy said there's a 10% difference for sorting 3 > >> million characters on freebsd, but there's about 40% performance difference > >> on OpenBSD. maybe it's also how the OpenBSD team modified clang to prevent > >> rop chain stuff or something? I'm using a westmere based intell server. > >> > >> it's the same for clang 11. > >> > >> What's the deal? > > > > 1. Your test is too small. I increased the test size to get less error > > in measurement. I also changed the code to either run quicksort or > > quicksort depending on an argument. > > > > 2. I confirmed your observation using time on amd64, arm64 however > > shows a difference more in line with expectations: > > > > [otto@mc:8]$ time ./a.out A > > quicksort > > time = 0.335777021 > > 0m00.35s real 0m00.35s user 0m00.01s system > > [otto@mc:9]$ time ./a.out B > > quicksort1 > > time = 0.345703033 > > 0m00.36s real 0m00.36s user 0m00.01s system > > > > > > I suspect some non-optimal decision of the optimizer on amd64 for the > > first quicksort. > > > > A next step could be somebody looking at the code produced by the > > compiler. That is not going to be me. > > This is another example of code quite badly affected by fixup-gadgets. > With an increased test size and building separate objects for each of > the two tests and with/without CFLAGS=-fno-fixup-gadgets: > > $ for i in quicksort*; do echo $i; time ./$i; done > > quicksort > 0m02.33s real 0m02.32s user 0m00.00s system > quicksort-no-fixup-gadgets > 0m01.29s real 0m01.27s user 0m00.00s system > quicksort1 > 0m01.06s real 0m00.94s user 0m00.02s system > quicksort1-no-fixup-gadgets > 0m01.08s real 0m00.87s user 0m00.01s system > >
The word everyone is looking for is "tradeoff", meaning this is intentional.