> That's worth trying out. It might also then be worth trying to push both 
> unordered values -- the big one up / the small one down. I've seen other 
> implementations do that, but don't remember where, or what it's called.

It is important that we do not do 2 compares two avoid one copy (assignment to 
temporary) as you did in your patch earlier in this thread, cause compares are 
usually pretty costly (also two compares are then inlined, bloating the binary).
Assigning a sort tuple to a temporary translates to pretty simple assembly 
code, so my suggested modification should outperform. It cuts down the cost of 
the inner loop by ca. 40% comparing the assembly. And it avoids having to 
re-read memory on each comparison, as the temporary can be held in registers.

> "Unbounded" means no bounds check on the array. That's not possible in its 
> current form, so I think you misunderstood something.

Sorry for the confusion. I didn't mean unbounded in the "array bound checking" 
sense, but in the "unrestricted number of loops" sense.

> I only remember implementations tracking loop iterations, not swaps. You'd 
> need evidence that this is better.

Well, the idea was to include the presorted check somehow. Stopping after a 
certain number of iterations is surely more safe than stopping after a number 
of swaps, but we would then implicitly also stop our presort check. We could 
change that though: Count loop iterations and on bailout continue with a pure 
presort check, but from the last position of the insertion sort -- not all over 
again -- by comparing against the maximum value recorded during the insertion 
sort. Thoughts?

> An important part not mentioned yet: This might only be worth doing if the 
> previous recursion level performed no swaps during partitioning and the 
> current pivot candidates are ordered.

Agreed.

> It's currently 7, but should really be something like 10. A script that 
> repeats tests for, say, 7 through 18 should show a concave-up shape in the 
> times. The bottom of the bowl should shift to higher values, and that minimum 
> is what should be compared.

Yeah, as alluded to before, it should be closer to 10 nowadays.
In any case it should be changed at least from 7 to 8, cause then we could at 
least discard the additional check for n > 7 in the quicksort code path (see 
/src/include/lib/sort_template.h#L322). Currently we check n < 7 and a few 
lines down we check for n > 7, if we check n < 8 for insertion sort then the 
second check becomes obsolete.

Benjamin Coutu
http://www.zeyos.com


Reply via email to