Thanks for your quick feedback! """ Industrial implementation of selected sorting algorithm: The industrial version is basically an optimization based on the benchmark implementation. I plan to use optimizations like checking if input array is already sorted or applying insertion sort directly for short arrays to see if the performance can be improved. I am still looking for other common optimization methods.
""" What I am trying to say here is that similar optimizations can be applied to novel algorithms or other implementations of quicksort. The paper about "Dual-Pivot Quicksort" is really helpful and it has a Java implementation already. I can definitely make use of that. Also, I am wondering what's the normal approach to testing if a certain sorting implementation brings performance gain in this project? More specifically, You mentioned that little progress has been made with novel algorithmic enhancement. How can we get that conclusion? I am still working on my proposal and I will post a new version soon:) Regards, Kefan 2018-03-17 18:05 GMT-07:00 Peter Geoghegan <p...@bowt.ie>: > On Sat, Mar 17, 2018 at 5:34 PM, Kefan Yang <starord...@gmail.com> wrote: > > I am Kefan Yang, a third-year Computing Science student from Simon Fraser > > University, Canada. I am very interested in the sorting algorithm > > benchmarking and implementation issue you mentioned on the idealist of > > Google Summer of Code 2018. > > > I've attached a proposal draft to this email. It's a brief one but I > guess > > it's enough to show my current ideas:) > > The proposal reads: > > """ > Industrial implementation of selected sorting algorithm: > The industrial version is basically an optimization based on the benchmark > implementation. I plan to use optimizations like checking if input > array is already sorted > or applying insertion sort directly for short arrays to see if the > performance can be > improved. I am still looking for other common optimization methods. > > """ > > But our qsort implementation, which is the Bentley & McIlroy > implementation with a couple of customizations, already does > everything you mention (I refer to the precheck algorithmic > customization, and the way that we check to see which side of a pivot > to use real recursion with to avoid stack overflows). I suggest that > you read the paper [1] -- the code that we use is almost directly > lifted from the paper. The opaque sounding variable names are the > same, and the C code from the paper is structured in exactly the same > way. > > I think that this won't be a particularly easy project to get > committed. I suggest that if you go forward with it that you > specifically look into integrating "Dual-Pivot Quicksort" [2] as a > whole cloth replacement for the B&M implementation. It seems like this > has some chance of working out, because it's more or less acknowledged > by Bentley himself to be a kind of successor to his own industrial > quicksort implementation [3] -- it's derived from it. Note that Java 7 > uses dual pivot quicksort when sorting integers. > > In general, we've had a lot of success with optimizing sorting in the > past few years by focusing on things like avoiding cache misses in > comparators. There has been much less success with algorithmic > improvements, and no success at all with novel algorithmic > enhancements. PostgreSQL development just isn't a great way to do that > sort of thing. > > BTW, I noticed that you go on to say this: > > """ > However, > since disk operation is much expensive than in-memory sorting, I am > not sure if we can > observe a significant difference in this way. > > """ > > I think that you'd be surprised at how often this isn't true these > days, at least when sorting enough data for spilling to disk to be > relevant. The main reason for that is that the cost of writing out > runs increases linearly, and therefore eventually becomes very small > compared to the costs of sorting itself, which increases at a > linearithmic rate. > > [1] http://cs.fit.edu/~pkc/classes/writing/samples/ > bentley93engineering.pdf > [2] http://codeblab.com/wp-content/uploads/2009/09/DualPivotQuicksort.pdf > [3] http://mail.openjdk.java.net/pipermail/core-libs-dev/2009- > September/002630.html > -- > Peter Geoghegan >