[ https://issues.apache.org/jira/browse/ARROW-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282030#comment-17282030 ]
Kirill Lykov edited comment on ARROW-10899 at 2/10/21, 10:08 AM:
-----------------------------------------------------------------

Right, I will also check spinsort from boost.

I would add why spreadsort is not stable: if the array is small or the integer type doesn't fit into boost::uintmax_t, it falls back to pdqsort, which is unstable. Otherwise it uses radix sort, which is what we want. One solution is to implement stable_spinsort, which will use something else instead of pdqsort.

Below is the plot for these sorting algorithms on int64_t (the one above is for int32_t).

!Screen Shot 2021-02-10 at 10.58.23.png!

It looks like stable spinsort is not better than std::stable_sort, and this is expected.

I believe that if we primarily want to sort keys of 64 bits or fewer, it makes sense to look into a spreadsort-like implementation. For that, it is possible to reuse some code from [boost::sort::spreadsort::detail|https://github.com/boostorg/sort/blob/develop/include/boost/sort/spreadsort/detail/integer_sort.hpp], but this might be a bad idea since it is not part of the public interface.

was (Author: klykov):
Right, I will check also spinsort from boost

> [C++] Investigate radix sort for integer arrays
> -----------------------------------------------
>
>                 Key: ARROW-10899
>                 URL: https://issues.apache.org/jira/browse/ARROW-10899
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: C++
>            Reporter: Antoine Pitrou
>            Assignee: Kirill Lykov
>            Priority: Major
>         Attachments: Screen Shot 2021-02-09 at 17.48.13.png, Screen Shot 2021-02-10 at 10.58.23.png
>
>
> For integer arrays with a non-tiny range of values, we currently use a stable
> sort. It may be faster to use a radix sort instead.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)