[ 
https://issues.apache.org/jira/browse/ARROW-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282030#comment-17282030
 ] 

Kirill Lykov edited comment on ARROW-10899 at 2/10/21, 10:08 AM:
-----------------------------------------------------------------

Right, I will also check spinsort from Boost. To add why spreadsort is not 
stable: if the array is small, or the integer type does not fit into 
boost::uintmax_t, it falls back to pdqsort, which is unstable. Otherwise it 
uses radix sort, which is what we want.
 One solution is to implement a stable_spinsort that uses something else 
instead of pdqsort. 
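
As a rough illustration of the stability point above, here is a minimal sketch 
(not Boost's or Arrow's code) of an LSD radix sort on int64_t keys whose 
small-input fallback is std::stable_sort rather than an unstable sort. The 
names StableRadixSort, OrderPreservingKey and Entry, the 8-bit digit width and 
the threshold of 64 are all assumptions made for this example.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical element type: a sort key plus a payload (e.g. the original
// index), so that stability is observable.
using Entry = std::pair<std::int64_t, std::size_t>;

// Flip the sign bit so that unsigned comparison of the result matches signed
// comparison of the input.
inline std::uint64_t OrderPreservingKey(std::int64_t v) {
  return static_cast<std::uint64_t>(v) ^ (std::uint64_t{1} << 63);
}

// Sorts by Entry::first only and keeps equal keys in their input order.
// small_threshold is an arbitrary value chosen for illustration.
inline void StableRadixSort(std::vector<Entry>& data,
                            std::size_t small_threshold = 64) {
  if (data.size() < small_threshold) {
    // Stable fallback for small inputs, instead of an unstable sort.
    std::stable_sort(
        data.begin(), data.end(),
        [](const Entry& a, const Entry& b) { return a.first < b.first; });
    return;
  }
  std::vector<Entry> buffer(data.size());
  // Eight passes over 8-bit digits, least significant digit first.
  for (int shift = 0; shift < 64; shift += 8) {
    std::array<std::size_t, 257> offsets{};
    // Histogram of digit values, stored one slot to the right.
    for (const Entry& e : data) {
      ++offsets[((OrderPreservingKey(e.first) >> shift) & 0xFF) + 1];
    }
    // Prefix sum: offsets[d] becomes the first output slot for digit d.
    for (std::size_t i = 1; i < offsets.size(); ++i) {
      offsets[i] += offsets[i - 1];
    }
    // Scatter in input order, which is what makes each pass stable.
    for (const Entry& e : data) {
      buffer[offsets[(OrderPreservingKey(e.first) >> shift) & 0xFF]++] = e;
    }
    data.swap(buffer);
  }
}
```

Each pass is a counting sort that scatters elements in input order, so the 
whole sort is stable as long as the fallback is stable too.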

Below is the plot for these sorting algorithms on int64_t (the one above is 
for int32_t).

!Screen Shot 2021-02-10 at 10.58.23.png!

It looks like stable spinsort is not better than std::stable_sort, which is 
expected. 
 I believe that if we primarily want to sort keys of 64 bits or fewer, it 
makes sense to look into a spreadsort-like implementation. For that, it is 
possible to reuse some code from boost::sort::spreadsort::detail 
(https://github.com/boostorg/sort/blob/develop/include/boost/sort/spreadsort/detail/integer_sort.hpp), 
but this might be a bad idea since it is not part of the public interface.


was (Author: klykov):
Right, I will check also spinsort from boost

> [C++] Investigate radix sort for integer arrays
> -----------------------------------------------
>
>                 Key: ARROW-10899
>                 URL: https://issues.apache.org/jira/browse/ARROW-10899
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: C++
>            Reporter: Antoine Pitrou
>            Assignee: Kirill Lykov
>            Priority: Major
>         Attachments: Screen Shot 2021-02-09 at 17.48.13.png, Screen Shot 
> 2021-02-10 at 10.58.23.png
>
>
> For integer arrays with a non-tiny range of values, we currently use a stable 
> sort. It may be faster to use a radix sort instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
