[ https://issues.apache.org/jira/browse/ARROW-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17661486#comment-17661486 ]
Rok Mihevc commented on ARROW-4465:
-----------------------------------

This issue has been migrated to [issue #21022|https://github.com/apache/arrow/issues/21022] on GitHub. Please see the [migration documentation|https://github.com/apache/arrow/issues/14542] for further details.

> [Rust] [DataFusion] Add support for ORDER BY
> --------------------------------------------
>
>                 Key: ARROW-4465
>                 URL: https://issues.apache.org/jira/browse/ARROW-4465
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust, Rust - DataFusion
>            Reporter: Andy Grove
>            Priority: Major
>
> As a user, I would like to be able to specify an ORDER BY clause on my query.
> Work involved:
> * Add OrderBy to LogicalPlan enum
> * Write query planner code to translate SQL AST to OrderBy (SQL parser that we use already supports parsing ORDER BY)
> * Implement SortRelation
> My high level thoughts on implementing the SortRelation:
> * Create Arrow array of uint32 same size as batch and populate such that each element contains its own index i.e. array will be 0, 1, 2, 3....
> * Find a Rust crate for sorting that allows us to provide our own comparison lambda
> * Implement the comparison logic (probably can reuse existing execution code - see filter.rs for how it implements comparison expressions)
> * Use index array to store the result of the sort i.e. no need to rewrite the whole batch, just the index
> * Rewrite the batch after the sort has completed
> It would also be good to see how Gandiva has implemented this

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
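
The index-based sort described in the SortRelation notes can be sketched as follows. This is only a rough illustration under stated assumptions, not the DataFusion implementation: `Column`, `SortKey`, `sort_indices` and `sort_batch` are hypothetical names, plain `Vec`s stand in for Arrow arrays, and the standard library's `slice::sort_by` is used in place of an external sorting crate because it already accepts a comparison closure.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for one column of a record batch
/// (plain Vecs here instead of Arrow arrays).
#[derive(Debug, Clone, PartialEq)]
enum Column {
    Int(Vec<i64>),
    Str(Vec<String>),
}

impl Column {
    fn len(&self) -> usize {
        match self {
            Column::Int(v) => v.len(),
            Column::Str(v) => v.len(),
        }
    }

    /// Compare the values stored at rows `a` and `b` of this column.
    fn compare(&self, a: usize, b: usize) -> Ordering {
        match self {
            Column::Int(v) => v[a].cmp(&v[b]),
            Column::Str(v) => v[a].cmp(&v[b]),
        }
    }

    /// Build a new column by gathering rows in the order given by `indices`.
    fn take(&self, indices: &[u32]) -> Column {
        match self {
            Column::Int(v) => {
                Column::Int(indices.iter().map(|&i| v[i as usize]).collect())
            }
            Column::Str(v) => {
                Column::Str(indices.iter().map(|&i| v[i as usize].clone()).collect())
            }
        }
    }
}

/// Hypothetical ORDER BY key: a column position plus a direction.
struct SortKey {
    column: usize,
    ascending: bool,
}

/// Steps 1-4 of the notes: create the index array 0, 1, 2, ... and sort
/// only the indices, comparing the rows they point at.
fn sort_indices(batch: &[Column], keys: &[SortKey]) -> Vec<u32> {
    let num_rows = batch.first().map_or(0, Column::len);
    let mut indices: Vec<u32> = (0..num_rows as u32).collect();
    indices.sort_by(|&a, &b| {
        for key in keys {
            let ord = batch[key.column].compare(a as usize, b as usize);
            let ord = if key.ascending { ord } else { ord.reverse() };
            if ord != Ordering::Equal {
                return ord;
            }
        }
        Ordering::Equal
    });
    indices
}

/// Step 5 of the notes: rewrite the whole batch once, after the sort.
fn sort_batch(batch: &[Column], keys: &[SortKey]) -> Vec<Column> {
    let indices = sort_indices(batch, keys);
    batch.iter().map(|column| column.take(&indices)).collect()
}
```

Sorting only the `u32` indices keeps the comparator cheap to swap and defers copying column data until the order is final. `sort_by` is a stable sort, so rows that compare equal on every key keep their original relative order.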