Hi Weston,
I have updated the pull request based on your feedback. Additionally, we
have added ordering to the Python implementation, which requires your
review.
Could you please check if these changes resolve the failure?
On 2024-10-03 at 14:04, Kamil Tokarek wrote:
Hello,
I would like to raise the subject of ordering. Currently, it is not
possible to assign implicit ordering in the scan node. Such an option
has been added to other nodes [0]. This problem is mentioned here [1]. I
have started to work on it [2], but I am unsure how to move forward
because I did not find any clear roadmap for ordering in general.
This also affects the asof-join node. Since the node relies on ordered
data and the dataset asserts no implicit ordering, it causes obscure
errors in threaded execution [3]. The asof-join node also does not
sequence its input. I fixed this node by inserting a
SerialSequencingQueue in the asof-join node [5] and adding implicit
ordering in the dataset. In general, I think asof-join should require
implicit ordering (or any kind of ordering) on all inputs. I created a
pull request with the following changes [7]:
1. Assert implicit ordering when the
ScanNodeOptions.require_sequenced_output option is enabled. [4]
2. Add a SerialSequencingQueue to the asof-join node inputs and require
implicit ordering on input [5]
3. Modify asof-join node tests to test threaded operation [6]
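To make the sequencing idea in items 1 and 2 concrete, here is a minimal
Python sketch. It is not the actual C++ SerialSequencingQueue; the class
name ReorderBuffer and its insert method are hypothetical, and it assumes
each batch carries a sequential index starting at 0. Batches that arrive
early (as can happen in threaded execution) are held back until every
earlier index has been delivered.

```python
class ReorderBuffer:
    """Hypothetical illustration: release batches strictly in index order."""

    def __init__(self):
        self._pending = {}     # index -> batch, for batches that arrived early
        self._next_index = 0   # index of the next batch allowed downstream

    def insert(self, index, batch):
        """Store a batch and return the batches that are now ready, in order."""
        if index < self._next_index or index in self._pending:
            raise ValueError(f"duplicate or stale batch index: {index}")
        self._pending[index] = batch
        ready = []
        # Drain every consecutive batch starting at the expected index.
        while self._next_index in self._pending:
            ready.append(self._pending.pop(self._next_index))
            self._next_index += 1
        return ready


if __name__ == "__main__":
    buf = ReorderBuffer()
    # Simulate out-of-order arrival from several threads.
    for idx in (2, 0, 1, 3):
        print(idx, buf.insert(idx, f"batch-{idx}"))
```

With this kind of buffer in front of each input, a consumer that relies on
ordered data only ever sees batches in their original order, at the cost
of buffering whatever arrives ahead of a missing index.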
Could you please review my code? I would appreciate any feedback to
help improve it. Thanks in advance for your feedback.
[0] https://github.com/apache/arrow/pull/34137/commits/bcc1692dbeb5693508ea89e961b4eaf91170d71d
[1] https://github.com/apache/arrow/issues/34698
[2] https://github.com/mroz45/arrow/commits/Ordering/
[3] https://github.com/apache/arrow/issues/41706
[4] https://github.com/mroz45/arrow/commit/7a14586b83641d1bfa1b037f3f2377eb6c911f55
[5] https://github.com/apache/arrow/pull/44083/commits/c8047bb8e3d8c83f12070507f3cdc43cb6ee6152
[6] https://github.com/apache/arrow/pull/44083/commits/59da79331aefda3fc434e74eb1458cd0e195c879
[7] https://github.com/apache/arrow/pull/44083
Related issue: https://github.com/apache/arrow/issues/26818