Hi Weston,

I have updated the pull request based on your feedback. Additionally, we have added ordering to the Python implementation, which requires your review.

Could you please check if these changes resolve the failure?

On 2024-10-03 at 14:04, Kamil Tokarek wrote:
Hello,
I would like to raise the subject of ordering. Currently, it is not possible to assign implicit ordering in the scan node. Such an option has been added in other nodes [0]. This problem is mentioned here [1]. I have started to work on it [2], but I am unsure how to move forward because I did not find any clear roadmap for ordering in general.

This also affects the asof-join node. Since the node relies on ordered data and the dataset asserts no implicit ordering, this causes obscure errors in threaded execution [3]. The asof-join node also does not sequence its input. I fixed this by inserting a SerialSequencingQueue in the asof-join node [5] and adding implicit ordering in the dataset. In general, I think asof-join should require implicit ordering (or any kind of ordering) on all inputs. I created a pull request with the following changes [7]:

1. Assert implicit ordering when the ScanNodeOptions.require_sequenced_output option is enabled [4]

2. Add a SerialSequencingQueue to the asof-join node inputs and require implicit ordering on the inputs [5]

3. Modify the asof-join node tests to cover threaded operation [6]
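
To make the sequencing idea in change 2 concrete, here is a minimal, single-threaded sketch in Python. It is not the Arrow implementation: the class name SequencingQueue, the insert method, and the emit callback are hypothetical names chosen for illustration, and it assumes each batch carries a unique index starting at 0. Batches that arrive out of order are held back until every earlier batch has been emitted, which is the property the asof-join node needs from its inputs.

```python
import heapq


class SequencingQueue:
    """Hypothetical, simplified illustration of re-sequencing batches.

    Not Arrow's SerialSequencingQueue: no locking, no backpressure.
    Assumes batch indices are unique and start at 0.
    """

    def __init__(self, emit):
        self._emit = emit        # callback that receives batches in index order
        self._next_index = 0     # index of the next batch allowed to leave
        self._pending = []       # min-heap of (index, batch) waiting for their turn

    def insert(self, index, batch):
        heapq.heappush(self._pending, (index, batch))
        # Release every batch that is now contiguous with what was already emitted.
        while self._pending and self._pending[0][0] == self._next_index:
            _, ready = heapq.heappop(self._pending)
            self._emit(ready)
            self._next_index += 1


# Batches arrive in the order a threaded scan might deliver them.
out = []
queue = SequencingQueue(out.append)
for index, batch in [(2, "c"), (0, "a"), (1, "b"), (3, "d")]:
    queue.insert(index, batch)
print(out)  # ['a', 'b', 'c', 'd']
```

A real implementation running under threaded execution would also need to guard the queue with a lock and decide which thread processes the released batches; the sketch leaves that out.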


Could you please review my code? I would appreciate any feedback that helps improve it. Thanks in advance.

[0] https://github.com/apache/arrow/pull/34137/commits/bcc1692dbeb5693508ea89e961b4eaf91170d71d
[1] https://github.com/apache/arrow/issues/34698
[2] https://github.com/mroz45/arrow/commits/Ordering/
[3] https://github.com/apache/arrow/issues/41706
[4] https://github.com/mroz45/arrow/commit/7a14586b83641d1bfa1b037f3f2377eb6c911f55
[5] https://github.com/apache/arrow/pull/44083/commits/c8047bb8e3d8c83f12070507f3cdc43cb6ee6152
[6] https://github.com/apache/arrow/pull/44083/commits/59da79331aefda3fc434e74eb1458cd0e195c879
[7] https://github.com/apache/arrow/pull/44083

Related issue:
https://github.com/apache/arrow/issues/26818
