xudong963 commented on PR #18868:
URL: https://github.com/apache/datafusion/pull/18868#issuecomment-3587692277

   > Thank you very much for this PR @xudong963
   > 
   > The idea is very cool, and a nicely done PR; I have several suggestions, 
but I think it is very close.
   > 
   > Suggestions:
   > 
   > 1. Rename the flag to `preserve_order`
   > 2. Make this new flag on TableScan visible in EXPLAIN plans so we have a 
better chance of understanding when it is being applied/not applied
   > 3. Implement an end to end test for the behavior you want to see 
(specifically, a test that shows a query on a parquet file and demonstrates 
that only a single row group is fetched with this flag, and more than one is 
fetched without the flag)
   > 
   > In my mind, 3 is the most important
   
   Thanks for the review @alamb . I addressed some suggestions:
   1. Renamed the flag to `preseerve_order` 
https://github.com/apache/datafusion/pull/18868/commits/ecbfff7768542fc79fc72e94119a06c9610d29e7
   2. Avoided the second recursive call in pushdown limit and maintained a 
state to decide if set `preserve_order` as true. 
https://github.com/apache/datafusion/pull/18868/commits/e919879cbf003df9fe89b58c94dc72af29ea9f72
   3. Extracted the logic that finds fully matched rg as a function: 
https://github.com/apache/datafusion/pull/18868/commits/647618a6c8e73540c82c12dc471370055f1840b2
   4. Added the end-to-end sqllogictests for limit pruning 
https://github.com/apache/datafusion/pull/18868/commits/63cd8783c3b9f7bdf699f94148430c4100541056
   5. Extracted the not simplifier as a new PR: 
https://github.com/apache/datafusion/pull/18970
   
   Things later the PR:
   1. Make a TableScanBuilder or something (as a follow on PR) and deprecate 
TableScan::try_new
   2. Add the `preserve_order` to explain. The reason I plan to do it in 
another PR is that I guess it'll make lots of changes for tests, which will 
expand the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to