Re: [PR] Sketch out a Morselize API [datafusion]

via GitHub Tue, 24 Mar 2026 11:38:45 -0700


Dandandan commented on PR #20820:
URL: https://github.com/apache/datafusion/pull/20820#issuecomment-4120554070


   > I have implemented the work stealing scheduler idea and while it seems to 
show promise it still clearly is not ready (given the results above)
   > 
   > [Three screenshots: "Screenshot 2026-03-24 at 7 24 53 AM", "Screenshot 
2026-03-24 at 7 26 13 AM", "Screenshot 2026-03-24 at 7 30 16 AM"]
   > 
   > I spent quite a long time messing with Q23 and saw widely varying results. 
I think this is due to the fact that Q23 is very sensitive to the order in 
which the files are processed (e.g. top-k / dynamic filtering)
   > 
   > However, I have also observed some flakiness in running tests and I think 
that is because some plans require a certain partitioning (e.g. to ensure data 
is passed across streams) and so having the FileStream process data across 
multiple partitions in this case causes incorrectness errors.
   > 
   > My plan is to ensure we don't enable work stealing for plans that require 
data not to cross partitions
   
   Looking at the screenshot, I can see that DuckDB seems to do the IO on the 
same thread where the processing happens (if your machine has 16 cores).
   
   DataFusion starts 50+ tokio blocking threads, depending on the number of 
concurrent IO requests, as each call to `spawn_blocking` will spawn a new 
thread if all existing blocking threads are busy.
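   
   To make the thread growth concrete, here is a rough toy model of an 
on-demand blocking pool. It is not tokio's implementation: `BlockingPoolModel` 
and `threads_after_burst` are hypothetical names, no real threads are started, 
and the model ignores that idle threads are eventually retired. It only mirrors 
the policy described above (reuse an idle thread, otherwise start a new one up 
to a cap, otherwise queue).

```rust
/// Toy model of an on-demand blocking thread pool.
/// NOT tokio's implementation; it only counts threads under the policy
/// "reuse an idle thread, else start a new one, else queue".
#[derive(Debug)]
struct BlockingPoolModel {
    idle: usize,        // threads alive but not running a task
    busy: usize,        // threads currently running a blocking task
    queued: usize,      // tasks waiting because the cap was reached
    max_threads: usize, // upper bound on threads (assumed configurable)
}

impl BlockingPoolModel {
    fn new(max_threads: usize) -> Self {
        Self { idle: 0, busy: 0, queued: 0, max_threads }
    }

    /// Total number of threads currently alive.
    fn threads(&self) -> usize {
        self.idle + self.busy
    }

    /// A blocking task is submitted.
    fn submit(&mut self) {
        if self.idle > 0 {
            // An idle thread picks up the task.
            self.idle -= 1;
            self.busy += 1;
        } else if self.threads() < self.max_threads {
            // All threads are busy: start a new one.
            self.busy += 1;
        } else {
            // Cap reached: the task has to wait.
            self.queued += 1;
        }
    }

    /// A running task finishes.
    fn finish(&mut self) {
        assert!(self.busy > 0, "no task is running");
        if self.queued > 0 {
            // The thread immediately takes a queued task and stays busy.
            self.queued -= 1;
        } else {
            self.busy -= 1;
            self.idle += 1;
        }
    }
}

/// Threads alive after `concurrent` blocking tasks overlap in time.
fn threads_after_burst(concurrent: usize, max_threads: usize) -> usize {
    let mut pool = BlockingPoolModel::new(max_threads);
    for _ in 0..concurrent {
        pool.submit();
    }
    pool.threads()
}
```

   In this model, `threads_after_burst(50, 512)` gives 50 threads, while a cap 
of 16 (one per core on a 16-core machine) gives 16 threads with the remaining 
34 tasks queued.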
   
   In my understanding, this is not ideal:
   
   * each thread will consume some memory
   * (more importantly) the data is probably loaded on a different CPU core 
than the one that needs it.
   
   It is probably not that big of a deal for Parquet (as decompressing / 
decoding Parquet might run at ~1-2GB/s and memory bandwidth is way higher).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

