Tushar7012 commented on issue #19971:
URL: https://github.com/apache/datafusion/issues/19971#issuecomment-3801309698

   Thanks for the feedback @BlakeOrth. You raise a valid point about the 
object_store machinery being inherently sequential for listing operations.
   
   A few notes:
   
   I accidentally committed some table.rs changes to the wrong PR (#19975 - 
ArrowBytesViewMap optimization). I've now reverted those changes there.
   
   Before investing more time on this, I'd like to understand:
   
   1. Are there specific scenarios where parallelizing list_files_for_scan 
would provide measurable benefits? (e.g., multiple table paths, or when 
statistics collection is the bottleneck rather than listing itself)
   2. Would it be more valuable to focus on parallelizing the statistics 
collection phase (which uses buffer_unordered) rather than the file listing 
phase?
   3. I can run some benchmarks on cold query performance to gather actual 
evidence of improvement (or lack thereof). Would that help inform whether this 
work is worth pursuing?
   
   Happy to hold off on this until we have clearer direction, or pivot to focus 
on areas where parallelization would have more impact.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


