BlakeOrth commented on issue #16365: URL: https://github.com/apache/datafusion/issues/16365#issuecomment-3165993099
I've been investigating performance when using the `ListingTable` with remote storage, and since `datafusion-cli` ultimately uses the `ListingTable` I'm curious if my findings might be interesting and related to this issue. Most of my focus has been on using the `ListingTable` with hive partitioned data. I observed there was a very large discrepancy in the performance between hive partitioned datasets and flat partitioned datasets, even when the number of underlying objects was similar. This was actually noted in an unresolved bug from about a year ago #9654 . I put a few simple targeted timing debug prints in the code and learned that in many cases most of the time spent for remote queries is dominated by listing the files in the backing object store. Due to the current implementation around listing files for hive partitioned data this can often be doubly true for such datasets (the depth of the partitioning structure impacts the number of round-trip latency costs that are paid every query). Here are some examples: ```sql DataFusion CLI v49.0.0 > CREATE EXTERNAL TABLE athena_partitioned STORED AS PARQUET LOCATION 's3://clickhouse-public-datasets/hits_compatible/athena_partitioned/'; 0 row(s) fetched. Elapsed 4.382 seconds. > select count(*) from athena_partitioned; file_list duration: 0.0110909995ms full group files duration: 2028.1296ms +----------+ | count(*) | +----------+ | 99997497 | +----------+ 1 row(s) fetched. Elapsed 2.031 seconds. > select count(*) from athena_partitioned; file_list duration: 0.0058609997ms -- NOTE: this is likely substantially faster due to server side caching of the list request -- A similar pattern can be seen for nearly all of these examples full group files duration: 156.28888ms +----------+ | count(*) | +----------+ | 99997497 | +----------+ 1 row(s) fetched. Elapsed 0.159 seconds. > select count(*) from 's3://clickhouse-public-datasets/hits_compatible/athena_partitioned/'; file_list duration: 0.004357ms full group files duration: 1681.8896ms +----------+ | count(*) | +----------+ | 99997497 | +----------+ 1 row(s) fetched. Elapsed 4.231 seconds. > select count(*) from 's3://clickhouse-public-datasets/hits_compatible/athena_partitioned/'; file_list duration: 0.004645ms full group files duration: 1461.2577ms +----------+ | count(*) | +----------+ | 99997497 | +----------+ 1 row(s) fetched. Elapsed 3.538 seconds. > CREATE EXTERNAL TABLE overture_maps STORED AS PARQUET LOCATION 's3://overturemaps-us-west-2/release/2025-07-23.0/'; 0 row(s) fetched. Elapsed 10.764 seconds. > select count(*) from overture_maps where type='address'; list_partitions_from_paths: listing from release/2025-07-23.0 list_partitions_from_paths: found 512 files list_partitions_from_paths: built 22 partitions Listed 22 partitions in 136.81548ms Pruning yielded 1 partitions in 0.147322ms file_list duration: 136.97224ms full group files duration: 353.54166ms +-----------+ | count(*) | +-----------+ | 446544475 | +-----------+ 1 row(s) fetched. Elapsed 0.360 seconds. > select count(*) from overture_maps where type='address'; list_partitions_from_paths: listing from release/2025-07-23.0 list_partitions_from_paths: found 512 files list_partitions_from_paths: built 22 partitions Listed 22 partitions in 181.16426ms Pruning yielded 1 partitions in 0.092711ms file_list duration: 181.26404ms full group files duration: 181.33081ms +-----------+ | count(*) | +-----------+ | 446544475 | +-----------+ 1 row(s) fetched. Elapsed 0.186 seconds. > select count(*) from overture_maps where type='address' and theme='addresses'; list_partitions_from_paths: listing from release/2025-07-23.0/theme=addresses/type=address list_partitions_from_paths: found 18 files list_partitions_from_paths: built 2 partitions Listed 2 partitions in 126.23396ms Pruning yielded 1 partitions in 0.084972ms file_list duration: 126.32848ms full group files duration: 164.76274ms +-----------+ | count(*) | +-----------+ | 446544475 | +-----------+ 1 row(s) fetched. Elapsed 0.170 seconds. ``` I have a POC that, at least initially, appears to normalize the performance between hive partitioned datasets and flat partitioned datasets and would allow both partitioned and flat datasets a like to leverage the existing `ListFilesCache` for users that create a `ListingTable` manually (such as myself). Would there be interest in looking further at the POC or discussing additional strategies for normalizing and reducing the amount of time spent listing objects? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org