BlakeOrth opened a new pull request, #18103:
URL: https://github.com/apache/datafusion/pull/18103
## Which issue does this PR close?
This PR does not fully close the issue below, but is an incremental building
block toward it:
- https://github.com/apache/datafusion/issues/17207
The full context of how this code is likely to progress can be seen in the
POC for this effort:
- https://github.com/apache/datafusion/pull/17266
## Rationale for this change
Continues filling out the set of methods covered by the instrumented object
store.
## What changes are included in this PR?
- Adds instrumentation for basic list operations to the instrumented
object store
- Adds test cases for the new code
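
For readers unfamiliar with the pattern, the idea of instrumenting a list call can be sketched roughly as below. This is a std-only illustration, not the code in this PR: the `Store` trait, `InMemoryStore`, `InstrumentedStore`, `RequestDetails`, and `take_requests` are hypothetical stand-ins and do not reflect the real `object_store` or DataFusion APIs. The wrapper records one entry per `list` call and then delegates to the inner store. Because listing yields results lazily, no duration or size is recorded for it here, which matches the `operation=List` lines in the example output further down. The timestamp and byte range that appear in the real trace lines are left out for brevity.

```rust
use std::fmt;
use std::sync::Mutex;

/// Hypothetical operation kinds (not the real DataFusion enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operation {
    Get,
    List,
}

/// One recorded request; duration and size are optional because a lazy
/// listing does not know them when the call is made.
#[derive(Debug, Clone, PartialEq)]
struct RequestDetails {
    op: Operation,
    path: String,
    duration_secs: Option<f64>,
    size: Option<usize>,
}

impl fmt::Display for RequestDetails {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "operation={:?}", self.op)?;
        if let Some(d) = self.duration_secs {
            write!(f, " duration={d:.6}s")?;
        }
        if let Some(s) = self.size {
            write!(f, " size={s}")?;
        }
        write!(f, " path={}", self.path)
    }
}

/// Stand-in for an object store that can list keys under a prefix.
trait Store {
    fn list(&self, prefix: &str) -> Box<dyn Iterator<Item = String> + '_>;
}

/// Trivial in-memory store used only to exercise the wrapper.
struct InMemoryStore {
    keys: Vec<String>,
}

impl Store for InMemoryStore {
    fn list(&self, prefix: &str) -> Box<dyn Iterator<Item = String> + '_> {
        let prefix = prefix.to_string();
        Box::new(
            self.keys
                .iter()
                .filter(move |k| k.starts_with(prefix.as_str()))
                .cloned(),
        )
    }
}

/// Wrapper that records every `list` call before delegating to `inner`.
struct InstrumentedStore<S> {
    inner: S,
    requests: Mutex<Vec<RequestDetails>>,
}

impl<S: Store> InstrumentedStore<S> {
    fn new(inner: S) -> Self {
        Self { inner, requests: Mutex::new(Vec::new()) }
    }

    /// Returns the recorded requests and clears the internal buffer.
    fn take_requests(&self) -> Vec<RequestDetails> {
        std::mem::take(&mut *self.requests.lock().unwrap())
    }
}

impl<S: Store> Store for InstrumentedStore<S> {
    fn list(&self, prefix: &str) -> Box<dyn Iterator<Item = String> + '_> {
        // Record the request up front; the listing itself is consumed lazily.
        self.requests.lock().unwrap().push(RequestDetails {
            op: Operation::List,
            path: prefix.to_string(),
            duration_secs: None,
            size: None,
        });
        self.inner.list(prefix)
    }
}
```

In this sketch, wrapping an `InMemoryStore`, calling `list("data/")`, and then calling `take_requests()` yields a single entry that formats as `operation=List path=data/`.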
## Are these changes tested?
Yes.
Example output:
```text
DataFusion CLI v50.2.0
> \object_store_profiling trace
ObjectStore Profile mode set to Trace
> CREATE EXTERNAL TABLE nyc_taxi_rides
STORED AS PARQUET
LOCATION 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet';
0 row(s) fetched.
Elapsed 2.679 seconds.
Object Store Profiling
Instrumented Object Store: instrument_mode: Trace, inner: AmazonS3(altinity-clickhouse-data)
2025-10-16T18:53:09.512970085+00:00 operation=List path=nyc_taxi_rides/data/tripdata_parquet
Summaries:
List
count: 1
Instrumented Object Store: instrument_mode: Trace, inner: AmazonS3(altinity-clickhouse-data)
2025-10-16T18:53:09.929709943+00:00 operation=List path=nyc_taxi_rides/data/tripdata_parquet
2025-10-16T18:53:10.106757629+00:00 operation=List path=nyc_taxi_rides/data/tripdata_parquet
2025-10-16T18:53:10.220555058+00:00 operation=Get duration=0.230604s size=8 range: bytes=222192975-222192982 path=nyc_taxi_rides/data/tripdata_parquet/data-200901.parquet
2025-10-16T18:53:10.226399832+00:00 operation=Get duration=0.263826s size=8 range: bytes=233123927-233123934 path=nyc_taxi_rides/data/tripdata_parquet/data-201104.parquet
2025-10-16T18:53:10.226194195+00:00 operation=Get duration=0.269754s size=8 range: bytes=252843253-252843260 path=nyc_taxi_rides/data/tripdata_parquet/data-201103.parquet
. . .
2025-10-16T18:53:11.928787014+00:00 operation=Get duration=0.072248s size=18278 range: bytes=201384109-201402386 path=nyc_taxi_rides/data/tripdata_parquet/data-201509.parquet
2025-10-16T18:53:11.933475464+00:00 operation=Get duration=0.068880s size=17175 range: bytes=195411804-195428978 path=nyc_taxi_rides/data/tripdata_parquet/data-201601.parquet
2025-10-16T18:53:11.949629591+00:00 operation=Get duration=0.065645s size=19872 range: bytes=214807880-214827751 path=nyc_taxi_rides/data/tripdata_parquet/data-201603.parquet
Summaries:
List
count: 2
Get
count: 288
duration min: 0.060930s
duration max: 0.444601s
duration avg: 0.133339s
size min: 8 B
size max: 44247 B
size avg: 18870 B
size sum: 5434702 B
>
```
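
The `Summaries` blocks in the output above report only a count for `List`, while `Get` also gets min, max, avg, and sum lines. A rough sketch of that kind of aggregation follows, under stated assumptions: `Stats`, `OperationSummary`, and `summarize` are hypothetical names, not the PR's actual types, and the use of integer division for the average is a guess (the reported `size avg: 18870 B` is consistent with 5434702 / 288 ≈ 18870.49 either truncated or rounded). Only size is summarized here; duration would follow the same shape.

```rust
/// Aggregate statistics over the requests that reported a size.
#[derive(Debug, PartialEq)]
struct Stats {
    min: u64,
    max: u64,
    avg: u64,
    sum: u64,
}

/// Summary for one operation type. `size` is `None` when no request
/// carried a size, as is the case for `List` in the output above.
#[derive(Debug, PartialEq)]
struct OperationSummary {
    count: usize,
    size: Option<Stats>,
}

/// Builds a summary from per-request sizes (`None` = size not recorded).
fn summarize(sizes: &[Option<u64>]) -> OperationSummary {
    // Keep only the requests that actually reported a size.
    let known: Vec<u64> = sizes.iter().flatten().copied().collect();
    let size = if known.is_empty() {
        None
    } else {
        let sum: u64 = known.iter().sum();
        Some(Stats {
            min: *known.iter().min().unwrap(),
            max: *known.iter().max().unwrap(),
            // Assumption: integer division, so the average is truncated.
            avg: sum / known.len() as u64,
            sum,
        })
    };
    OperationSummary { count: sizes.len(), size }
}
```

Feeding this the six `Get` sizes visible in the trace excerpt (8, 8, 8, 18278, 17175, 19872) gives a count of 6 and a sum of 55349 B, while two size-less `List` entries give a count of 2 and no size statistics.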
## Are there any user-facing changes?
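No-ish: the only visible difference is that `List` operations now appear in
the CLI's object store profiling output, as in the example above.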
cc @alamb