alamb opened a new issue, #16302: URL: https://github.com/apache/datafusion/issues/16302
### Is your feature request related to a problem or challenge?

- part of https://github.com/apache/datafusion/issues/13456
- related to https://github.com/apache/datafusion/issues/16299

I would like to make querying files from remote stores easy and a great experience in DataFusion, and `datafusion-cli` in particular.

While testing https://github.com/apache/datafusion/pull/16300, I tried this command:

```shell
datafusion-cli
```

```sql
> CREATE EXTERNAL TABLE nyc_taxi_rides STORED AS PARQUET LOCATION 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet';
Object Store error: Object at location nyc_taxi_rides/data/tripdata_parquet not found: Error performing HEAD https://s3.us-east-1.amazonaws.com/altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet in 142.679833ms - Server returned non-2xx status code: 404 Not Found:
```

This confused me for quite a while, as that is a valid URL (prefix).

The issue is that the URL `'s3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet'` does not end in a `/`.

If you add a `/` it then works great:

```
> CREATE EXTERNAL TABLE nyc_taxi_rides STORED AS PARQUET LOCATION 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet/';
0 row(s) fetched.
Elapsed 1.624 seconds.
```

BTW this is inconsistent with a local file system, where selecting from a directory whose path doesn't end in a `/` works just fine:

```sql
-- Write data to `foo` directory:
> copy (values(1)) to 'foo/1.parquet';
+-------+
| count |
+-------+
| 1     |
+-------+
1 row(s) fetched.
Elapsed 0.044 seconds.

-- Note the location doesn't end in `/` but it works fine
> create external table foo stored as parquet location 'foo';
0 row(s) fetched.
Elapsed 0.022 seconds.

> select * from foo;
+---------+
| column1 |
+---------+
| 1       |
+---------+
1 row(s) fetched.
Elapsed 0.132 seconds.
```

### Describe the solution you'd like

I would like this to be less confusing.

### Describe alternatives you've considered

# Alternate 1: Better Error Message

At the very least we can make the message more explicit ("Not found. Hint: if it is a directory the path should end with `/`").

# Alternate 2: Preferred

It would be even better to automatically add a `/` to the path if the first one was not found, and try again.

I think the trick will be to figure out at what level we should try to add `/` (probably when first creating the ListingTable?)

### Additional context

_No response_
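
To make Alternate 2 concrete, with the hint from Alternate 1 as the fallback, here is a minimal Rust sketch of the retry logic. It is only an illustration under stated assumptions: `resolve_location` and the `exists` callback are hypothetical names invented for this example, not DataFusion or `object_store` APIs, and `exists` stands in for whatever HEAD or LIST request the real code would issue.

```rust
/// Hypothetical helper (not a DataFusion API): decide which location to use
/// for a table, retrying with a trailing `/` when the first probe fails.
///
/// `exists` is an assumed callback that reports whether the store has
/// something at the given location (an object, or a non-empty prefix).
fn resolve_location<F>(location: &str, exists: F) -> Result<String, String>
where
    F: Fn(&str) -> bool,
{
    // First attempt: use the location exactly as the user wrote it.
    if exists(location) {
        return Ok(location.to_string());
    }

    // Second attempt (Alternate 2): treat it as a directory-like prefix.
    if !location.ends_with('/') {
        let with_slash = format!("{}/", location);
        if exists(&with_slash) {
            return Ok(with_slash);
        }
    }

    // Both attempts failed: report the more explicit message (Alternate 1).
    Err(format!(
        "Object at location {} not found. Hint: if it is a directory the path should end with `/`",
        location
    ))
}
```

With a fake store that only knows `s3://bucket/data/`, calling `resolve_location("s3://bucket/data", ...)` returns the slash-terminated location, while an unknown path returns the error carrying the hint. Where such a retry should live (for example, when first creating the ListingTable) is left open, as in the issue text.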