alamb opened a new issue, #16299:
URL: https://github.com/apache/datafusion/issues/16299

   ### Is your feature request related to a problem or challenge?
   
   Some public S3 buckets, such as the ClickBench public datasets bucket, do 
not require authentication.
   
   Other engines like ClickHouse allow you to access these without providing 
any credentials: 
https://clickhouse.com/docs/engines/table-engines/integrations/s3
   
   ```sql
   CREATE TABLE s3_engine_table (name String, value UInt32)
       ENGINE=S3('s3://clickhouse-public-datasets/hits_compatible/hits.parquet', 'CSV', 'gzip')
   ```
   
   However, datafusion-cli requires you to provide credentials in this case:
   
   ```shell
   datafusion-cli
   ```
   
   ```sql
   DataFusion CLI v47.0.0
   > CREATE EXTERNAL TABLE hits
   STORED AS PARQUET LOCATION 's3://clickhouse-public-datasets/hits_compatible/hits.parquet'
   OPTIONS(aws.region 'eu-west-1');
   Object Store error: Generic S3 error: the credential provider was not enabled
   ```
   
   
   ### Describe the solution you'd like
   
   I would like the ability to access the public datasets without providing 
credentials
   
   This is supported via this setting in the underlying builder: 
https://docs.rs/object_store/0.12.0/object_store/aws/struct.AmazonS3Builder.html#method.with_skip_signature
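
To make the request concrete, here is a minimal sketch of how the proposed `aws.skip_signature` option might be parsed before being handed to the builder. `S3Options` and `parse_s3_options` are hypothetical names used only for illustration and are not datafusion-cli's actual types. The only real API referenced is `AmazonS3Builder::with_skip_signature`, which appears in a comment so the sketch stays self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of the S3 settings collected from
/// `OPTIONS(...)`. This is not the real datafusion-cli type.
#[derive(Debug, Default, PartialEq)]
struct S3Options {
    region: Option<String>,
    skip_signature: bool,
}

/// Parse `aws.*` key/value pairs into `S3Options`.
/// `aws.skip_signature` is the option name proposed in this issue.
fn parse_s3_options(options: &HashMap<String, String>) -> Result<S3Options, String> {
    let mut parsed = S3Options::default();
    for (key, value) in options {
        match key.as_str() {
            "aws.region" => parsed.region = Some(value.clone()),
            "aws.skip_signature" => {
                // Accept `true` / `false` in any letter case.
                parsed.skip_signature = value
                    .trim()
                    .to_ascii_lowercase()
                    .parse::<bool>()
                    .map_err(|_| format!("invalid boolean for {key}: {value}"))?;
            }
            other => return Err(format!("unsupported option: {other}")),
        }
    }
    Ok(parsed)
}

fn main() {
    let mut options = HashMap::new();
    options.insert("aws.region".to_string(), "eu-central-1".to_string());
    options.insert("aws.skip_signature".to_string(), "true".to_string());

    let parsed = parse_s3_options(&options).expect("options should parse");
    // With the object_store crate, the flag would then be forwarded roughly as
    // `AmazonS3Builder::new().with_skip_signature(parsed.skip_signature)`.
    println!("{parsed:?}");
}
```

Defaulting `skip_signature` to `false` keeps today's behavior for anyone who already passes credentials; whether unsigned access should also be attempted automatically when no credentials are found is a separate design question.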
   
   
   ### Describe alternatives you've considered
   
   I would like to be able to do:
   
   ```sql
   > CREATE EXTERNAL TABLE hits
   STORED AS PARQUET LOCATION 's3://clickhouse-public-datasets/hits_compatible/hits.parquet'
   OPTIONS(aws.skip_signature true, aws.region 'eu-central-1');
   ```
   
   And maybe also this (without any signature option at all):
   ```sql
   > CREATE EXTERNAL TABLE hits
   STORED AS PARQUET LOCATION 's3://clickhouse-public-datasets/hits_compatible/hits.parquet'
   OPTIONS(aws.region 'eu-central-1');
   ```
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

