Carl Boettiger created ARROW-16620:
--------------------------------------

             Summary: open_dataset fails to open single compressed csv
                 Key: ARROW-16620
                 URL: https://issues.apache.org/jira/browse/ARROW-16620
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Carl Boettiger

The following fails:

{code:java}
bucket <- s3_bucket("targets/aquatics", endpoint_override="data.ecoforecast.org")
x <- open_dataset(bucket$path("aquatics-targets.csv.gz"), format="csv")
{code}

This is surprising since pointing to an individual parquet file path is fine:

{code:java}
bucket <- s3_bucket("scores/parquet/aquatics/2022", endpoint_override="data.ecoforecast.org")
x <- open_dataset(bucket$path("aquatics-2022-05-18-climatology.parquet"))
{code}

Maybe related to discussion in https://issues.apache.org/jira/browse/ARROW-15060 or maybe not? In this context I'm thinking only about read.

The above examples use public buckets so should be reproducible with no credentials.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)