[ https://issues.apache.org/jira/browse/ARROW-16680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17632484#comment-17632484 ]
Carl Boettiger commented on ARROW-16680:
----------------------------------------

Wow, thanks Dewey! That looks like black magic to me, but I can definitely
confirm that it works! I'm still a bit stuck on the right thing to do in cases
where we are providing user-facing packages that rely on arrow functions to
access large external data. Like you say, I don't mind doing this in my
scripts, but it seems poor form to invisibly impose this on users, where it
may have side effects with their other stuff?

> [R] Weird R error: Error in
> fs___FileSystem__GetTargetInfos_FileSelector(self, x) : ignoring SIGPIPE
> signal
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-16680
>                 URL: https://issues.apache.org/jira/browse/ARROW-16680
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 8.0.0
>            Reporter: Carl Boettiger
>            Priority: Major
>
> Okay, apologies, this is a bit of a weird error, but it is annoying the heck
> out of me. The following block of all-R code, when run with Rscript (or
> embedded into any form of Rmd, quarto, or knitr doc), produces the error below
> (at least most of the time):
>
> {code:java}
> library(arrow)
> library(dplyr){code}
> {code:java}
> Sys.setenv(AWS_EC2_METADATA_DISABLED = "TRUE")
> Sys.unsetenv("AWS_ACCESS_KEY_ID")
> Sys.unsetenv("AWS_SECRET_ACCESS_KEY")
> Sys.unsetenv("AWS_DEFAULT_REGION")
> Sys.unsetenv("AWS_S3_ENDPOINT")
> s3 <- arrow::s3_bucket(bucket = "scores/parquet",
>                        endpoint_override = "data.ecoforecast.org")
> ds <- arrow::open_dataset(s3, partitioning = c("theme", "year"))
> ds |> dplyr::filter(theme == "phenology") |> dplyr::collect()
> {code}
> Gives the error:
>
> {code:java}
> Error in fs___FileSystem__GetTargetInfos_FileSelector(self, x) :
>   ignoring SIGPIPE signal
> Calls: %>% ... <Anonymous> -> fs___FileSystem__GetTargetInfos_FileSelector
> {code}
> But only when run as a script!
> When run interactively in an R console, this code runs just fine. Even as a
> script the code seems to run fine, but it erroneously seems to be raising
> this SIGPIPE, which I don't understand.
> If the script is executed with littler
> ([https://dirk.eddelbuettel.com/code/littler.html]) then it runs fine, since
> littler handles SIGPIPE but Rscript doesn't. But I have no idea why the above
> code throws a pipe in the first place. Worse, if I choose a different filter
> for the above, like "aquatics", it (usually) works without the error.
> I have no idea why `fs___FileSystem__GetTargetInfos_FileSelector` results in
> this, but would really appreciate any hints on how to avoid this, as it makes
> it very hard to use arrow in workflows right now!
>
> thanks for all you do!
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)