[ 
https://issues.apache.org/jira/browse/ARROW-16680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17632484#comment-17632484
 ] 

Carl Boettiger commented on ARROW-16680:
----------------------------------------

Wow, thanks Dewey!  That looks like black magic to me but I can definitely 
confirm that it works!

 

Still a bit stuck on the right thing to do in cases where we are providing 
user-facing packages that rely on arrow functions to access large external 
data. Like you say, I don't mind doing this in my own scripts, but it seems 
poor form to invisibly impose this on users, where it may have side effects 
on their other code?
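
One pattern that might address the "invisible" part is to make the workaround opt-in through an R option, so a package never applies it unless the user asks. The sketch below is only an illustration under assumptions: the option name `mypkg.ignore_sigpipe` and both function names are hypothetical, and `apply_sigpipe_workaround()` is a placeholder that does not touch any signal handler.

```r
# Hypothetical placeholder for whatever SIGPIPE workaround is used.
# In this sketch it only emits a message; it changes no signal handling.
apply_sigpipe_workaround <- function() {
  message("applying SIGPIPE workaround")
  invisible(TRUE)
}

# Apply the workaround only if the user has opted in.
# The option name "mypkg.ignore_sigpipe" is made up for this example.
maybe_apply_workaround <- function() {
  if (isTRUE(getOption("mypkg.ignore_sigpipe", default = FALSE))) {
    apply_sigpipe_workaround()
  } else {
    invisible(FALSE)
  }
}
```

A user who wants the behaviour would set `options(mypkg.ignore_sigpipe = TRUE)` before calling into the package; everyone else is left untouched, at the cost of still seeing the error by default.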

> [R] Weird R error: Error in 
> fs___FileSystem__GetTargetInfos_FileSelector(self, x) :    ignoring SIGPIPE 
> signal
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-16680
>                 URL: https://issues.apache.org/jira/browse/ARROW-16680
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 8.0.0
>            Reporter: Carl Boettiger
>            Priority: Major
>
> Okay apologies, this is a bit of a weird error but is annoying the heck out 
> of me.  The following block of R code, when run with Rscript (or embedded 
> in any form of Rmd, quarto, or knitr doc), produces the error below (at 
> least most of the time):
>  
> {code:java}
> library(arrow)
> library(dplyr)
> {code}
> {code:java}
> Sys.setenv(AWS_EC2_METADATA_DISABLED = "TRUE")
> Sys.unsetenv("AWS_ACCESS_KEY_ID")
> Sys.unsetenv("AWS_SECRET_ACCESS_KEY")
> Sys.unsetenv("AWS_DEFAULT_REGION")
> Sys.unsetenv("AWS_S3_ENDPOINT")
> s3 <- arrow::s3_bucket(bucket = "scores/parquet",
>                        endpoint_override = "data.ecoforecast.org")
> ds <- arrow::open_dataset(s3, partitioning = c("theme", "year"))
> ds |> dplyr::filter(theme == "phenology") |> dplyr::collect()
> {code}
> Gives the error
>  
>  
> {code:java}
> Error in fs___FileSystem__GetTargetInfos_FileSelector(self, x) : 
>   ignoring SIGPIPE signal
> Calls: %>% ... <Anonymous> -> fs___FileSystem__GetTargetInfos_FileSelector 
> {code}
> But only when run as a script! When run interactively in an R console, this 
> code runs just fine.  Even as a script the code itself seems to run fine, 
> but it erroneously seems to raise this SIGPIPE, which I don't understand.  
> If the script is executed with littler 
> ([https://dirk.eddelbuettel.com/code/littler.html]) then it runs fine, since 
> littler handles SIGPIPE but Rscript doesn't.  But I have no idea why the 
> above code triggers a SIGPIPE in the first place.  Worse, if I choose a 
> different filter for the above, like "aquatics", it (usually) works without 
> the error.  
> I have no idea why `fs___FileSystem__GetTargetInfos_FileSelector` results in 
> this, but would really appreciate any hints on how to avoid this as it makes 
> it very hard to use arrow in workflows right now! 
>  
> thanks for all you do!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
