Keith Hughitt created ARROW-7825:
------------------------------------

             Summary: Have arrow::read_parquet respect options(stringsAsFactors = FALSE)
                 Key: ARROW-7825
                 URL: https://issues.apache.org/jira/browse/ARROW-7825
             Project: Apache Arrow
          Issue Type: Improvement
          Components: R
    Affects Versions: 0.16.0
         Environment: Linux 64-bit 5.4.15
            Reporter: Keith Hughitt

Same issue as reported for feather::read_feather (https://issues.apache.org/jira/browse/ARROW-7823).

In the R arrow package, the "read_parquet()" function currently does not respect "options(stringsAsFactors = FALSE)": string columns written from factors are read back as factors, whereas "read.delim()" and "readr::read_tsv()" return character vectors for the same data. This leads to unexpected and inconsistent behavior.

*Example:*
{code:java}
library(arrow)
library(readr)

options(stringsAsFactors = FALSE)

write_tsv(head(iris), 'test.tsv')
write_parquet(head(iris), 'test.parquet')

head(read.delim('test.tsv', sep='\t')$Species)
# [1] "setosa" "setosa" "setosa" "setosa" "setosa" "setosa"

head(read_tsv('test.tsv', col_types = cols())$Species)
# [1] "setosa" "setosa" "setosa" "setosa" "setosa" "setosa"

head(read_parquet('test.parquet')$Species)
# [1] setosa setosa setosa setosa setosa setosa
# Levels: setosa versicolor virginica
{code}

*Versions:*
- R 3.6.2
- arrow_0.15.1.9000
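
*Possible workaround:*

Until read_parquet() honors the option, factor columns can be converted back to character after reading. The following is only a minimal sketch in base R, not a proposed fix for arrow itself; "factors_to_character" is a hypothetical helper name, and head(iris) stands in for the data frame returned by read_parquet() so that the snippet runs without any files.

```r
# Hypothetical helper (not part of arrow): turn every factor column of a
# data frame into a character column, leaving all other columns untouched.
factors_to_character <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) as.character(col) else col
  })
  df
}

# Stand-in for the result of read_parquet('test.parquet'):
# Species is a factor here, just as in the report above.
dat <- factors_to_character(head(iris))

class(dat$Species)
# [1] "character"
```

Assigning into df[] keeps the original class and attributes of the data frame, so the same helper should also work on the tibble that read_parquet() returns, although that has not been verified here.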