[ https://issues.apache.org/jira/browse/ARROW-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rok Mihevc updated ARROW-4512:
------------------------------
    External issue URL: https://github.com/apache/arrow/issues/21063

> [R] Stream reader/writer API that takes socket stream
> -----------------------------------------------------
>
>                 Key: ARROW-4512
>                 URL: https://issues.apache.org/jira/browse/ARROW-4512
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>    Affects Versions: 0.12.0, 0.14.1, 1.0.0
>            Reporter: Hyukjin Kwon
>            Assignee: Dewey Dunnington
>            Priority: Major
>             Fix For: 8.0.0
>
>
> I have been working on Spark integration with Arrow.
> I realised that there is no way to use a socket connection with the Arrow stream format. For instance, I want to do something like:
> {code}
> connStream <- socketConnection(port = 9999, blocking = TRUE, open = "wb")
> rdf_slices <- # a list of data frames.
> stream_writer <- NULL
> tryCatch({
>   for (rdf_slice in rdf_slices) {
>     batch <- record_batch(rdf_slice)
>     if (is.null(stream_writer)) {
>       # Here, it looks like there is no way to use the socket connection.
>       stream_writer <- RecordBatchStreamWriter(connStream, batch$schema)
>     }
>     stream_writer$write_batch(batch)
>   }
> },
> finally = {
>   if (!is.null(stream_writer)) {
>     stream_writer$close()
>   }
> })
> {code}
> Likewise, I cannot find a way to iterate over the stream batch by batch:
> {code}
> RecordBatchStreamReader(connStream)$batches()  # Here, it looks like there is no way to use the socket connection.
> {code}
> This looks easily possible on the Python side but appears to be missing from the R APIs.
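For reference, a minimal sketch of the Python-side equivalent alluded to above, using pyarrow's IPC stream API over a plain socket. The host, port, sample data, and the assumption that a peer process is listening on the other end are illustrative only:

{code}
import socket

import pyarrow as pa

# Illustrative data; in the Spark integration this would be one slice of the data frame.
batch = pa.RecordBatch.from_pydict({"x": [1, 2, 3]})

# Writer side: wrap the socket as a writable file object and stream batches into it.
sock = socket.create_connection(("localhost", 9999))   # assumes a listening peer
sink = sock.makefile(mode="wb")
writer = pa.ipc.new_stream(sink, batch.schema)
writer.write_batch(batch)
writer.close()
sink.flush()

# Reader side (in the peer process): wrap its socket the same way and iterate batch by batch.
# peer_source = peer_sock.makefile(mode="rb")           # peer_sock is hypothetical here
# for received in pa.ipc.open_stream(peer_source):
#     print(received.num_rows)
{code}

The request is for the R bindings to accept a socketConnection() (or an equivalent InputStream/OutputStream wrapper) in the same places.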