> would you be open to working within an R/ subdirectory in the Arrow codebase?
Sure, I'll do whatever is most convenient for the team. A branch sounds fine. Here's one from my end:

https://github.com/clarkfitzg/arrow/tree/R/R

Thanks for the pointers and encouragement.

On Thu, Jul 27, 2017 at 5:49 PM, Wes McKinney <wesmck...@gmail.com> wrote:

> hi Clark,
>
> Cool! Before you go too far down the rabbit hole, would you be open to
> working within an R/ subdirectory in the Arrow codebase? It doesn't
> have to be ready-to-ship software, and we are happy to set up a branch
> in the repository for you to experiment so you don't have to worry
> about bothering the master branch or breaking builds. Otherwise
> importing your work into the project later will become more
> complicated and require the Arrow PMC to do some paperwork:
> http://incubator.apache.org/ip-clearance/ .
>
> I am happy to be available to answer questions on the mailing list, or
> offline, or discussions in JIRA or on GitHub pull requests. I am sure
> that Uwe and the other C++ developers will be happy to be available.
>
> To get some basics off the ground, the essentials are being able to
> convert one or more record batches into an R data frame, and back.
> This is what we did in
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/arrow_to_pandas.h
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/pandas_to_arrow.h
>
> We have thin bindings in Cython (which is similar to Rcpp) that make
> this callable from Python.
>
> What Hadley and I put together quickly for Feather last year was
> effectively a single Arrow record batch converting to and from pandas
> or R data frames. In Arrow, in practice you may be working with a
> table in many smaller chunks.
>
> Looking forward to getting this off the ground!
>
> Thanks,
> Wes
>
> On Thu, Jul 27, 2017 at 7:40 PM, Clark Fitzgerald <clarkfi...@gmail.com> wrote:
> > I've got at least a "hello world" for R / Arrow bindings in progress.
> > https://github.com/clarkfitzg/Rarrow
> >
> > Over the next couple weeks I plan to spend some time looking at the Arrow
> > C++ and Python sources and write a few bindings by hand, then think about
> > how to automatically generate bindings from the C++. Several approaches are
> > possible, Rffi / rdyncall, Rcpp modules, or RCodegen / RCIndex leveraging
> > Clang. Not sure which, if any, will work.
> >
> > I'm a beginner in C++. It would be very helpful if someone was available to
> > answer questions on the C++ Arrow codebase, since I'd rather not email the
> > whole dev list for this.
> >
> > Thanks,
> > Clark