Hey all,

I've been working on a use case where I need to use the Dataset API from Go. Rather than port all of it to Go (which would require porting the Compute side too), I built a proof of concept that uses CGO to call into the existing C++ code, much like the Java solution uses JNI for the same thing. Now that I've proven to myself that it works, I have a question that seemed best suited to this mailing list.
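For context, CGO can only call C functions, so any approach like this needs a thin extern "C" layer that hides the C++ objects behind opaque handles. Below is a rough sketch of that general shape, not the actual proof-of-concept code. Every name in it (FakeDataset, DatasetHandle, dataset_open, dataset_num_files, dataset_close) is made up for illustration, and FakeDataset is only a stand-in for a real C++ dataset object, not Arrow's API.

```cpp
// Sketch only: all names below are hypothetical and are not part of Arrow.
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

namespace {

// Hypothetical stand-in for a C++ dataset object living behind the C API.
class FakeDataset {
 public:
  explicit FakeDataset(std::string uri) : uri_(std::move(uri)) {
    // Pretend that discovery found two files under the given URI.
    files_.push_back("part-0");
    files_.push_back("part-1");
  }

  int64_t num_files() const { return static_cast<int64_t>(files_.size()); }

 private:
  std::string uri_;
  std::vector<std::string> files_;
};

}  // namespace

extern "C" {

// Opaque handle: C (and therefore CGO) callers never see the C++ type.
typedef struct DatasetHandle DatasetHandle;

// Returns a new handle, or NULL on failure. No exception crosses the boundary.
DatasetHandle* dataset_open(const char* uri) {
  if (uri == nullptr) return nullptr;
  try {
    return reinterpret_cast<DatasetHandle*>(new FakeDataset(uri));
  } catch (...) {
    return nullptr;
  }
}

// Returns the number of files in the dataset, or -1 for a NULL handle.
int64_t dataset_num_files(const DatasetHandle* handle) {
  if (handle == nullptr) return -1;
  return reinterpret_cast<const FakeDataset*>(handle)->num_files();
}

// Releases a handle obtained from dataset_open. Passing NULL is a no-op.
void dataset_close(DatasetHandle* handle) {
  delete reinterpret_cast<FakeDataset*>(handle);
}

}  // extern "C"
```

The Go side would declare these three functions in its cgo preamble and link against whichever library exports them, which is exactly why it matters where that library lives.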
CGO only needs a C API to be exposed, and while there is a C Data Interface for Arrow data, there is currently no C interface for the Dataset API. So the big question is: if I wanted to contribute this work to the Arrow repo, should a C interface for the Dataset API live in a separate directory and be a separate build artifact, like the JNI interface, or should it be added directly to, and exported from, the Dataset library?

It's an organizational question, because either way the C interface has to be available wherever the Go code that calls it gets built. The difference is between needing just libarrow_dataset.so (and its dependencies) and needing that *and* libarrow_dataset_cgo.so/.a, etc.

I'm curious what everyone's opinions are, so I can get an idea of which direction to go before putting a PR together.

Thanks everyone!

--Matt Topol