In ARROW-8301 [1] and elsewhere we've been discussing how to
communicate what amounts to a sequence of arrays or a sequence of
RecordBatch objects using the C data interface.

Example use cases:

* Returning a sequence of record / row batches from a database driver
* Sending a C++ arrow::ChunkedArray or arrow::Table to a consumer
  using only the C interface

Applications could define their own custom iterator interfaces to
communicate a sequence of ArrowArray C interface objects, but this is
likely a common enough use case to merit an off-the-shelf solution
that we can support in our reference libraries (e.g. Arrow C++,
pyarrow, Arrow R).

I suggested a C structure as follows:

struct ArrowArrayStream {
  void (*get_schema)(struct ArrowSchema*);
  // Non-zero return value indicates an error?
  int (*get_next)(struct ArrowArray*);
  void (*get_error)(... ERROR HANDLING TODO );
  void (*release)(struct ArrowArrayStream*);
  void* private_data;
};

The producer would populate this object with pointers to its
implementations of these functions.
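
To make the producer/consumer split concrete, here is a minimal
sketch, not a settled design. It assumes a few things the struct above
does not say: each callback receives the stream itself as its first
argument (so it can reach private_data), a zero return value means
success, and end of stream is signaled by handing back an array whose
release pointer is NULL. SketchArrayStream, CounterState and the
function names are hypothetical, get_schema and get_error are left
out, and the arrays are placeholders that carry only a length and no
buffers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* ArrowArray as laid out by the Arrow C data interface. */
struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* ASSUMPTION: a variant of the proposed struct in which every callback
   takes the stream, so it can reach private_data. */
struct SketchArrayStream {
  int (*get_next)(struct SketchArrayStream*, struct ArrowArray* out);
  void (*release)(struct SketchArrayStream*);
  void* private_data;
};

/* Hypothetical producer state: how many placeholder arrays are left. */
struct CounterState {
  int64_t remaining;
  int64_t batch_length;
};

static void release_array(struct ArrowArray* array) {
  array->release = NULL; /* nothing was allocated; just mark it released */
}

static int counter_get_next(struct SketchArrayStream* stream,
                            struct ArrowArray* out) {
  struct CounterState* state = stream->private_data;
  *out = (struct ArrowArray){0};
  /* ASSUMPTION: a released array (release == NULL) means end of stream. */
  if (state->remaining == 0) return 0;
  state->remaining--;
  out->length = state->batch_length;
  out->null_count = state->batch_length;
  out->release = release_array;
  return 0;
}

static void counter_release(struct SketchArrayStream* stream) {
  free(stream->private_data);
  stream->release = NULL;
}

/* Producer side: fill the struct with pointers to its implementations. */
static int make_counter_stream(struct SketchArrayStream* out,
                               int64_t n_batches, int64_t batch_length) {
  struct CounterState* state = malloc(sizeof *state);
  if (state == NULL) return -1;
  state->remaining = n_batches;
  state->batch_length = batch_length;
  out->get_next = counter_get_next;
  out->release = counter_release;
  out->private_data = state;
  return 0;
}

/* Consumer side: drain the stream and return the summed array lengths,
   or -1 if the stream could not be created or get_next reported an error. */
int64_t sketch_total_length(int64_t n_batches, int64_t batch_length) {
  struct SketchArrayStream stream;
  if (make_counter_stream(&stream, n_batches, batch_length) != 0) return -1;
  assert(stream.release != NULL);
  int64_t total = 0;
  for (;;) {
    struct ArrowArray array;
    if (stream.get_next(&stream, &array) != 0) { total = -1; break; }
    if (array.release == NULL) break; /* end of stream */
    total += array.length;
    array.release(&array); /* the consumer owns each array it receives */
  }
  stream.release(&stream);
  return total;
}
```

The consumer only ever touches the function pointers and the arrays
it is handed, so it needs no knowledge of how the producer is
implemented. How errors should be reported beyond the return code is
still the open question marked TODO above.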

Thoughts about this?

Thanks,
Wes

[1]: https://issues.apache.org/jira/browse/ARROW-8301
