Hi all,

I am helping resolve this GH issue [1] with this PR [2]. The user wants to call the `CRecordBatch.column_data()` method from Cython to access the underlying `CArrayData`, but `column_data()` is not currently exposed on `CRecordBatch`. There is a workaround for reaching the `CArrayData` [3]. Nevertheless, I think exposing `column_data()` is reasonable: `CArray` already has a `data()` method that gives access to `CArrayData`, so this PR would give `CRecordBatch` the same access in a more direct manner. `CRecordBatch` also already exposes several other public methods for accessing underlying structures, such as the schema and the columns.
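
For concreteness, here is a minimal Cython sketch of the two access paths. It assumes the workaround in [3] amounts to going through the `CArray` returned by `column(i)`. The helper name `column_data_via_array` is hypothetical. The commented-out call shows what the PR would allow, assuming it declares `column_data(int i)` as returning `shared_ptr[CArrayData]`, mirroring the C++ `RecordBatch::column_data(int)`; the exact signature in the PR may differ.

```cython
# distutils: language = c++
from libcpp.memory cimport shared_ptr
from pyarrow.includes.libarrow cimport CArray, CArrayData, CRecordBatch


cdef shared_ptr[CArrayData] column_data_via_array(
        shared_ptr[CRecordBatch] batch, int i):
    # Hypothetical helper: today's route, via the CArray wrapper.
    cdef shared_ptr[CArray] arr = batch.get().column(i)
    return arr.get().data()

# With the PR applied (assumed signature), the same lookup would be:
#     batch.get().column_data(i)
```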

I would appreciate comments or suggestions on whether this solution is acceptable. This relates to the larger question of how much of the public C++ API should be exposed in Cython.

~Eduardo

[1] https://github.com/apache/arrow/issues/11523
[2] https://github.com/apache/arrow/pull/11527
[3] https://github.com/apache/arrow/pull/11527#issuecomment-949994868

On Wed, Aug 25, 2021 at 12:15 PM Antoine Pitrou <anto...@python.org> wrote:
>
> On 25/08/2021 at 17:27, Joris Van den Bossche wrote:
> >
> > https://github.com/rapidsai/cudf/blob/be25a30ca20f3135f341c51b36cb075b376d5def/python/cudf/cudf/_lib/cpp/io/types.pxd#L9
> >
> > Here they are doing `from pyarrow.includes.libarrow cimport
> > CRandomAccessFile` (CRandomAccessFile is the cython equivalent of a public
> > C++ API, and thus also public?), but would we recommend `from pyarrow.lib
> > cimport CRandomAccessFile` instead?
> > Although for imports from `pyarrow.includes.libarrow_cuda` that would not
> > be possible.
>
> Ah, that's a good point! Then we can make it official to use
> `pyarrow.includes.*` when importing C++ APIs.
>
> Regards
>
> Antoine.
>