Hi all,

I am helping resolve this GH issue [1] with this PR [2]. The user wants
to call the `CRecordBatch.column_data()` method from Cython to access the
underlying `CArrayData`, but `column_data()` is not exposed in
`CRecordBatch`. There is a workaround to access the `CArrayData` [3].
Nevertheless, my reasoning for exposing `column_data()` is that the
`CArray` type already has a `data()` method that gives access to
`CArrayData`, so this PR would give `CRecordBatch` the same access in a
more direct manner. Also, `CRecordBatch` already exposes several other
public methods to access underlying structures, such as the schema and
the columns.
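
For illustration, here is a minimal sketch of what the exposure could look
like in the Cython declarations. It is an abridged, assumed view of
`pyarrow/includes/libarrow.pxd`; the `column_data(int i)` signature is
inferred from the C++ `arrow::RecordBatch` API and may differ from what the
PR actually adds.

```cython
# Sketch only: an abridged, assumed view of pyarrow/includes/libarrow.pxd.
# The column_data() signature is an assumption based on the C++
# arrow::RecordBatch API, not copied from the PR.
from libcpp.memory cimport shared_ptr

cdef extern from "arrow/api.h" namespace "arrow" nogil:
    cdef cppclass CArrayData" arrow::ArrayData":
        pass

    cdef cppclass CArray" arrow::Array":
        # Already exposed: access to the underlying ArrayData.
        shared_ptr[CArrayData] data()

    cdef cppclass CRecordBatch" arrow::RecordBatch":
        # Already exposed: access to a single column as an Array.
        shared_ptr[CArray] column(int i)
        # Proposed addition: direct access to a column's ArrayData.
        shared_ptr[CArrayData] column_data(int i)
```

With such a declaration, and `batch` standing for a hypothetical
`shared_ptr[CRecordBatch]`, a call like `batch.get().column_data(i)` could
replace an indirect route such as `batch.get().column(i).get().data()`.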

I would appreciate comments and suggestions on whether this solution is
acceptable. This relates to the larger question of how much of the public
C++ API should be exposed in Cython.

~Eduardo

[1] https://github.com/apache/arrow/issues/11523
[2] https://github.com/apache/arrow/pull/11527
[3] https://github.com/apache/arrow/pull/11527#issuecomment-949994868


On Wed, Aug 25, 2021 at 12:15 PM Antoine Pitrou <anto...@python.org> wrote:

>
> On 25/08/2021 at 17:27, Joris Van den Bossche wrote:
> >
> > https://github.com/rapidsai/cudf/blob/be25a30ca20f3135f341c51b36cb075b376d5def/python/cudf/cudf/_lib/cpp/io/types.pxd#L9
> >
> > Here they are doing `from pyarrow.includes.libarrow cimport
> > CRandomAccessFile` (CRandomAccessFile is the cython equivalent of a
> > public C++ API, and thus also public?), but would we recommend
> > `from pyarrow.lib cimport CRandomAccessFile` instead?
> > Although for imports from `pyarrow.includes.libarrow_cuda` that would not
> > be possible.
>
> Ah, that's a good point!  Then we can make it official to use
> `pyarrow.includes.*` when importing C++ APIs.
>
> Regards
>
> Antoine.
>
