Hi Antoine,

This sounds like a reasonable path forward -- thank you for working on
this. I'm at a conference this week, but I will review the PR and give
feedback as soon as I can, probably tomorrow (Friday).

My two main initial questions are:

* Whether it's possible to avoid extra overhead during Buffer
construction to set up these abstractions. I'll look more closely at
the patch.
* The consequences of making data() uncallable on non-CPU buffers.
Perhaps we should simply document the recommended code for invoking
CUDA kernels and other CUDA functions on GPU buffers (or non-CPU
buffers generally). Should there be a data_unsafe() method or similar
(sorry if this is already addressed in the PR)?

- Wes

On Tue, Jan 28, 2020 at 5:51 AM Antoine Pitrou <anto...@python.org> wrote:
>
>
> Hello,
>
> I've submitted a PR which exposes a new C++ abstraction layer:
> https://github.com/apache/arrow/pull/6295
>
> The goal is to allow safe handling of buffers residing on different
> devices (the CPU, a GPU...). The layer exposes two interfaces:
>
> * the Device interface exposes information about a particular
> memory-holding device
> * the MemoryManager allows allocating, copying, reading or writing
> memory located on a particular device
>
> The Buffer API is modified so that calling data() fails on non-CPU
> buffers. A separate address() method returns the buffer address as an
> integer, and is allowed on any buffer.
>
> The API provides convenience functions to view or copy a buffer from one
> device to another. For example, an on-GPU buffer can be copied to the
> CPU, and in some situations a zero-copy CPU view can also be created
> (depending on the GPU capabilities and how the GPU memory was allocated).
>
> An example use in the PR is IPC. On the write side, a new
> SerializeRecordBatch overload takes a MemoryManager argument and is able
> to serialize data either to any kind of memory (CPU, GPU). On the read
> side, ReadRecordBatch now works on any kind of input buffer, and returns
> record batches backed by either CPU or GPU memory.
>
> It introduces a slight complexity in the CUDA namespace, since there are
> both `CudaContext` and `CudaMemoryManager` classes. We could solve this
> by merging the two concepts (but doing so may break compatibility for
> existing users of CUDA).
>
> Regards
>
> Antoine.
