Hi Pearu,

For the moment I recommend using arrow::ipc::SerializeSchema to
serialize the schema to host memory and then copying that memory to
the device:

https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/writer.h#L221

In some cases there isn't much benefit to putting the schema on the
device, and some applications may deal with the schema strictly in CPU
memory (e.g. using POSIX shared memory or Plasma to manage shared
schemas).

One API to copy memory to the device is

https://github.com/apache/arrow/blob/master/cpp/src/arrow/gpu/cuda_memory.h#L68

There are probably some APIs we can add to improve usability for this
procedure.

- Wes

On Mon, Aug 27, 2018 at 12:05 PM, Pearu Peterson <[email protected]> wrote:
> Hi,
>
> I have implemented a function that copies host data (by wrapping it
> in an arrow::Array object) to the GPU device using
> arrow::gpu::SerializeRecordBatch:
>
> ...
> #define MY_COLUMN_SCHEMA(DTYPE) ::arrow::schema({arrow::field("data", DTYPE)})
>
> arrow::Status ToRecordBatch(const my_column* column,
>                             std::shared_ptr<arrow::RecordBatch>* out) {
>   // zero-copy
>   std::shared_ptr<arrow::Array> arr;
>   std::shared_ptr<arrow::DataType> dtype = GetDataType(column);
>   ToArray(column, &arr);
>   *out = arrow::RecordBatch::Make(MY_COLUMN_SCHEMA(dtype), column->size,
>                                   {arr});
>   return arrow::Status::OK();
> }
>
> // Use it on host
> arrow::Status ToDevice(const my_column* column,
>                        std::shared_ptr<arrow::gpu::CudaBuffer>* buffer) {
>   constexpr int kGpuNumber = 0;
>   arrow::gpu::CudaDeviceManager* manager_;
>   std::shared_ptr<arrow::gpu::CudaContext> context_;
>   arrow::gpu::CudaDeviceManager::GetInstance(&manager_);
>   manager_->GetContext(kGpuNumber, &context_);
>   std::shared_ptr<arrow::RecordBatch> batch;
>   auto status = ToRecordBatch(column, &batch);
>   if (!status.ok()) return status;
>   return arrow::gpu::SerializeRecordBatch(*batch, context_.get(), buffer);
> }
>
> To implement the reverse of ToDevice, a schema is needed by
> arrow::gpu::ReadRecordBatch.
>
> Is the schema included in the CudaBuffer object?
> If yes, what would be the easiest way to get it?
> If not, what is the recommended strategy for passing schema + data to
> the GPU device, and back?
>
> Best regards,
> Pearu
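
For illustration only: if the schema bytes are copied to the device as the
reply suggests, one possible way to keep them next to the batch data is a
length-prefixed layout in a single staging buffer. The sketch below is an
assumption, not an Arrow API. PackSchemaAndBatch, UnpackSchemaAndBatch and
the layout itself are hypothetical, and plain byte vectors stand in for the
output of arrow::ipc::SerializeSchema and for the serialized record batch.
It makes no Arrow or CUDA calls and does not handle padding or alignment.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical layout: [8-byte schema length][schema bytes][batch bytes].
// The length is stored in native byte order, which is enough when the same
// machine writes and reads the buffer.
std::vector<uint8_t> PackSchemaAndBatch(const std::vector<uint8_t>& schema_bytes,
                                        const std::vector<uint8_t>& batch_bytes) {
  const uint64_t schema_len = schema_bytes.size();
  std::vector<uint8_t> out(sizeof(schema_len) + schema_bytes.size() +
                           batch_bytes.size());
  uint8_t* p = out.data();
  std::memcpy(p, &schema_len, sizeof(schema_len));
  p += sizeof(schema_len);
  if (!schema_bytes.empty()) {
    std::memcpy(p, schema_bytes.data(), schema_bytes.size());
  }
  p += schema_bytes.size();
  if (!batch_bytes.empty()) {
    std::memcpy(p, batch_bytes.data(), batch_bytes.size());
  }
  return out;
}

// Splits a packed buffer back into schema bytes and batch bytes.
// Returns false if the buffer is too short or the stored length is invalid.
bool UnpackSchemaAndBatch(const std::vector<uint8_t>& packed,
                          std::vector<uint8_t>* schema_bytes,
                          std::vector<uint8_t>* batch_bytes) {
  uint64_t schema_len = 0;
  if (packed.size() < sizeof(schema_len)) return false;
  std::memcpy(&schema_len, packed.data(), sizeof(schema_len));
  if (schema_len > packed.size() - sizeof(schema_len)) return false;
  const uint8_t* schema_begin = packed.data() + sizeof(schema_len);
  const uint8_t* batch_begin = schema_begin + schema_len;
  schema_bytes->assign(schema_begin, batch_begin);
  batch_bytes->assign(batch_begin, packed.data() + packed.size());
  return true;
}
```

In real code the packed bytes would be copied to the device in one transfer
(for example with the copy API linked in the reply), and after copying back
to the host the schema part would be parsed before reading the batch.
Keeping the schema in a separate host-side buffer, as the reply also
mentions, avoids this framing entirely.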
