> -----Original Message-----
> From: Elena Agostini <eagost...@nvidia.com>
> Sent: Tuesday, September 7, 2021 01:23
> To: Wang, Haiyue <haiyue.w...@intel.com>; Jerin Jacob <jerinjac...@gmail.com>
> Cc: NBU-Contact-Thomas Monjalon <tho...@monjalon.net>; Jerin Jacob 
> <jer...@marvell.com>; dpdk-dev
> <dev@dpdk.org>; Stephen Hemminger <step...@networkplumber.org>; David Marchand
> <david.march...@redhat.com>; Andrew Rybchenko 
> <andrew.rybche...@oktetlabs.ru>; Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>; Yigit, Ferruh <ferruh.yi...@intel.com>; 
> techbo...@dpdk.org
> Subject: RE: [dpdk-dev] [RFC PATCH v2 0/7] heterogeneous computing library
> 
> 
> 
> > -----Original Message-----
> > From: Wang, Haiyue <haiyue.w...@intel.com>
> > Sent: Monday, September 6, 2021 7:15 PM
> > To: Elena Agostini <eagost...@nvidia.com>; Jerin Jacob
> > <jerinjac...@gmail.com>
> > Cc: NBU-Contact-Thomas Monjalon <tho...@monjalon.net>; Jerin Jacob
> > <jer...@marvell.com>; dpdk-dev <dev@dpdk.org>; Stephen Hemminger
> > <step...@networkplumber.org>; David Marchand
> > <david.march...@redhat.com>; Andrew Rybchenko
> > <andrew.rybche...@oktetlabs.ru>; Honnappa Nagarahalli
> > <honnappa.nagaraha...@arm.com>; Yigit, Ferruh <ferruh.yi...@intel.com>;
> > techbo...@dpdk.org
> > Subject: RE: [dpdk-dev] [RFC PATCH v2 0/7] heterogeneous computing
> > library
> >
> >
> > > -----Original Message-----
> > > From: Elena Agostini <eagost...@nvidia.com>
> > > Sent: Tuesday, September 7, 2021 00:11
> > > To: Jerin Jacob <jerinjac...@gmail.com>
> > > Cc: Wang, Haiyue <haiyue.w...@intel.com>; NBU-Contact-Thomas
> > Monjalon
> > > <tho...@monjalon.net>; Jerin Jacob <jer...@marvell.com>; dpdk-dev
> > > <dev@dpdk.org>; Stephen Hemminger <step...@networkplumber.org>;
> > David
> > > Marchand <david.march...@redhat.com>; Andrew Rybchenko
> > > <andrew.rybche...@oktetlabs.ru>; Honnappa Nagarahalli
> > > <honnappa.nagaraha...@arm.com>; Yigit, Ferruh
> > > <ferruh.yi...@intel.com>; techbo...@dpdk.org
> > > Subject: RE: [dpdk-dev] [RFC PATCH v2 0/7] heterogeneous computing
> > > library
> > >
> > > > >
> > > > > I'd like to introduce (with a dedicated option) the memory API in
> > > > > testpmd to provide an example of how to TX/RX packets using device
> > > > > memory.
> > > >
> > > > Not sure how, without embedding a sideband communication mechanism,
> > > > it can notify the GPU and get back to the CPU. If you could share an
> > > > example API sequence, that would help us understand the level of
> > > > coupling with testpmd.
> > > >
> > >
> > > There is no need for a communication mechanism here.
> > > Assuming there is no workload processing the network packets (to keep
> > > things simple), the steps are (sketched below):
> > > 1) Create a DPDK mempool backed by device external memory using the
> > >    hcdev (or gpudev) library
> > > 2) Use that mempool to tx/rx/fwd packets
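
A minimal sketch of those two steps, assuming an hcdev/gpudev-style
allocation call: rte_gpu_mem_alloc(), its signature and the surrounding
variables (nb_mbufs, gpu_id, port_id) are placeholders for illustration,
not the final API, and error checks are omitted:

    struct rte_pktmbuf_extmem ext_mem;
    struct rte_mempool *mp;

    /* 1) Get device memory from the library and hand it to DPDK as the
     *    external buffer area of a pktmbuf pool.
     */
    ext_mem.elt_size = RTE_MBUF_DEFAULT_BUF_SIZE;
    ext_mem.buf_len  = RTE_ALIGN_CEIL((size_t)nb_mbufs * ext_mem.elt_size,
                                      GPU_PAGE_SIZE);
    ext_mem.buf_iova = RTE_BAD_IOVA;
    ext_mem.buf_ptr  = rte_gpu_mem_alloc(gpu_id, ext_mem.buf_len); /* assumed call */

    rte_extmem_register(ext_mem.buf_ptr, ext_mem.buf_len, NULL,
                        ext_mem.buf_iova, GPU_PAGE_SIZE);
    rte_dev_dma_map(rte_eth_devices[port_id].device, ext_mem.buf_ptr,
                    ext_mem.buf_iova, ext_mem.buf_len);

    mp = rte_pktmbuf_pool_create_extbuf("gpu_mpool", nb_mbufs, 0, 0,
                                        ext_mem.elt_size, rte_socket_id(),
                                        &ext_mem, 1);

    /* 2) Use mp like any other pktmbuf pool for rx/tx/forwarding
     *    (see the queue setup sketch further down).
     */

The only GPU-specific piece would be the allocation call; everything after
it is the existing extmem path, as in the l2fwd-nv snippet quoted below.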
> > >
> > > As an example, you can look at my l2fwd-nv application here:
> > > https://github.com/NVIDIA/l2fwd-nv
> > >
> >
> > Is this meant to enhance 'rte_extmem_register' /
> > 'rte_pktmbuf_pool_create_extbuf'?
> >
> 
> The purpose of these two functions is different.
> With them DPDK allows the user to use any kind of memory to rx/tx packets;
> they are not about allocating that memory.

> 
> Maybe I'm missing the point here: what's the main objection to having a
> GPU library?

Exactly. ;-)

Maybe some real device code would be worthwhile, to help people get the whole picture.

> 
> >         if (l2fwd_mem_type == MEM_HOST_PINNED) {
> >                 ext_mem.buf_ptr = rte_malloc("extmem", ext_mem.buf_len, 0);
> >                 CUDA_CHECK(cudaHostRegister(ext_mem.buf_ptr, ext_mem.buf_len,
> >                                             cudaHostRegisterMapped));
> >                 void *pDevice;
> >                 CUDA_CHECK(cudaHostGetDevicePointer(&pDevice, ext_mem.buf_ptr, 0));
> >                 if (pDevice != ext_mem.buf_ptr)
> >                         rte_exit(EXIT_FAILURE,
> >                                  "GPU pointer does not match CPU pointer\n");
> >         } else {
> >                 ext_mem.buf_iova = RTE_BAD_IOVA;
> >                 CUDA_CHECK(cudaMalloc(&ext_mem.buf_ptr, ext_mem.buf_len));
> >                 if (ext_mem.buf_ptr == NULL)
> >                         rte_exit(EXIT_FAILURE, "Could not allocate GPU memory\n");
> >
> >                 unsigned int flag = 1;
> >                 CUresult status = cuPointerSetAttribute(&flag,
> >                                 CU_POINTER_ATTRIBUTE_SYNC_MEMOPS,
> >                                 (CUdeviceptr)ext_mem.buf_ptr);
> >                 if (CUDA_SUCCESS != status) {
> >                         rte_exit(EXIT_FAILURE,
> >                                  "Could not set SYNC MEMOP attribute for GPU memory at %llx\n",
> >                                  (CUdeviceptr)ext_mem.buf_ptr);
> >                 }
> >                 ret = rte_extmem_register(ext_mem.buf_ptr, ext_mem.buf_len,
> >                                           NULL, ext_mem.buf_iova, GPU_PAGE_SIZE);
> >                 if (ret)
> >                         rte_exit(EXIT_FAILURE, "Could not register GPU memory\n");
> >         }
> >         ret = rte_dev_dma_map(rte_eth_devices[l2fwd_port_id].device,
> >                               ext_mem.buf_ptr, ext_mem.buf_iova, ext_mem.buf_len);
> >         if (ret)
> >                 rte_exit(EXIT_FAILURE, "Could not DMA map EXT memory\n");
> >         mpool_payload = rte_pktmbuf_pool_create_extbuf("payload_mpool",
> >                                                        l2fwd_nb_mbufs, 0, 0,
> >                                                        ext_mem.elt_size,
> >                                                        rte_socket_id(),
> >                                                        &ext_mem, 1);
> >         if (mpool_payload == NULL)
> >                 rte_exit(EXIT_FAILURE, "Could not create EXT memory mempool\n");
> >
> >
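
For step 2 there is nothing GPU-specific on the datapath: mpool_payload is
handed to the rx queue setup like any ordinary pktmbuf pool (the mbuf
metadata itself stays in host memory, only the payload buffers live in the
external GPU area) and the usual burst calls work unchanged. A rough
sketch, with port/queue/descriptor variables as placeholders and error
checks omitted:

    rte_eth_rx_queue_setup(l2fwd_port_id, 0, nb_rxd,
                           rte_eth_dev_socket_id(l2fwd_port_id),
                           NULL, mpool_payload);

    /* Forwarding loop body: payloads are received directly into GPU memory. */
    struct rte_mbuf *pkts[32];
    uint16_t nb_rx = rte_eth_rx_burst(l2fwd_port_id, 0, pkts, 32);
    uint16_t nb_tx = rte_eth_tx_burst(l2fwd_port_id, 0, pkts, nb_rx);
    if (nb_tx < nb_rx)      /* drop whatever could not be sent */
            rte_pktmbuf_free_bulk(&pkts[nb_tx], nb_rx - nb_tx);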
