This bug is missing log files that will aid in diagnosing the problem.
While running an Ubuntu kernel (not a mainline or third-party kernel)
please enter the following command in a terminal window:

apport-collect 2040526

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2040526

Title:
  Backport DMABUF functionality

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  SRU Justification:

  [Impact]

  Backport RDMA DMABUF functionality

  Nvidia is working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). The
  “dma_buf” patchset consists of patches that were merged into the
  upstream kernel in 5.16 and 5.17. They allow buffers to be shared
  between drivers, which improves performance while reducing copying of
  data.

  The new functionality is isolated such that existing users will not
  execute these new code paths.

  * The first 3 patches add a new API to the RDMA subsystem that allows
  drivers to get a pinned dmabuf memory region without requiring an
  implementation of the move_notify callback.

  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/

  * The remaining patches add support for DMABUF when creating a devx
  umem. devx umems are quite similar to MRs except they cannot be
  revoked, so this uses the dmabuf pinned memory flow. Several mlx5dv
  flows require umem and cannot work with MR.

  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/

  [Test Plan]

  SW Configuration:
  • Download CUDA 12.2 run file
    (https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=runfile_local)
  • Install using kernel-open, i.e. # sh ./cuda_12.2.2_535.104.05_linux.run -m=kernel-open
  • Clone perftest from https://github.com/linux-rdma/perftest.
  • cd perftest
  • export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH
  • export LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LIBRARY_PATH
  • run: ./autogen.sh ; ./configure CUDA_H_PATH=/usr/local/cuda/include/cuda.h; make

  # Start Server
  $ ./ib_write_bw -d mlx5_2 -F --use_cuda=0 --use_cuda_dmabuf

  # Start Client
  $ ./ib_write_bw -d mlx5_3 -F --use_cuda=1 --use_cuda_dmabuf localhost
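
  Before starting the server and client, it may help to confirm that the
  machine matches the setup described above. The following is a minimal
  sketch, not part of the original report or of perftest: the function
  names are hypothetical, and it assumes only that lowlatency kernel
  release strings end in "-lowlatency" and that RDMA devices appear under
  /sys/class/infiniband.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers for the test plan above.
# None of these function names come from perftest or the bug report.

# Succeeds if the given kernel release string names a lowlatency
# flavour, e.g. "5.15.0-88-lowlatency".
is_lowlatency_kernel() {
    case "$1" in
        *-lowlatency) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if an RDMA device with the given name is registered in sysfs.
# The sysfs root is an optional second argument so the check can be
# exercised on a machine without RDMA hardware.
rdma_device_present() {
    dev="$1"
    sysfs_root="${2:-/sys/class/infiniband}"
    [ -n "$dev" ] && [ -e "$sysfs_root/$dev" ]
}

# Prints a warning for each expectation that does not hold.
# Arguments: the RDMA device names the test will use.
preflight() {
    if ! is_lowlatency_kernel "$(uname -r)"; then
        echo "warning: not a lowlatency kernel: $(uname -r)"
    fi
    for d in "$@"; do
        if ! rdma_device_present "$d"; then
            echo "warning: RDMA device not found: $d"
        fi
    done
}
```

  With these functions sourced into the shell, "preflight mlx5_2 mlx5_3"
  would check the two devices used in the commands above. It only prints
  warnings and does not verify that the DMABUF patches are present.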

  [Where problems could occur?]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2040526/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
