It appears that the reason verification-needed-focal is applied here is
because these patches were included in the linux-intel flavor, whose
description says "A kernel image for Intel IOTG devices." I'm not sure
what the expectations are for verifying bugs with that flavor - should
they all be done on the target hardware? If so, I do not have access to
the hardware to do so.

For this specific issue, I'll go ahead and mark verified with the
following justification:

 (1) The only consumer of the IB Peer Memory interface at this time is
the nvidia driver stack, and we do not appear to provide pre-compiled
nvidia drivers for the -intel flavor at this time. A user could still
install an nvidia-dkms package and build their own modules, but:

 (2) This appears to be the first version of linux-intel in the focal
series, so it cannot possibly be a regression against an earlier
version.


** Tags removed: verification-needed-focal
** Tags added: verification-done-focal

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1947206

Title:
  Updates to ib_peer_memory requested by Nvidia

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Focal:
  Fix Committed
Status in linux source package in Hirsute:
  Fix Released
Status in linux source package in Impish:
  Fix Released

Bug description:
  [Impact]
  Nvidia notified me via private email that they'd discovered some issues with 
the ib_peer_memory patch we are carrying in hirsute/impish and sent me a patch 
intended to resolve them. My knowledge of these changes is limited to what is 
mentioned in the commit message:

  - Allow clients to opt out of unmap during invalidation
  - Fix some bugs in the sequencing of mlx5 MRs
  - Enable ATS for peer memory

  [Test Case]
  ib_write_bw from the perftest package, rebuilt with CUDA support, can be used 
as a smoke test of this feature. I'll attach a sample test script here. I've 
verified this test passes with the kernels in the archive, and continues to 
pass with the provided patch applied.

  [Fix]
  Nvidia has emailed me fixes for both trees. They are not currently available 
in a public tree elsewhere, though I'm told at some point they should end up in 
a branch here:
    https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/

  [What could go wrong]
  The only known use case for ib_peer_memory is Nvidia GPU users making use of
the GPU PeerDirect feature, where GPUs can share memory with one another over an
InfiniBand network. Bugs here could cause problems (hangs, crashes, corruption)
with such workloads.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1947206/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
