On Wed, May 14, 2025 at 04:08:09PM -0400, Alejandro Jimenez wrote:
> 
> 
> On 5/14/25 11:54 AM, Jason Gunthorpe wrote:
> > On Wed, May 14, 2025 at 09:23:49AM +0000, Ankit Soni wrote:
> > > I am experiencing a system hang on boot with the 5-level v2 page table
> > > mode. The NVMe boot drive is not initializing.
> > > Below are the relevant dmesg logs, with some prints I added:
> > > 
> > > [    6.386439] AMD-Vi v2 domain init
> > > [    6.390132] AMD-Vi v2 pt init
> > > [    6.390133] AMD-Vi aperture end last va ffffffffffffff
> > > ...
> > > [   10.315372] AMD-Vi gen pt MAP PAGES iova ffffffffffffe000 paddr 19351b000
> > > ...
> > > [   72.171930] nvme nvme0: I/O tag 0 (0000) QID 0 timeout, disable controller
> > > [   72.179618] nvme nvme1: I/O tag 24 (0018) QID 0 timeout, disable controller
> > > [   72.197176] nvme nvme0: Identify Controller failed (-4)
> > > [   72.203063] nvme nvme1: Identify Controller failed (-4)
> > > [   72.209237] nvme 0000:05:00.0: probe with driver nvme failed with error -5
> > > [   72.209336] nvme 0000:44:00.0: probe with driver nvme failed with error -5
> > > ...
> > > Timed out waiting for the udev queue to be empty.
> > > 
> > > According to the dmesg logs above, the IOVA for the v2 page table appears
> > > incorrect: it exceeds domain->geometry.aperture_end. This requires
> > > domain->geometry.force_aperture = true; to be added at the
> > > appropriate location. Probably here!
> 
> Thank you for pointing out this issue and its cause. I originally tested on
> a host with SCSI storage, and after your report I tried but couldn't
> reproduce the hang on a Zen4 host with an NVMe boot drive. I wanted to see
> if it was a pattern common to NVMe, but I suppose it depends on the DMA mask
> chosen by the specific driver.
> 
> Alejandro
> 

Hi,
Can you try with the command line below?
"amd_iommu=pgtbl_v2 iommu.forcedac=1"
Indeed, it depends on the DMA mask chosen by the nvme driver. If force_aperture
is not true, the iommu driver will use the dma_mask in place of aperture_end.
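
To illustrate the point, here is a minimal user-space sketch of how the
upper limit for an IOVA allocation is picked. It is not the kernel code:
struct iova_geometry and iova_limit() are hypothetical names, and only the
aperture_end and force_aperture fields mirror domain->geometry.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two domain->geometry fields discussed here. */
struct iova_geometry {
	uint64_t aperture_end;   /* last valid IOVA of the page table */
	bool force_aperture;     /* true: DMA is only allowed inside the aperture */
};

/*
 * Return the highest IOVA an allocation may use.
 * The aperture only clamps the device DMA mask when force_aperture is set;
 * otherwise the DMA mask alone decides the limit.
 */
uint64_t iova_limit(uint64_t dma_mask, const struct iova_geometry *geo)
{
	if (geo->force_aperture && geo->aperture_end < dma_mask)
		return geo->aperture_end;
	return dma_mask;
}
```

With the aperture end from the log (ffffffffffffff) and, as an assumption, a
64-bit DMA mask, the limit without force_aperture is ffffffffffffffff, so an
IOVA such as ffffffffffffe000 is permitted even though it lies beyond the
aperture. With force_aperture set, the limit drops to ffffffffffffff.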

-Ankit


> > 
> > Yes! It got lost, thanks a lot!
> > 
> > Jason
> 
