On Fri, 2012-08-03 at 15:12 -0600, David Ahern wrote:
> On 8/3/12 2:21 PM, Alex Williamson wrote:
> > On Fri, 2012-08-03 at 11:39 -0600, David Ahern wrote:
> >> Hi Alex:
> >>
> >> Hitting an oops with 3.6-rc1. Backtrace from console attached. git blame
> >> for the top function points to ad805758.
> >
> > Hey David,
> >
> > Hmm, what's special about your system? I've got an 82576 here and the
> > same path works fine. Any way you can get the top of the oops message?
> > Thanks,
> >
> > Alex
> >
>
> Dell R410 I believe. pair of 5620 processors. 3 overlapping screen shots
> attached. objdump on pci.o suggests the pdev is NULL:
>
> /opt/sw/ahern/kernels/kernel.git/drivers/pci/pci.c:2454
>
>         ret = pci_dev_specific_acs_enabled(pdev, acs_flags);
>         if (ret >= 0)
>                 return ret > 0;
>
>         if (!pci_is_pcie(pdev))
>     408a:       41 80 7c 24 4a 00       cmpb   $0x0,0x4a(%r12)
>     4090:       74 e8                   je     407a <pci_acs_enabled+0x2a>
>
>
> Perhaps this bug explains the larger the issue which is that device
> passthrough in 3.6-rc1 (0d7614f) is broken for me -- config field for
> the PCI device does not exist. e.g.,
>
> pcilib: Cannot open /sys/bus/pci/devices/0000:06:10.0/config
> lspci: Unable to read the standard configuration space header of device
> 0000:06:10.0
> pcilib: Cannot open /sys/bus/pci/devices/0000:06:10.0/config
> lspci: Unable to read the standard configuration space header of device
> 0000:06:10.0
> failed to find vendor-product id for PCI id "06:10.0"
> Failed to claim PCI device 06:10.0
>
> git bisect points to:
>
> 783f157bc5a7fa30ee17b4099b27146bd1b68af4 is the first bad commit
> commit 783f157bc5a7fa30ee17b4099b27146bd1b68af4
> Author: Alex Williamson <alex.william...@redhat.com>
> Date:   Wed May 30 14:19:43 2012 -0600
>
>     intel-iommu: Make use of DMA quirks and ACS checks in IOMMU groups
>
>     Work around broken devices and adhere to ACS support when determining
>     IOMMU grouping.
>
>     Signed-off-by: Alex Williamson <alex.william...@redhat.com>
>     Signed-off-by: Joerg Roedel <joerg.roe...@amd.com>
>
> :040000 040000 83890398dabbf225fd0f5b3c8c3713a75b3fb5e1
> b674ce2ecb315393a8c6c1ac98b3796d5ba09708 M      drivers
>
> I triggered the oops in a number of the bisect points as well -- in
> those cases the machine had to be power cycled.

Is this the chunk that's causing the oops?

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 7469b53..27d8c97 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4133,6 +4133,7 @@ static int intel_iommu_add_device(struct device *dev)
 					PCI_DEVFN(PCI_SLOT(dma_pdev->devfn), 0)));
 
+#if 0
 	while (!pci_is_root_bus(dma_pdev->bus)) {
 		if (pci_acs_path_enabled(dma_pdev->bus->self,
 					 NULL, REQ_ACS_FLAGS))
@@ -4140,6 +4141,7 @@ static int intel_iommu_add_device(struct device *dev)
 		swap_pci_ref(&dma_pdev, pci_dev_get(dma_pdev->bus->self));
 	}
 
+#endif
 	group = iommu_group_get(&dma_pdev->dev);
 	pci_dev_put(dma_pdev);
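
For reference, here is a minimal userspace sketch of the upstream walk in that chunk. It assumes the NULL pdev seen in pci_acs_enabled() comes from a bus whose ->self is NULL, which nothing in this thread has confirmed yet. The mock_* types and functions are hypothetical stand-ins, not kernel APIs, and acs_ok merely stands in for the result of pci_acs_path_enabled(). The sketch only illustrates how skipping buses that have no bridge device would keep such a walk from dereferencing NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct pci_bus and struct pci_dev. */
struct mock_dev;

struct mock_bus {
	struct mock_bus *parent;	/* NULL on the root bus */
	struct mock_dev *self;		/* bridge leading to this bus; may be NULL */
};

struct mock_dev {
	struct mock_bus *bus;		/* bus the device sits on */
	bool acs_ok;			/* stand-in for the ACS path check result */
};

static bool mock_is_root_bus(const struct mock_bus *bus)
{
	return bus->parent == NULL;
}

/*
 * Walk upstream from dev and return the device the walk stops at.
 * Buses without a bridge device are skipped by moving to bus->parent,
 * so a NULL self pointer is never dereferenced.
 */
static struct mock_dev *mock_find_group_dev(struct mock_dev *dev)
{
	while (!mock_is_root_bus(dev->bus)) {
		struct mock_bus *bus = dev->bus;

		/* Skip buses that have no bridge device. */
		while (bus->self == NULL) {
			if (mock_is_root_bus(bus))
				return dev;	/* hit the root without a bridge */
			bus = bus->parent;
		}

		if (bus->self->acs_ok)
			break;			/* isolation holds here; stop */

		dev = bus->self;		/* otherwise continue from the bridge */
	}
	return dev;
}
```

With a device on a bridge-less bus below a bridge, the walk moves past the bridge-less bus and evaluates the bridge instead of passing a NULL pointer to the ACS check.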