Hi Jan Beulich, 

Thank you very much. Please see my inline comments below.

From: Jan Beulich <jbeul...@suse.com>
Sent: 20 April 2022 18:37
To: Naresh Bhat <nare...@marvell.com>
Cc: jul...@xen.org <jul...@xen.org>; sstabell...@kernel.org 
<sstabell...@kernel.org>; xen-devel@lists.xenproject.org 
<xen-devel@lists.xenproject.org>
Subject: [EXT] Re: DOMU: Virtual Function FLR in PCI passthrough is crashing 
 
External Email

----------------------------------------------------------------------
On 20.04.2022 14:48, Naresh Bhat wrote:
> I have the following setup and try to test the Function Level Reset feature.  
> Any suggestions or pointers will be very much helpful.
> 
> DOM0
> Distribution: Ubuntu-20.04.3 (kernel 5.8.0-43)
> Xen version : 4.11.4-pre
> 
> DOMU
> Distribution: Ubuntu-18.04.6 LTS (kernel 5.8.0)
> PCIe device with SRIOV support, VF (Virtual Function) interface connected to 
> DOMU via PCI pass-through
> 
> Issue on DOMU: 
> 1. Enable MSIX on DOMU (We have used the following kernel APIs 
> pci_enable_msix_range, pci_alloc_irq_vectors)
> 2. Execute FLR (Function Level Reset) via sysfs interface on the PCIe 
> passthrough device in DOMU
>    # echo "1" > /sys/bus/pci/devices/<ID>/reset
> 
> The following crash observed 
> 
> [ 4126.391455] BUG: unable to handle page fault for address: ffffc90040029000
> [ 4126.391489] #PF: supervisor write access in kernel mode
> [ 4126.391503] #PF: error_code(0x0003) - permissions violation
> [ 4126.391516] PGD 94980067 P4D 94980067 PUD 16a155067 PMD 16a156067 PTE 
> 80100000a000c075
> [ 4126.391537] Oops: 0003 [#1] SMP NOPTI
> [ 4126.391550] CPU: 0 PID: 971 Comm: bash Tainted: G           OE     5.8.0 #1
> [ 4126.391570] RIP: e030:__pci_write_msi_msg+0x59/0x150
> [ 4126.391580] Code: 8b 50 d8 85 d2 75 31 83 78 fc 03 74 2b f6 47 54 01 74 6e 
> f6 47 55 02 75 1f 0f b7 47 56 c1 e0 04 48 98 48 03 47 60 74 10 8b 16 <89> 10 
> 8b 56 04 89 50 04 8b 56 08 89 50 08 48 8b 03 49 89 44 24 20
> [ 4126.391606] RSP: e02b:ffffc90040407cc0 EFLAGS: 00010286

The RSP related selector value suggests you're talking about a PV DomU.
Such a DomU cannot write the MSI-X table directly, yet at a guess (from
the PTE displayed) that's what the insn does where the crash occurred. I
would guess you've hit yet another place in the kernel where proper PV
abstraction is missing. You may want to check with newer kernels.

[Naresh]: We have tested with latest kernel i.e. 5.17.0 kernel, issue persists.

As to FLR - I guess this operation as a whole needs passing through
pcifront to pciback, such that the operation can be carried out safely
(e.g. to save and restore active MSIs, which is what I infer is being
attempted here, as per the stack trace).

[Naresh]: Any idea when the support will be added to Xen ?

Jan

Reply via email to