> Date: Fri, 5 Aug 2016 18:19:01 -0600
> From: alex.william...@redhat.com
> To: the_cartograp...@hotmail.com
> CC: vfio-users@redhat.com
> Subject: Re: [vfio-users] Cannot register eventfd with MSI/MSI-X interrupts
>
> On Wed, 3 Aug 2016 12:47:21 +0000
> chris thompson <the_cartograp...@hotmail.com> wrote:
>
> >
> > > Date: Tue, 2 Aug 2016 16:29:52 -0600
> > > From: alex.william...@redhat.com
> > > To: the_cartograp...@hotmail.com
> > > CC: vfio-users@redhat.com
> > > Subject: Re: [vfio-users] Cannot register eventfd with MSI/MSI-X
> > > interrupts
> > >
> > > On Tue, 2 Aug 2016 09:54:19 -0600
> > > Alex Williamson <alex.william...@redhat.com> wrote:
> > >
> > > > On Tue, 2 Aug 2016 10:21:30 +0000
> > > > chris thompson <the_cartograp...@hotmail.com> wrote:
> > > >
> > > > > Hi Alex,
> > > > >
> > > > > Thanks for your patience, I noticed the count != 0 issue shortly
> > > > > after, but I get the same EINVAL error anyway when it is zero - but
> > > > > this is because the IRQ does not have a mapping yet. When I change
> > > > > the code to map it then unmap it then I get:
> > > > > IRQx - successful map then unmap
> > > > > MSI - unsuccessful map (and unsuccessful unmap)
> > > > > MSIX - successful map then unmap (despite the device still being in
> > > > > IRQx mode?)
> > > > > Error - successfully map, kernel driver crash on the unmap!
> > >
> > > I remembered that there is a trick here specific to MSI, your device
> > > probably supports more than one MSI vector. MSI is a little bit
> > > special, not all platforms support multiple MSI vectors and they were
> > > pretty much outdated by the much more flexible MSI-X capability for
> > > that reason. To help with this the SET_IRQS ioctl returns negative on
> > > error, 0 on success, and a positive value indicating the available
> > > vectors to retry with if the requested count is not supported/available.
> > >
> > > BTW, this should fix the oops you found:
> > > https://lkml.org/lkml/2016/8/2/1912
> > >
> > > Thanks,
> > > Alex
> >
> > Hi Alex,
> >
> > Indeed I realise now that the MSI registration ioctl returns 8, the
> > number of IRQs I would expect (the same as the MSIX ones). The question
> > is, why doesn't the IRQ_INFO ioctl reply count = 8 like the MSIX one?
> > Instead it returns 64 (I thought MSI only went up to 32 too?)
> >
> > config IRQ index 1
> > Info:
> > argsz 0x10, flags 0x9, index 0x1, count 0x40,
> > Register:
> > failed to register interrupt set 1 : 0-63, error 8 Invalid argument
> > config IRQ index 2
> > Info:
> > argsz 0x10, flags 0x9, index 0x2, count 0x8,
> > Register:
> > successfully registered interrupt set 2 : 0-7, ret 0
> >
> > Thanks,
> > Chris
>
> I believe this was fixed in the v3.16 kernel by:
>
> fd49c81 drivers/vfio/pci: Fix wrong MSI interrupt count
>
> Your previous oops shows you're running an old v3.13 kernel.
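[Note: the return convention quoted above (negative on error, 0 on success, a positive count to retry with) can be sketched as a small retry helper. This is only an illustration under stated assumptions: `set_irqs_fn` is a hypothetical callback standing in for the real `ioctl(device_fd, VFIO_DEVICE_SET_IRQS, ...)` call, and `fake_set_irqs` is a toy model of a device that can supply only a limited number of vectors. Neither name comes from the VFIO API.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the SET_IRQS ioctl (an assumption, not a
 * VFIO symbol). It follows the convention described in the thread:
 *   < 0  error
 *   == 0 success
 *   > 0  number of vectors available to retry with */
typedef int (*set_irqs_fn)(void *ctx, unsigned int count);

/* Try to register `requested` vectors; if a smaller count is offered,
 * retry once with that count. Returns the number of vectors actually
 * registered, or a negative value on failure. */
int register_msi_with_retry(set_irqs_fn set_irqs, void *ctx,
                            unsigned int requested)
{
    assert(set_irqs != NULL);

    int ret = set_irqs(ctx, requested);
    if (ret == 0)
        return (int)requested;      /* everything we asked for */
    if (ret < 0)
        return ret;                 /* hard error, pass it through */

    /* Positive return: that many vectors are available. */
    unsigned int offered = (unsigned int)ret;
    if (offered >= requested)
        return -1;                  /* an offer that is not smaller makes no sense */

    ret = set_irqs(ctx, offered);
    if (ret == 0)
        return (int)offered;
    return ret < 0 ? ret : -1;      /* give up after a single retry */
}

/* Toy model for illustration: ctx points at the number of vectors the
 * "device" can really provide. */
int fake_set_irqs(void *ctx, unsigned int count)
{
    unsigned int available = *(unsigned int *)ctx;
    if (count == 0)
        return -22;                 /* mimic -EINVAL for an empty request */
    return count <= available ? 0 : (int)available;
}
```

[With the numbers from the output above (IRQ_INFO reporting count 0x40, SET_IRQS returning 8), such a helper would first ask for 64 vectors, be offered 8, and succeed on the retry.]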

Hi Alex,

You are right; unfortunately I have to make do with a standard Ubuntu 14.04 install (our regular RHEL 6.6 doesn't even have VFIO). Glad to hear this is also fixed later and that I correctly understood the intent, even if my kernel doesn't do it right.

I've made some progress getting the PCIe device talking to my simulated kernel, and have a new PCIe device which I'm trying to work with - an Intel e1000e Ethernet controller (the xhci drivers expose a bug in our simulated PCIe host controller driver code which I haven't sorted yet). Unfortunately, when trying to set up the MSI-X interrupts I trigger a bug in the host kernel again:

[ 2238.755804] vfio-pci 0000:01:00.0: irq 46 for MSI/MSI-X
[ 2238.755812] vfio-pci 0000:01:00.0: irq 47 for MSI/MSI-X
[ 2238.755817] vfio-pci 0000:01:00.0: irq 48 for MSI/MSI-X
[ 2238.755822] vfio-pci 0000:01:00.0: irq 49 for MSI/MSI-X
[ 2238.755826] vfio-pci 0000:01:00.0: irq 50 for MSI/MSI-X
[ 2238.755908] BUG: unable to handle kernel paging request at ffffebe000004000
[ 2238.755942] IP: [<ffffffff811a53e6>] kfree+0x56/0x160
[ 2238.755964] PGD 0
[ 2238.755973] Oops: 0000 [#1] SMP
[ 2238.755987] Modules linked in: vfio_pci vfio_iommu_type1 vfio nfsv3 nfsv4 cuse autofs4 dm_crypt hp_wmi gpio_ich bnep sparse_keymap rfcomm bluetooth intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc aesni_intel snd_seq_midi snd_seq_midi_event aes_x86_64 lrw snd_rawmidi gf128mul glue_helper ablk_helper cryptd snd_seq serio_raw lpc_ich snd_seq_device wmi snd_timer snd soundcore mac_hid shpchp nfsd mei_me tpm_infineon mei auth_rpcgss nfs_acl parport_pc nfs lockd ppdev sunrpc lp fscache parport binfmt_misc btrfs xor raid6_pq libcrc32c i915 video i2c_algo_bit e1000e drm_kms_helper ahci psmouse ptp libahci drm pps_core
[ 2238.756292] CPU: 1 PID: 2782 Comm: sim Not tainted 3.13.0-87-generic #133-Ubuntu
[ 2238.756318] Hardware name: Hewlett-Packard HP Compaq 8200 Elite SFF PC/1495, BIOS J01 v02.15 11/10/2011
[ 2238.756350] task: ffff8800b65d4800 ti: ffff8800b6688000 task.ti: ffff8800b6688000
[ 2238.756375] RIP: 0010:[<ffffffff811a53e6>] [<ffffffff811a53e6>] kfree+0x56/0x160
[ 2238.756403] RSP: 0018:ffff8800b6689cf0 EFLAGS: 00010286
[ 2238.756421] RAX: ffffebe000004000 RBX: 0000000000100000 RCX: 0000000000000001
[ 2238.756445] RDX: ffffea0000000000 RSI: 00000000000fff37 RDI: 0000000000100000
[ 2238.756469] RBP: ffff8800b6689d08 R08: 0000000000015fa0 R09: ffff880225400000
[ 2238.756493] R10: ffff88022273a000 R11: ffffffffa06ec898 R12: ffffffffffffff38
[ 2238.756517] R13: ffffffffa06ec898 R14: 00000000000fff37 R15: 00000000ffffffff
[ 2238.756541] FS: 00007f7147398780(0000) GS:ffff88022e280000(0000) knlGS:0000000000000000
[ 2238.756569] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2238.756588] CR2: ffffebe000004000 CR3: 00000000b66c1000 CR4: 00000000000407e0
[ 2238.756612] Stack:
[ 2238.756620] ffff88021fd37b00 ffffffffffffff38 00000000fffffffb ffff8800b6689d50
[ 2238.756647] ffffffffa06ec898 ffff88022273a000 ffffffffa06f1453 ffff88021f513ee0
[ 2238.756674] 0000000000000000 0000000000000002 00000000fffffffa 00000000ffffffea
[ 2238.756702] Call Trace:
[ 2238.756714] [<ffffffffa06ec898>] vfio_msi_set_vector_signal+0x88/0x210 [vfio_pci]
[ 2238.756742] [<ffffffffa06ecac8>] vfio_msi_set_block+0xa8/0xe0 [vfio_pci]
[ 2238.756766] [<ffffffffa06ed446>] vfio_pci_set_msi_trigger+0x216/0x2d0 [vfio_pci]
[ 2238.756793] [<ffffffffa06ed9ec>] vfio_pci_set_irqs_ioctl+0x8c/0xa0 [vfio_pci]
[ 2238.756818] [<ffffffffa06ebe7d>] vfio_pci_ioctl+0x31d/0xa30 [vfio_pci]
[ 2238.756843] [<ffffffff812035e1>] ? fsnotify+0x241/0x320
[ 2238.756863] [<ffffffffa06e3193>] vfio_device_fops_unl_ioctl+0x23/0x30 [vfio]
[ 2238.756889] [<ffffffff811d46e0>] do_vfs_ioctl+0x2e0/0x4c0
[ 2238.756910] [<ffffffff811c119c>] ? vfs_write+0x15c/0x1f0
[ 2238.756930] [<ffffffff811d4941>] SyS_ioctl+0x81/0xa0
[ 2238.756949] [<ffffffff811c1b5c>] ? SyS_write+0x7c/0xa0
[ 2238.756969] [<ffffffff8173a3dd>] system_call_fastpath+0x1a/0x1f
[ 2238.756989] Code: 00 00 00 80 ff 77 00 00 48 01 d8 48 0f 42 15 42 fc a6 00 48 01 d0 48 ba 00 00 00 00 00 ea ff ff 48 c1 e8 0c 48 c1 e0 06 48 01 d0 <48> 8b 10 80 e6 80 0f 85 e1 00 00 00 49 89 c2 49 8b 02 a8 80 0f
[ 2238.758263] RIP [<ffffffff811a53e6>] kfree+0x56/0x160
[ 2238.759464] RSP <ffff8800b6689cf0>
[ 2238.760664] CR2: ffffebe000004000
[ 2238.901817] ---[ end trace 2df3e50788542fe4 ]---

Here is a summary of the writes that have been made to the important bits of the device (all in PCI Config space, except the MSI-X table programming):

Command : 0x0100
#test BAR0
Command : 0x0103
Command : 0x0100
#test BAR1
Command : 0x0100
#test BAR2
Command : 0x0103
Command : 0x0100
#test BAR3
Command : 0x0103
Command : 0x0100
#test BAR4
Command : 0x0103
Command : 0x0100
#test BAR5
Command : 0x0103
Command : 0x0100
Expansion ROM Address 0xFFFFF800
Expansion ROM Address 0xFE580000
Command : 0x0103
MSI Message Control 0x0080
MSI-X Message Control 0x0004
Power Management Control/Status 0xA000
Command : 0x0143
Cache Line Size : 0x10
PCIe Capability Register : 0x01010
Interrupt Line : 0x02
BAR0 : 0x0
BAR1 : 0xC0000
BAR3 : 0xE0000
BAR2 : 0x1000
PCIe Device Control : 0x000F
Command : 0x0147
MSI-X Message Control : 0x0004
#Program MSI-X Table, addresses and value for MSI-X 0,1,2
MSI-X Message Control : 0xC004 <- Enable MSI-X interrupts

At this point I try to register the MSI-X interrupts 0-4 with eventfd (the device has 5, even though only 0-2 are programmed by the simulated kernel driver just now).

The driver would then go on to do:

#Program MSI-X Table masks (all 0xFFFFFFFF)
Command : 0x0547 <- disable legacy INTx interrupts (seems a bit late to me?)

My simulated kernel is quite happy up until this point, when my simulation process turns into a zombie, and I have to reboot the machine to get access to vfio devices again.

Anything obvious here? Perhaps a known bug in my rather ancient kernel?

Thanks and regards,
Chris
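
[Note: for anyone reading the register log, here is a minimal sketch that decodes the Command and MSI-X Message Control values that appear in it. The bit positions follow the standard PCI configuration-space layout. The macro and function names are made up for this illustration and are not taken from the kernel headers or from the simulator.]

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* PCI Command register bits (standard layout; names are illustrative). */
#define CMD_IO_SPACE      0x0001u
#define CMD_MEM_SPACE     0x0002u
#define CMD_BUS_MASTER    0x0004u
#define CMD_PARITY_RESP   0x0040u
#define CMD_SERR_ENABLE   0x0100u
#define CMD_INTX_DISABLE  0x0400u

/* MSI-X Message Control bits (standard layout; names are illustrative). */
#define MSIX_ENABLE        0x8000u
#define MSIX_FUNCTION_MASK 0x4000u
#define MSIX_TABLE_SIZE    0x07FFu  /* encoded as N-1 */

int bus_master_enabled(uint16_t cmd) { return (cmd & CMD_BUS_MASTER) != 0; }
int intx_disabled(uint16_t cmd)      { return (cmd & CMD_INTX_DISABLE) != 0; }

int msix_enabled(uint16_t ctrl)         { return (ctrl & MSIX_ENABLE) != 0; }
int msix_function_masked(uint16_t ctrl) { return (ctrl & MSIX_FUNCTION_MASK) != 0; }

/* The table size field stores (number of entries - 1). */
unsigned int msix_table_entries(uint16_t ctrl)
{
    return (unsigned int)(ctrl & MSIX_TABLE_SIZE) + 1u;
}

/* Write a one-line summary of a Command register value into buf. */
void describe_command(uint16_t cmd, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "io=%d mem=%d master=%d parity=%d serr=%d intx_off=%d",
             (cmd & CMD_IO_SPACE) != 0, (cmd & CMD_MEM_SPACE) != 0,
             bus_master_enabled(cmd), (cmd & CMD_PARITY_RESP) != 0,
             (cmd & CMD_SERR_ENABLE) != 0, intx_disabled(cmd));
}
```

[Read this way, 0xC004 has the enable bit and the function mask bit both set, with a table size field of 4, i.e. 5 entries, which agrees with the five vectors (irq 46-50) in the log. 0x0547 differs from 0x0147 only in the INTx disable bit.]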