Hi Bin,

Bin Meng <bmeng...@gmail.com> writes:

> Hi Punit,
>
> On Sat, Aug 29, 2020 at 8:30 AM Punit Agrawal <punitagra...@gmail.com> wrote:
>>
>> Hi,
>>
>> I get the following errors when booting Linux from an ADATA XPG SX8200
>> NVMe on a RockPro64.
>>
>> [ 3.705205] rockchip-pcie f8000000.pcie: unexpected IRQ, INT0
>> [ 3.705226] rockchip-pcie f8000000.pcie: unexpected IRQ, INT0
>> [ 3.705247] rockchip-pcie f8000000.pcie: unexpected IRQ, INT0
>> [ 3.705331] rockchip-pcie f8000000.pcie: unexpected IRQ, INT0
>> [ 3.705352] rockchip-pcie f8000000.pcie: unexpected IRQ, INT0
>> [ 3.705373] rockchip-pcie f8000000.pcie: unexpected IRQ, INT0
>>
>> At which point boot hangs. Has anybody come across these errors when
>> using NVMe?
>>
>> Using an alternate device (sd card) to load the kernel / initrd doesn't
>> cause the issue and the drive works fine when used as a root device in
>> Linux subsequently.
>>
>> On digging further, I found that uboot exits with the NVME interrupt
>> line (PCI legacy interrupt) active when making any access to the
>> device. Even just running "nvme scan" leads to the active interrupt
>> line.
>>
>> After sprinkling some prints in the uboot NVMe driver, it seems that the
>> interrupt goes active right at the beginning of setting up the IO queues
>> (nvme_setup_io_queues). This is also the first time the admin queue is
>> used; when issuing the command to setup the number of queues
>> (NVME_FEAT_NUM_QUEUES). For some reason, updating the CQ head doorbell
>> doesn't clear the interrupt.
>>
>> The active interrupt doesn't bother uboot as it ignores the device
>> interrupt but causes an issue latter when linux boots.
>>
>> Has anybody faced similar issues with NVMe and uboot? Any idea on how to
>> stop the interrupt line from triggering? Or de-activating it on exit?
>>
>> Let me know if there's anything I can provide to help debug the
>> problem. Also, happy to try any patches or suggestions.
>>
>
> Is this a specific behavior of the NVMe card you are using? Could you
> please switch to another card for testing?

I suspect this behaviour is down to the ADATA NVMe card, but I don't
have any others at hand to test with. I bought this one as a reasonably
priced addition to my personal computing environment and was hoping to
avoid having to buy another.

Thinking about it, I can try hacking the Linux driver to use legacy
interrupts. That may help establish whether this is a hardware or a
software issue. I will give this a shot.

Let me know if you think of anything else I should try.

Thanks,
Punit
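
P.S. On the question of de-activating the line on exit: below is a
minimal sketch of one possible approach, not a tested fix. It assumes
the legacy interrupt can be silenced by masking it in the controller
(the NVMe INTMS register at offset 0x0c) and by setting the Interrupt
Disable bit (bit 10) in the PCI Command register. The device here is a
small software model so the logic can be compiled and checked
standalone; struct fake_nvme_dev, fake_nvme_writel(),
intx_line_asserted() and quiesce_before_boot() are hypothetical names
and are not part of the U-Boot NVMe driver.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 10 of the PCI Command register: Interrupt Disable (INTx). */
#define PCI_COMMAND_INTX_DISABLE 0x400

/* NVMe controller registers used for pin-based interrupt masking. */
#define NVME_REG_INTMS 0x0c /* Interrupt Mask Set: writing 1s sets mask bits */
#define NVME_REG_INTMC 0x10 /* Interrupt Mask Clear: writing 1s clears mask bits */

/* Hypothetical software model of the device, for illustration only. */
struct fake_nvme_dev {
	uint16_t pci_command;  /* PCI Command register */
	uint32_t int_mask;     /* mask state driven by INTMS/INTMC writes */
	int intx_pending;      /* controller wants to assert INTx */
};

/* Model of a 32-bit register write to the controller's BAR. */
static void fake_nvme_writel(struct fake_nvme_dev *dev, uint32_t reg,
			     uint32_t val)
{
	if (reg == NVME_REG_INTMS)
		dev->int_mask |= val;
	else if (reg == NVME_REG_INTMC)
		dev->int_mask &= ~val;
}

/* The line is seen as active only if neither mechanism suppresses it. */
static int intx_line_asserted(const struct fake_nvme_dev *dev)
{
	if (dev->pci_command & PCI_COMMAND_INTX_DISABLE)
		return 0;
	if (dev->int_mask & 1u) /* vector 0 is the one used with INTx */
		return 0;
	return dev->intx_pending;
}

/* Assumed quiesce step to run before handing over to the kernel. */
static void quiesce_before_boot(struct fake_nvme_dev *dev)
{
	/* Mask every interrupt vector in the controller. */
	fake_nvme_writel(dev, NVME_REG_INTMS, 0xffffffffu);
	/* Also stop the function from driving INTx at the PCI level. */
	dev->pci_command |= PCI_COMMAND_INTX_DISABLE;
	assert(!intx_line_asserted(dev));
}
```

In a real driver the two writes would go through the MMIO and PCI
config accessors, called from whatever hook runs just before the OS is
started. Whether that actually clears the line on this particular card
is exactly what would need testing.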