On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistai...@gmail.com> wrote:
>
> On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <ati...@atishpatra.org> wrote:
> >
> > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <co...@kernel.org> wrote:
> > >
> > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > >
> > > > > > > OpenSBI v1.3
> > > > > > >    ____                    _____ ____ _____
> > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > >         | |
> > > > > > >         |_|
> > > > > > >
> > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > >
> > > > > > > Just to note, because we use our own firmware that vendors in
> > > > > > > OpenSBI and compiles only a significantly cut down number of
> > > > > > > files from it, we do not use the fw_dynamic etc flow on our
> > > > > > > hardware. As a result, we have not tested v1.3, nor do we have
> > > > > > > any immediate plans to change our platform firmware to vendor
> > > > > > > v1.3 either.
> > > > > > >
> > > > > > > I unless there's something obvious to you, it sounds like I will
> > > > > > > need to go and bisect OpenSBI. That's a job for another day
> > > > > > > though, given the time.
> > > > > > >
> > > > > >
> > > > >
> > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > DT passed to OpenSBI 1.3.
> > > > >
> > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > the E-core disabled.
> > > > >
> > > > > I had discovered this issue in a totally different context after the
> > > > > OpenSBI 1.3 release happened. This issue is already fixed in the
> > > > > latest OpenSBI by the following commit
> > > > > c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > >
> > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > obviously not.
> > > >
> > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > >
> > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > I would love to fix it, but am pretty stretched for spare time to begin
> > > > with.
> > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > with QEMU, as I am sure you already know :)
> > > >
> > > > > At this point, you can either:
> > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > >
> > > I forgot to reply to this point, wondering what should be done with
> > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > of whether I can go and build a fixed version of OpenSBI.
> > >
> > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > user using the latest kernel (> v6.4) may hit those random linear map
> > related issues (in hibernation or EFI booting path).
> >
> > There are three possible scenarios:
> >
> > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > or sifive fu540 machine users
> > may hit this issue if the device tree has the disabled hart (e core).
> > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > have issues [1]
> > 3. Include a non-release version OpenSBI in Qemu with the fix as an
> > exception.
> >
> > #3 probably deviates from policy and sets a bad precedent. So I am not
> > advocating for it though ;)
> > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > -bios argument instead of the stock one.
> >
> > I could be wrong but my guess is the number of users facing #2 would
> > be higher than #1.
>
> Thanks for that info Atish!
>
> We are stuck in a bad situation.
>
> The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> do you think you could do that?

OpenSBI's version has a major number and a minor number but no
release/patch number, so it would be best to treat OpenSBI vX.Y.Z as
bug fixes on top of OpenSBI vX.Y. In other words, supervisor software
won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
using sbi_get_impl_version().

There are only three commits between the ACLINT fix and OpenSBI v1.3,
so as a one-off case I will go ahead and create OpenSBI v1.3.1
containing only four commits on top of OpenSBI v1.3.

Does this sound okay?

>
> Otherwise I think we should stick with OpenSBI 1.3. Considering that
> it fixes UEFI boot issues for the virt board (which would be the most
> used) it seems like a best call to make. People using the other boards
> are unfortunately stuck building their own OpenSBI release.
>
> If there is no OpenSBI 1.3.1 release we should add something to the
> release notes. @Conor Dooley are you able to give a clear sentence on
> how the boot fails?
>
> Alistair
>
> >
> > [1]
> > https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaish...@tinylab.org/
> >
> > > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > > > microchip-icicle-kit machine with OpenSBI 1.3
> > > >
> > > > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > > > the DT node. I'll just use tip-of-tree myself & up to the
> > >
> > > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > > maintainers what they want to do on the QEMU side of things".
> > >
> > > Thanks,
> > > Conor.
> > >
> >
> >
> > --
> > Regards,
> > Atish

Regards,
Anup
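
For illustration of the versioning point above, here is a minimal sketch
in C. It assumes the implementation version is packed as
(major << 16) | minor, which is my understanding of how OpenSBI defines
its version value. The helper names (pack_impl_version and the two
accessors) are hypothetical and are not part of OpenSBI or the SBI
specification.

```c
#include <stdint.h>

/*
 * Assumed encoding: major in the upper 16 bits, minor in the lower 16 bits.
 * The encoding has no field that could carry a patch/release number.
 */
static inline uint32_t pack_impl_version(uint32_t major, uint32_t minor,
                                         uint32_t patch)
{
    (void)patch; /* dropped: nothing in the encoding can hold it */
    return ((major & 0xffffu) << 16) | (minor & 0xffffu);
}

/* What supervisor software can recover from the packed value. */
static inline uint32_t impl_version_major(uint32_t v) { return v >> 16; }
static inline uint32_t impl_version_minor(uint32_t v) { return v & 0xffffu; }
```

Under that assumption, pack_impl_version(1, 3, 0) and
pack_impl_version(1, 3, 1) both yield 0x10003, so a caller of
sbi_get_impl_version() would see v1.3 and v1.3.1 as the same version.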