Le 21/12/2023 à 11:33, Matthias Schiffer a écrit : > [Vous ne recevez pas souvent de courriers de > matthias.schif...@ew.tq-group.com. Découvrez pourquoi ceci est important à > https://aka.ms/LearnAboutSenderIdentification ] > > On Wed, 2023-12-20 at 14:55 +0000, Christophe Leroy wrote: >>> Le 19/12/2023 à 14:34, Matthias Schiffer a écrit : >>>>> [Vous ne recevez pas souvent de courriers de >>>>> matthias.schif...@ew.tq-group.com. Découvrez pourquoi ceci est important >>>>> à https://aka.ms/LearnAboutSenderIdentification ] >>>>> >>>>> On Mon, 2023-12-18 at 19:48 +0000, Christophe Leroy wrote: >>>>>>> Hi Matthias, >>>>>>> >>>>>>> Le 18/12/2023 à 14:48, Matthias Schiffer a écrit : >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> I'm currently in the process of porting our ancient TQM5200 SoM to a >>>>>>>>> modern kernel, and I've >>>>>>>>> identified a number of regressions that cause early boot failures >>>>>>>>> (before the UART console has been >>>>>>>>> initialized) with arch/powerpc/configs/52xx/tqm5200_defconfig. >>>>>>> >>>>>>> "modern" kernel ==> which version ? >>>>> >>>>> Hi Christophe, >>>>> >>>>> I was testing with torvalds/master as of yesterday, and bisected >>>>> everything from 4.14 to identify >>>>> the commits related to the issues. For my current project, 6.1.y or 6.6.y >>>>> will likely be our kernel >>>>> of choice, but I'd also like to get mainline to a working state again if >>>>> possible. >>>>> >>>>>>> >>>>>>>>> >>>>>>>>> Issue 1) Boot fails with CONFIG_PPC_KUAP enabled (enabled by default >>>>>>>>> since 9f5bd8f1471d7 >>>>>>>>> "powerpc/32s: Activate KUAP and KUEP by default"). The reason is a >>>>>>>>> number of of_iomap() calls in >>>>>>>>> arch/powerpc/platforms/52xx that should be early_ioremap(). >>>>>>> >>>>>>> Can you give more details and what leads you to that conclusion ? >>>>>>> >>>>>>> There should be no relation between KUAP and of_iomap()/early_ioremap(). >>>>>>> Problem is likely somewhere else. >>>>> >>>>> You are entirely right, the warnings about early_ioremap() were a red >>>>> hering. I can't reproduce any >>>>> difference in boot behavior anymore I thought I was seeing when changing >>>>> the of_iomap() to >>>>> early_ioremap(). I assume I got confused by testing for too many >>>>> variables at once (kernel version + >>>>> 2 Kconfig settings). >>>>> >>>>> >>>>>>> >>>>>>>>> >>>>>>>>> I can fix this up easy enough for mpc5200_simple by changing >>>>>>>>> mpc5200_setup_xlb_arbiter() to use >>>>>>>>> early_ioremap() and moving mpc52xx_map_common_devices() from the >>>>>>>>> setup_arch to the init hook (one >>>>>>>>> side effect is that mpc52xx_restart() only works a bit later, as it >>>>>>>>> requires the mpc52xx_wdt mapping >>>>>>>>> from mpc52xx_map_common_devices(); I'm not sure if there is a better >>>>>>>>> solution). >>>>>>>>> >>>>>>>>> For the other 52xx platforms (efika, lite5200, media5200) things are >>>>>>>>> a bit more chaotic, and they >>>>>>>>> create several more temporary mappings from setup_arch. Either they >>>>>>>>> should all be moved to the init >>>>>>>>> hook as well, or be converted to early_ioremap(), but I can't tell >>>>>>>>> which is more appropriate. As a >>>>>>>>> first step, I would propose a patch that fixes this for the simple >>>>>>>>> platforms and leaves the other >>>>>>>>> ones unchanged. >>>>>>>>> >>>>>>>>> (Side note: I also found that before 16132529cee58 ("powerpc/32s: >>>>>>>>> Rework Kernel Userspace Access >>>>>>>>> Protection"), boot would succeed even with KUAP enabled without >>>>>>>>> changing the incorrect of_iomap(); I >>>>>>>>> guess the old implementation was more lenient about the incorrect >>>>>>>>> calls that the kernel warns >>>>>>>>> about?) >>>>>>> >>>>>>> Interesting. >>>>>>> Again, there shouldn't be any impact of those incorrect calls. They are >>>>>>> correct calls, it is just an historical method that we want to get rid >>>>>>> of on day. >>>>>>> Could you then provide the dmesg of what/how it works here ? And then >>>>>>> I'd also be interested in a dump of /sys/kernel/debug/kernel_page_tables >>>>>>> and /sys/kernel/debug/powerpc/block_address_translation >>>>>>> and /sys/kernel/debug/powerpc/segment_registers >>>>>>> >>>>>>> For that you'll need CONFIG_PTDUMP_DEBUGFS >>>>> >>>>> As it turns out, whatever issue existed with KUAP at the time when it was >>>>> changed to enabled by >>>>> default for 32s (which was in 5.14) has been resolved in current >>>>> mainline. Current torvalds/master >>>>> boots fine with KUAP enabled, and only CONFIG_STRICT_KERNEL_RWX breaks >>>>> the boot. >>>>> >>>>>>> >>>>>>>>> >>>>>>>>> Issue 2) Boot fails with CONFIG_STRICT_KERNEL_RWX enabled, which is >>>>>>>>> also the default nowadays. >>>>>>>>> >>>>>>>>> I have not found the cause of this boot failure yet; is there any way >>>>>>>>> to debug this if the failure >>>>>>>>> happens before the UART is available and I currently don't have JTAG >>>>>>>>> for this hardware? >>>>>>> >>>>>>> Shouldn't happen before UART is available, strict enforcement is >>>>>>> perfomed by mark_readonly() and free_initmem() in the middle of >>>>>>> kernel_init(). UART should be ON long before that. >>>>>>> >>>>>>> So it must be something in the setup that collides with CONFIG_KUAP and >>>>>>> CONFIG_STRICT_KERNEL_RWX. >>>>>>> >>>>>>> Could you send dmesg of when it works (ie without >>>>>>> CONFIG_KUAP/CONFIG_STRICT_KERNEL_RWX) and when it doesn't work if you >>>>>>> get some initial stuff ? >>>>> >>>>> Here's the UART output of a working boot (CONFIG_STRICT_KERNEL_RWX >>>>> disabled; I have slightly >>>>> extended tqm5200.dts to enable UART output of the cuImage wrapper): >>>>> >>> ... >>>>> >>>>> When boot doesn't work, the last messages I see are from the cuImage >>>>> wrapper ("Finalizing device >>>>> tree... flat tree at ...). The panic is expected, there is no >>>>> rootfs/initramfs in my current setup. >>>>> >>> >>> Ok, so let's focus on CONFIG_STRICT_KERNEL_RWX then. >>> >>> The most efficient would be if you were able to activation your UART >>> console earlier and/or implement some PPC_EARLY_DEBUG stuff to see where >>> it fails. >>> >>> In your dmesg output, "Kernel memory protection not selected by kernel >>> config" is when the strict RWX gets activated when selected. Your UART >>> is enabled before that so if there was a problem with some driver >>> writing in a RO area, it would be seen. >>> >>> One thing that came into my mind is that your CPU may have only 4 BATs >>> instead of 8. But I hacked the definition for the e300c2 CPU and my >>> board still boots with only 4 BATs so it is not that. >>> The thing is that to work properly, BATs should at least cover all >>> kernel. But I built your kernel with your .config and GCC 11.3 and I got >>> something that fits within 8M with the RO part stopping at 4M, so you'll >>> have one 4M BAT set RO, then another 4M BAT set RW, one 8M and one 16M >>> BAT. It won't cover your entire 128M memory but shouldn't be a problem, >>> just less performant. > > Hi Christophe, > > this seems indeed have something to do with the issue. mmu_mapin_ram() > contains a > strict_kernel_rwx_enabled() check that explains the early boot failure (and > as this is a runtime > check, I can actually make the kernel boot by passing rodata=off on the > cmdline!). I've added debug > output show me the addresses in mmu_mapin_ram(): base=00000000 top=08000000 > border=00400000. > Modifying mmu_mapin_ram() to always use the !strict_kernel_rwx_enabled() path > makes the kernel boot > until mark_readonly(). > > Removing MMU_FTR_USE_HIGH_BATS from mmu_features or changing find_free_bat() > to only use 4 BATs > regardless of MMU_FTR_USE_HIGH_BATS results in a working kernel, but it is > unclear to me why that > would be necessary, as the MPC5200B manual clearly states that it has 8. >
Maybe just because something is wrong in the way we set-up BATs. Would be great if you could debug print with a bit more details what happens inside __mmu_mapin_ram(), that is idx and base and size for every call to setbat(). Maybe you can boot to a later point by removing the LOAD_BAT() in function load_up_mmu() in arch/powerpc/kernel/head_book3s_32.S if it helps you print stuff. Then maybe you can print and check the contents of the BATS table. Christophe