On Thu, 5 May 2022 at 03:31, Alexey Kardashevskiy <a...@ozlabs.ru> wrote: > > > > On 5/5/22 05:16, Fabiano Rosas wrote: > > Alexey Kardashevskiy <a...@ozlabs.ru> writes: > > > >> tl;dr: This allows Big Endian zImage booting via -kernel + x-vof=on. > >> > >> QEMU loads the kernel at 0x400000 by default which works most of > >> the time as Linux kernels are relocatable, 64bit and compiled with "-pie" > >> (position independent code). This works for a little endian zImage too. > >> > >> However a big endian zImage is compiled without -pie, is 32bit, linked to > >> 0x4000000 so current QEMU ends up loading it at > >> 0x4400000 but keeps spapr->kernel_addr unchanged so booting fails. > >> > >> This uses the kernel address returned from load_elf(). > >> If the default kernel_addr is used, there is no change in behavior (as > >> translate_kernel_address() takes care of this), which is: > >> LE/BE vmlinux and LE zImage boot, BE zImage does not. > >> If the VM created with "-machine kernel-addr=0,x-vof=on", then QEMU > >> prints a warning and BE zImage boots. > > > > I think we can fix this without needing a different command line for BE > > zImage (apart from x-vof, which is a separate matter). > > > > If you look at translate_kernel_address, it cannot really work when the > > ELF PhysAddr is != 0. We would always hit this sort of 0x4400000 issue, > > so if we fix that function like this... > > > > static uint64_t translate_kernel_address(void *opaque, uint64_t addr) > > { > > SpaprMachineState *spapr = opaque; > > > > return addr ? addr : spapr->kernel_addr; > > } > > > The qemu elf loader is supposed to handle relocations which should be > calling this hook more than once, now I wonder why it is not doing so. > > > > ...then we could always use the ELF PhysAddr if it is different from 0 > > and only use the default load addr if the ELF PhysAddr is 0. If the user > > gives kernel_addr on the cmdline, we honor that, even if puts the kernel > > over the firmware (we have code to detect that). > > > ELF address is 0 for LE zImage only, vmlinux BE/LE uses > 0xc000000000000000. And we are already chopping all these tops bits off > in translate_kernel_address() and I do not really know why _exactly_ it > is 0x0fffffff and not let's say 0x7fffffff. > > > > > >> @@ -2988,6 +2990,12 @@ static void spapr_machine_init(MachineState > >> *machine) > >> exit(1); > >> } > >> > >> + if (spapr->kernel_addr != loaded_addr) { > > > > This could be: > > > > if (spapr->kernel_addr == KERNEL_LOAD_ADDR && > > spapr->kernel_addr != loaded_addr) { > > > > So the precedence would be: > > > > 1- ELF PhysAddr, if != 0. After all, that is what it's for. BE zImage > > falls here; > > > > 2- KERNEL_LOAD_ADDR. Via translate_kernel_address, LE/BE vmlinux fall > > here; > > > > 3- kernel_addr. The user is probably hacking something, just use what > > they gave us. QEMU will yell if they load the kernel over the fw. > > > imho too complicated. > > What if the user runs QEMU with kernel-addr=0x400000? (0x400000 is > KERNEL_LOAD_ADDR noooow but not necessarily forever). Is it 2) or 3)? > > I am basically fixing a bug when we ignore where load_elf() loaded the > ELF and just assume it is KERNEL_LOAD_ADDR. Now the code checks if the > ELF was loaded where we want it to be. Everything else can be done but > on top of this.
It would be good to fix this so we don't need to specify kernel-addr=0. I only recently learnt the pseries machine doesn't support loading the zImage: https://github.com/linuxppc/issues/issues/402 So whatever the fix is, writing down what is expected to work and what isn't would be useful. I tested your patch and it worked with this command line: qemu-system-ppc64 -M pseries,kernel-addr=0,x-vof=on -nographic -kernel arch/powerpc/boot/zImage.pseries -serial mon:stdio -nodefaults Tested-by: Joel Stanley <j...@jms.id.au> Cheers, Joel > > > >> + warn_report("spapr: kernel_addr changed from 0x%lx to 0x%lx", > >> + spapr->kernel_addr, loaded_addr); > >> + spapr->kernel_addr = loaded_addr; > >> + } > >> + > >> /* load initrd */ > >> if (initrd_filename) { > >> /* Try to locate the initrd in the gap between the kernel >