On Tue, Aug 1, 2023 at 4:16 AM Daniel Henrique Barboza <dbarb...@ventanamicro.com> wrote: > > > > On 7/30/23 22:53, Fei Wu wrote: > > riscv virt platform's memory started at 0x80000000 and > > straddled the 4GiB boundary. Curiously enough, this choice > > of a memory layout will prevent from launching a VM with > > a bit more than 2000MiB and PCIe pass-thru on an x86 host, due > > to identity mapping requirements for the MSI doorbell on x86, > > and these (APIC/IOAPIC) live right below 4GiB. > > > > So just split the RAM range into two portions: > > - 1 GiB range from 0x80000000 to 0xc0000000. > > - The remainder at 0x100000000 > > > > ...leaving a hole between the ranges. > > I am afraid this breaks some existing distro setups, like Ubuntu. After this > patch > this emulation stopped working: > > ~/work/qemu/build/qemu-system-riscv64 \ > -machine virt -nographic -m 8G -smp 8 \ > -kernel ./uboot-ubuntu/usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \ > -drive file=snapshot.img,format=qcow2,if=virtio \ > -netdev bridge,id=bridge1,br=virbr0 -device > virtio-net-pci,netdev=bridge1 > > > This is basically a guest created via the official Canonical tutorial: > > https://wiki.ubuntu.com/RISC-V/QEMU > > The error being thrown: > > ================= > > Boot HART ID : 4 > Boot HART Domain : root > Boot HART Priv Version : v1.12 > Boot HART Base ISA : rv64imafdch > Boot HART ISA Extensions : time,sstc > Boot HART PMP Count : 16 > Boot HART PMP Granularity : 4 > Boot HART PMP Address Bits: 54 > Boot HART MHPM Count : 16 > Boot HART MIDELEG : 0x0000000000001666 > Boot HART MEDELEG : 0x0000000000f0b509 > > > U-Boot 2022.07+dfsg-1ubuntu4.2 (Nov 24 2022 - 18:47:41 +0000) > > CPU: > rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu > Model: riscv-virtio,qemu > DRAM: Unhandled exception: Store/AMO access fault > EPC: 00000000802018b8 RA: 00000000802126a0 TVAL: 00000000ff733f90 > > Code: b823 06b2 bc23 06b2 b023 08b2 b423 08b2 (b823 08b2) > > > resetting ... > System reset not supported on this platform > ### ERROR ### Please RESET the board ### > QEMU 8.0.90 monitor - type 'help' for more infor > =================
Can you try again after setting CONFIG_NR_DRAM_BANKS=2 in qemu-riscv64_smode_defconfig and qemu-riscv64_spl_defconfig ? Regards, Anup > > > Based on the change made I can make an educated guess on what is going wrong. > We have another board with a similar memory topology you're making here, the > Microchip Polarfire (microchip_pfsoc.c). We were having some problems with > this > board while trying to consolidate the boot process between all boards in > hw/riscv/boot.c because of its non-continuous RAM bank. The full story can be > read in the commit message of 4b402886ac89 ("hw/riscv: change > riscv_compute_fdt_addr() > semantics") but the short version can be seen in riscv_compute_fdt_addr() > from boot.c: > > - if ram_start is less than 3072MiB, the FDT will be put at the lowest > value > between 3072 MiB and the end of that RAM bank; > > - if ram_start is higher than 3072 MiB the FDT will be put at the end of the > RAM bank. > > So, after this patch, since riscv_compute_fdt_addr() is being used with the > now > lower RAM bank, the fdt is being put in LOW_MEM - fdt_size for any setup that > has > more than 1Gb RAM, and this breaks assumptions made by uboot and Ubuntu and > possibly > others that are trying to retrieve the FDT from the gap that you created > between > low and hi mem in this patch. > > In fact, this same Ubuntu guest I mentioned above will boot if I put only 1 > Gb of RAM > (-m 1Gb). If I try with -m 1.1Gb I reproduce this error. This can be a > validation of > the guess I'm making here: Ubuntu is trying to fetch stuff (probably the fdt) > from > the gap between the memory areas. > > This change on top of this patch doesn't work either: > > $ git diff > diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c > index 8fbdc7220c..dfff48d849 100644 > --- a/hw/riscv/virt.c > +++ b/hw/riscv/virt.c > @@ -1335,9 +1335,16 @@ static void virt_machine_done(Notifier *notifier, void > *data) > kernel_start_addr, true, NULL); > } > > - fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base, > + if (machine->ram_size < memmap[VIRT_DRAM].size) { > + fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base, > memmap[VIRT_DRAM].size, > machine); > + } else { > + fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM_HIGH].base, > + memmap[VIRT_DRAM_HIGH].size, > + machine); > + } > + > > > This would put the fdt at the end of the HI RAM for guests with more than 1Gb > of RAM. > This change in fact makes the situation even worse, breaking setups that were > working > before with this patch. > > There's a chance that reducing the gap between the RAM banks can make Ubuntu > work > reliably again but now I'm a little cold feet with this change. > > > I think we'll need some kernel/Opensbi folks to weight in here to see if > there's a > failsafe memory setup that won't break distros out there and allow your > passthrough > to work. > > > > Thanks, > > Daniel > > > > > Signed-off-by: Andrei Warkentin <andrei.warken...@intel.com> > > Signed-off-by: Fei Wu <fei2...@intel.com> > > --- > > hw/riscv/virt.c | 74 ++++++++++++++++++++++++++++++++++++----- > > include/hw/riscv/virt.h | 1 + > > 2 files changed, 66 insertions(+), 9 deletions(-) > > > > diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c > > index d90286dc46..8fbdc7220c 100644 > > --- a/hw/riscv/virt.c > > +++ b/hw/riscv/virt.c > > @@ -75,7 +75,9 @@ > > #error "Can't accomodate all IMSIC groups in address space" > > #endif > > > > -static const MemMapEntry virt_memmap[] = { > > +#define LOW_MEM (1 * GiB) > > + > > +static MemMapEntry virt_memmap[] = { > > [VIRT_DEBUG] = { 0x0, 0x100 }, > > [VIRT_MROM] = { 0x1000, 0xf000 }, > > [VIRT_TEST] = { 0x100000, 0x1000 }, > > @@ -96,6 +98,7 @@ static const MemMapEntry virt_memmap[] = { > > [VIRT_PCIE_ECAM] = { 0x30000000, 0x10000000 }, > > [VIRT_PCIE_MMIO] = { 0x40000000, 0x40000000 }, > > [VIRT_DRAM] = { 0x80000000, 0x0 }, > > + [VIRT_DRAM_HIGH] = { 0x100000000, 0x0 }, > > }; > > > > /* PCIe high mmio is fixed for RV32 */ > > @@ -295,15 +298,12 @@ static void create_fdt_socket_cpus(RISCVVirtState *s, > > int socket, > > } > > } > > > > -static void create_fdt_socket_memory(RISCVVirtState *s, > > - const MemMapEntry *memmap, int socket) > > +static void create_fdt_socket_mem_range(RISCVVirtState *s, uint64_t addr, > > + uint64_t size, int socket) > > { > > char *mem_name; > > - uint64_t addr, size; > > MachineState *ms = MACHINE(s); > > > > - addr = memmap[VIRT_DRAM].base + riscv_socket_mem_offset(ms, socket); > > - size = riscv_socket_mem_size(ms, socket); > > mem_name = g_strdup_printf("/memory@%lx", (long)addr); > > qemu_fdt_add_subnode(ms->fdt, mem_name); > > qemu_fdt_setprop_cells(ms->fdt, mem_name, "reg", > > @@ -313,6 +313,34 @@ static void create_fdt_socket_memory(RISCVVirtState *s, > > g_free(mem_name); > > } > > > > +static void create_fdt_socket_memory(RISCVVirtState *s, > > + const MemMapEntry *memmap, int socket) > > +{ > > + uint64_t addr, size; > > + MachineState *mc = MACHINE(s); > > + uint64_t sock_offset = riscv_socket_mem_offset(mc, socket); > > + uint64_t sock_size = riscv_socket_mem_size(mc, socket); > > + > > + if (sock_offset < memmap[VIRT_DRAM].size) { > > + uint64_t low_mem_end = memmap[VIRT_DRAM].base + > > memmap[VIRT_DRAM].size; > > + > > + addr = memmap[VIRT_DRAM].base + sock_offset; > > + size = MIN(low_mem_end - addr, sock_size); > > + create_fdt_socket_mem_range(s, addr, size, socket); > > + > > + size = sock_size - size; > > + if (size > 0) { > > + create_fdt_socket_mem_range(s, memmap[VIRT_DRAM_HIGH].base, > > + size, socket); > > + } > > + } else { > > + addr = memmap[VIRT_DRAM_HIGH].base + > > + sock_offset - memmap[VIRT_DRAM].size; > > + > > + create_fdt_socket_mem_range(s, addr, sock_size, socket); > > + } > > +} > > + > > static void create_fdt_socket_clint(RISCVVirtState *s, > > const MemMapEntry *memmap, int socket, > > uint32_t *intc_phandles) > > @@ -1334,10 +1362,12 @@ static void virt_machine_done(Notifier *notifier, > > void *data) > > > > static void virt_machine_init(MachineState *machine) > > { > > - const MemMapEntry *memmap = virt_memmap; > > + MemMapEntry *memmap = virt_memmap; > > RISCVVirtState *s = RISCV_VIRT_MACHINE(machine); > > MemoryRegion *system_memory = get_system_memory(); > > MemoryRegion *mask_rom = g_new(MemoryRegion, 1); > > + MemoryRegion *ram_below_4g, *ram_above_4g; > > + uint64_t ram_size_low, ram_size_high; > > char *soc_name; > > DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip; > > int i, base_hartid, hart_count; > > @@ -1448,6 +1478,17 @@ static void virt_machine_init(MachineState *machine) > > } > > } > > > > + if (machine->ram_size > LOW_MEM) { > > + ram_size_high = machine->ram_size - LOW_MEM; > > + ram_size_low = LOW_MEM; > > + } else { > > + ram_size_high = 0; > > + ram_size_low = machine->ram_size; > > + } > > + > > + memmap[VIRT_DRAM].size = ram_size_low; > > + memmap[VIRT_DRAM_HIGH].size = ram_size_high; > > + > > if (riscv_is_32bit(&s->soc[0])) { > > #if HOST_LONG_BITS == 64 > > /* limit RAM size in a 32-bit system */ > > @@ -1460,7 +1501,8 @@ static void virt_machine_init(MachineState *machine) > > virt_high_pcie_memmap.size = VIRT32_HIGH_PCIE_MMIO_SIZE; > > } else { > > virt_high_pcie_memmap.size = VIRT64_HIGH_PCIE_MMIO_SIZE; > > - virt_high_pcie_memmap.base = memmap[VIRT_DRAM].base + > > machine->ram_size; > > + virt_high_pcie_memmap.base = memmap[VIRT_DRAM_HIGH].base + > > + memmap[VIRT_DRAM_HIGH].size; > > virt_high_pcie_memmap.base = > > ROUND_UP(virt_high_pcie_memmap.base, > > virt_high_pcie_memmap.size); > > } > > @@ -1468,8 +1510,22 @@ static void virt_machine_init(MachineState *machine) > > s->memmap = virt_memmap; > > > > /* register system main memory (actual RAM) */ > > + ram_below_4g = g_malloc(sizeof(*ram_below_4g)); > > + memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", > > machine->ram, > > + 0, memmap[VIRT_DRAM].size); > > memory_region_add_subregion(system_memory, memmap[VIRT_DRAM].base, > > - machine->ram); > > + ram_below_4g); > > + > > + if (memmap[VIRT_DRAM_HIGH].size) { > > + ram_above_4g = g_malloc(sizeof(*ram_above_4g)); > > + memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", > > + machine->ram, > > + memmap[VIRT_DRAM].size, > > + memmap[VIRT_DRAM_HIGH].size); > > + memory_region_add_subregion(system_memory, > > + memmap[VIRT_DRAM_HIGH].base, > > + ram_above_4g); > > + } > > > > /* boot rom */ > > memory_region_init_rom(mask_rom, NULL, "riscv_virt_board.mrom", > > diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h > > index e5c474b26e..36004eb6ef 100644 > > --- a/include/hw/riscv/virt.h > > +++ b/include/hw/riscv/virt.h > > @@ -79,6 +79,7 @@ enum { > > VIRT_IMSIC_S, > > VIRT_FLASH, > > VIRT_DRAM, > > + VIRT_DRAM_HIGH, > > VIRT_PCIE_MMIO, > > VIRT_PCIE_PIO, > > VIRT_PLATFORM_BUS, >