On 9/7/2023 11:46 PM, Anup Patel wrote: > On Tue, Aug 1, 2023 at 4:16 AM Daniel Henrique Barboza > <dbarb...@ventanamicro.com> wrote: >> >> >> >> On 7/30/23 22:53, Fei Wu wrote: >>> riscv virt platform's memory started at 0x80000000 and >>> straddled the 4GiB boundary. Curiously enough, this choice >>> of a memory layout will prevent from launching a VM with >>> a bit more than 2000MiB and PCIe pass-thru on an x86 host, due >>> to identity mapping requirements for the MSI doorbell on x86, >>> and these (APIC/IOAPIC) live right below 4GiB. >>> >>> So just split the RAM range into two portions: >>> - 1 GiB range from 0x80000000 to 0xc0000000. >>> - The remainder at 0x100000000 >>> >>> ...leaving a hole between the ranges. >> >> I am afraid this breaks some existing distro setups, like Ubuntu. After this >> patch >> this emulation stopped working: >> >> ~/work/qemu/build/qemu-system-riscv64 \ >> -machine virt -nographic -m 8G -smp 8 \ >> -kernel ./uboot-ubuntu/usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \ >> -drive file=snapshot.img,format=qcow2,if=virtio \ >> -netdev bridge,id=bridge1,br=virbr0 -device >> virtio-net-pci,netdev=bridge1 >> >> >> This is basically a guest created via the official Canonical tutorial: >> >> https://wiki.ubuntu.com/RISC-V/QEMU >> >> The error being thrown: >> >> ================= >> >> Boot HART ID : 4 >> Boot HART Domain : root >> Boot HART Priv Version : v1.12 >> Boot HART Base ISA : rv64imafdch >> Boot HART ISA Extensions : time,sstc >> Boot HART PMP Count : 16 >> Boot HART PMP Granularity : 4 >> Boot HART PMP Address Bits: 54 >> Boot HART MHPM Count : 16 >> Boot HART MIDELEG : 0x0000000000001666 >> Boot HART MEDELEG : 0x0000000000f0b509 >> >> >> U-Boot 2022.07+dfsg-1ubuntu4.2 (Nov 24 2022 - 18:47:41 +0000) >> >> CPU: >> rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu >> Model: riscv-virtio,qemu >> DRAM: Unhandled exception: Store/AMO access fault >> EPC: 00000000802018b8 RA: 00000000802126a0 TVAL: 00000000ff733f90 >> >> Code: b823 06b2 bc23 06b2 b023 08b2 b423 08b2 (b823 08b2) >> >> >> resetting ... >> System reset not supported on this platform >> ### ERROR ### Please RESET the board ### >> QEMU 8.0.90 monitor - type 'help' for more infor >> ================= > > Can you try again after setting CONFIG_NR_DRAM_BANKS=2 in > qemu-riscv64_smode_defconfig and qemu-riscv64_spl_defconfig ? > Yes, I made a u-boot patch to change this setting and also use fdtdec_setup_mem_size_base_lowest() instead fdtdec_setup_mem_size_base() in dram_init(), the latter is also necessary. The patch has been posted to u-boot mailing list but got no reply yet: https://lists.denx.de/pipermail/u-boot/2023-September/529729.html
Thanks, Fei. > Regards, > Anup > >> >> >> Based on the change made I can make an educated guess on what is going wrong. >> We have another board with a similar memory topology you're making here, the >> Microchip Polarfire (microchip_pfsoc.c). We were having some problems with >> this >> board while trying to consolidate the boot process between all boards in >> hw/riscv/boot.c because of its non-continuous RAM bank. The full story can be >> read in the commit message of 4b402886ac89 ("hw/riscv: change >> riscv_compute_fdt_addr() >> semantics") but the short version can be seen in riscv_compute_fdt_addr() >> from boot.c: >> >> - if ram_start is less than 3072MiB, the FDT will be put at the lowest >> value >> between 3072 MiB and the end of that RAM bank; >> >> - if ram_start is higher than 3072 MiB the FDT will be put at the end of the >> RAM bank. >> >> So, after this patch, since riscv_compute_fdt_addr() is being used with the >> now >> lower RAM bank, the fdt is being put in LOW_MEM - fdt_size for any setup >> that has >> more than 1Gb RAM, and this breaks assumptions made by uboot and Ubuntu and >> possibly >> others that are trying to retrieve the FDT from the gap that you created >> between >> low and hi mem in this patch. >> >> In fact, this same Ubuntu guest I mentioned above will boot if I put only 1 >> Gb of RAM >> (-m 1Gb). If I try with -m 1.1Gb I reproduce this error. This can be a >> validation of >> the guess I'm making here: Ubuntu is trying to fetch stuff (probably the >> fdt) from >> the gap between the memory areas. >> >> This change on top of this patch doesn't work either: >> >> $ git diff >> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c >> index 8fbdc7220c..dfff48d849 100644 >> --- a/hw/riscv/virt.c >> +++ b/hw/riscv/virt.c >> @@ -1335,9 +1335,16 @@ static void virt_machine_done(Notifier *notifier, >> void *data) >> kernel_start_addr, true, NULL); >> } >> >> - fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base, >> + if (machine->ram_size < memmap[VIRT_DRAM].size) { >> + fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base, >> memmap[VIRT_DRAM].size, >> machine); >> + } else { >> + fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM_HIGH].base, >> + memmap[VIRT_DRAM_HIGH].size, >> + machine); >> + } >> + >> >> >> This would put the fdt at the end of the HI RAM for guests with more than >> 1Gb of RAM. >> This change in fact makes the situation even worse, breaking setups that >> were working >> before with this patch. >> >> There's a chance that reducing the gap between the RAM banks can make Ubuntu >> work >> reliably again but now I'm a little cold feet with this change. >> >> >> I think we'll need some kernel/Opensbi folks to weight in here to see if >> there's a >> failsafe memory setup that won't break distros out there and allow your >> passthrough >> to work. >> >> >> >> Thanks, >> >> Daniel >> >>> >>> Signed-off-by: Andrei Warkentin <andrei.warken...@intel.com> >>> Signed-off-by: Fei Wu <fei2...@intel.com> >>> --- >>> hw/riscv/virt.c | 74 ++++++++++++++++++++++++++++++++++++----- >>> include/hw/riscv/virt.h | 1 + >>> 2 files changed, 66 insertions(+), 9 deletions(-) >>> >>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c >>> index d90286dc46..8fbdc7220c 100644 >>> --- a/hw/riscv/virt.c >>> +++ b/hw/riscv/virt.c >>> @@ -75,7 +75,9 @@ >>> #error "Can't accomodate all IMSIC groups in address space" >>> #endif >>> >>> -static const MemMapEntry virt_memmap[] = { >>> +#define LOW_MEM (1 * GiB) >>> + >>> +static MemMapEntry virt_memmap[] = { >>> [VIRT_DEBUG] = { 0x0, 0x100 }, >>> [VIRT_MROM] = { 0x1000, 0xf000 }, >>> [VIRT_TEST] = { 0x100000, 0x1000 }, >>> @@ -96,6 +98,7 @@ static const MemMapEntry virt_memmap[] = { >>> [VIRT_PCIE_ECAM] = { 0x30000000, 0x10000000 }, >>> [VIRT_PCIE_MMIO] = { 0x40000000, 0x40000000 }, >>> [VIRT_DRAM] = { 0x80000000, 0x0 }, >>> + [VIRT_DRAM_HIGH] = { 0x100000000, 0x0 }, >>> }; >>> >>> /* PCIe high mmio is fixed for RV32 */ >>> @@ -295,15 +298,12 @@ static void create_fdt_socket_cpus(RISCVVirtState *s, >>> int socket, >>> } >>> } >>> >>> -static void create_fdt_socket_memory(RISCVVirtState *s, >>> - const MemMapEntry *memmap, int socket) >>> +static void create_fdt_socket_mem_range(RISCVVirtState *s, uint64_t addr, >>> + uint64_t size, int socket) >>> { >>> char *mem_name; >>> - uint64_t addr, size; >>> MachineState *ms = MACHINE(s); >>> >>> - addr = memmap[VIRT_DRAM].base + riscv_socket_mem_offset(ms, socket); >>> - size = riscv_socket_mem_size(ms, socket); >>> mem_name = g_strdup_printf("/memory@%lx", (long)addr); >>> qemu_fdt_add_subnode(ms->fdt, mem_name); >>> qemu_fdt_setprop_cells(ms->fdt, mem_name, "reg", >>> @@ -313,6 +313,34 @@ static void create_fdt_socket_memory(RISCVVirtState *s, >>> g_free(mem_name); >>> } >>> >>> +static void create_fdt_socket_memory(RISCVVirtState *s, >>> + const MemMapEntry *memmap, int socket) >>> +{ >>> + uint64_t addr, size; >>> + MachineState *mc = MACHINE(s); >>> + uint64_t sock_offset = riscv_socket_mem_offset(mc, socket); >>> + uint64_t sock_size = riscv_socket_mem_size(mc, socket); >>> + >>> + if (sock_offset < memmap[VIRT_DRAM].size) { >>> + uint64_t low_mem_end = memmap[VIRT_DRAM].base + >>> memmap[VIRT_DRAM].size; >>> + >>> + addr = memmap[VIRT_DRAM].base + sock_offset; >>> + size = MIN(low_mem_end - addr, sock_size); >>> + create_fdt_socket_mem_range(s, addr, size, socket); >>> + >>> + size = sock_size - size; >>> + if (size > 0) { >>> + create_fdt_socket_mem_range(s, memmap[VIRT_DRAM_HIGH].base, >>> + size, socket); >>> + } >>> + } else { >>> + addr = memmap[VIRT_DRAM_HIGH].base + >>> + sock_offset - memmap[VIRT_DRAM].size; >>> + >>> + create_fdt_socket_mem_range(s, addr, sock_size, socket); >>> + } >>> +} >>> + >>> static void create_fdt_socket_clint(RISCVVirtState *s, >>> const MemMapEntry *memmap, int socket, >>> uint32_t *intc_phandles) >>> @@ -1334,10 +1362,12 @@ static void virt_machine_done(Notifier *notifier, >>> void *data) >>> >>> static void virt_machine_init(MachineState *machine) >>> { >>> - const MemMapEntry *memmap = virt_memmap; >>> + MemMapEntry *memmap = virt_memmap; >>> RISCVVirtState *s = RISCV_VIRT_MACHINE(machine); >>> MemoryRegion *system_memory = get_system_memory(); >>> MemoryRegion *mask_rom = g_new(MemoryRegion, 1); >>> + MemoryRegion *ram_below_4g, *ram_above_4g; >>> + uint64_t ram_size_low, ram_size_high; >>> char *soc_name; >>> DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip; >>> int i, base_hartid, hart_count; >>> @@ -1448,6 +1478,17 @@ static void virt_machine_init(MachineState *machine) >>> } >>> } >>> >>> + if (machine->ram_size > LOW_MEM) { >>> + ram_size_high = machine->ram_size - LOW_MEM; >>> + ram_size_low = LOW_MEM; >>> + } else { >>> + ram_size_high = 0; >>> + ram_size_low = machine->ram_size; >>> + } >>> + >>> + memmap[VIRT_DRAM].size = ram_size_low; >>> + memmap[VIRT_DRAM_HIGH].size = ram_size_high; >>> + >>> if (riscv_is_32bit(&s->soc[0])) { >>> #if HOST_LONG_BITS == 64 >>> /* limit RAM size in a 32-bit system */ >>> @@ -1460,7 +1501,8 @@ static void virt_machine_init(MachineState *machine) >>> virt_high_pcie_memmap.size = VIRT32_HIGH_PCIE_MMIO_SIZE; >>> } else { >>> virt_high_pcie_memmap.size = VIRT64_HIGH_PCIE_MMIO_SIZE; >>> - virt_high_pcie_memmap.base = memmap[VIRT_DRAM].base + >>> machine->ram_size; >>> + virt_high_pcie_memmap.base = memmap[VIRT_DRAM_HIGH].base + >>> + memmap[VIRT_DRAM_HIGH].size; >>> virt_high_pcie_memmap.base = >>> ROUND_UP(virt_high_pcie_memmap.base, >>> virt_high_pcie_memmap.size); >>> } >>> @@ -1468,8 +1510,22 @@ static void virt_machine_init(MachineState *machine) >>> s->memmap = virt_memmap; >>> >>> /* register system main memory (actual RAM) */ >>> + ram_below_4g = g_malloc(sizeof(*ram_below_4g)); >>> + memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", >>> machine->ram, >>> + 0, memmap[VIRT_DRAM].size); >>> memory_region_add_subregion(system_memory, memmap[VIRT_DRAM].base, >>> - machine->ram); >>> + ram_below_4g); >>> + >>> + if (memmap[VIRT_DRAM_HIGH].size) { >>> + ram_above_4g = g_malloc(sizeof(*ram_above_4g)); >>> + memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", >>> + machine->ram, >>> + memmap[VIRT_DRAM].size, >>> + memmap[VIRT_DRAM_HIGH].size); >>> + memory_region_add_subregion(system_memory, >>> + memmap[VIRT_DRAM_HIGH].base, >>> + ram_above_4g); >>> + } >>> >>> /* boot rom */ >>> memory_region_init_rom(mask_rom, NULL, "riscv_virt_board.mrom", >>> diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h >>> index e5c474b26e..36004eb6ef 100644 >>> --- a/include/hw/riscv/virt.h >>> +++ b/include/hw/riscv/virt.h >>> @@ -79,6 +79,7 @@ enum { >>> VIRT_IMSIC_S, >>> VIRT_FLASH, >>> VIRT_DRAM, >>> + VIRT_DRAM_HIGH, >>> VIRT_PCIE_MMIO, >>> VIRT_PCIE_PIO, >>> VIRT_PLATFORM_BUS, >>