Hi Michael & Paolo, On Fri, Apr 05, 2024 at 08:30:43PM +0300, Michael Tokarev wrote: > Date: Fri, 5 Apr 2024 20:30:43 +0300 > From: Michael Tokarev <m...@tls.msk.ru> > Subject: Re: [PATCH] target/i386: fix direction of "32-bit MMU" test > > 01.04.2024 09:02, Michael Tokarev: > > > Anyone can guess why this rather trivial and obviously correct patch causes > > segfaults > > in a few tests in staging-7.2 - when run in tcg mode, namely: > > > > pxe-test > > migration-test > > boot-serial-test > > bios-tables-test > > vmgenid-test > > cdrom-test > > > > When reverting this single commit from staging-7.2, it all works fine again. > > It sigsegvs in probe_access_internal(): > > CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr); -- this one returns > NULL, > > and next there's a call > > tlb_addr = tlb_read_ofs(entry, elt_ofs); > > which fails. > > #0 0x0000555555c5de8a in tlb_read_ofs (ofs=8, entry=0x0) at > 7.2/accel/tcg/cputlb.c:1455 > #1 probe_access_internal > (env=0x555556a862a0, addr=4294967280, fault_size=fault_size@entry=1, > access_type=access_type@entry=MMU_INST_FETCH, mmu_idx=5, > nonfault=nonfault@entry=false, phost=0x7fffea4d32a0, pfull=0x7fffea4d3298, > retaddr=0) > at 7.2/accel/tcg/cputlb.c:1555 > #2 0x0000555555c62aba in get_page_addr_code_hostp > (env=<optimized out>, addr=addr@entry=4294967280, hostp=hostp@entry=0x0) > at 7.2/accel/tcg/cputlb.c:1691 > #3 0x0000555555c52b54 in get_page_addr_code (addr=4294967280, env=<optimized > out>) > at 7.2/include/exec/exec-all.h:714 > #4 tb_htable_lookup > (cpu=cpu@entry=0x555556a85530, pc=pc@entry=4294967280, > cs_base=cs_base@entry=4294901760, flags=flags@entry=64, > cflags=cflags@entry=4278190080) at 7.2/accel/tcg/cpu-exec.c:236 > #5 0x0000555555c53e8e in tb_lookup > (cflags=4278190080, flags=64, cs_base=4294901760, pc=4294967280, > cpu=0x555556a85530) > at 7.2/accel/tcg/cpu-exec.c:270 > #6 cpu_exec (cpu=cpu@entry=0x555556a85530) at 7.2/accel/tcg/cpu-exec.c:1001 > #7 0x0000555555c75d2f in tcg_cpus_exec (cpu=cpu@entry=0x555556a85530) > at 7.2/accel/tcg/tcg-accel-ops.c:69 > #8 0x0000555555c75e80 in mttcg_cpu_thread_fn (arg=arg@entry=0x555556a85530) > at 7.2/accel/tcg/tcg-accel-ops-mttcg.c:95 > #9 0x0000555555ded098 in qemu_thread_start (args=0x555556adac40) > at 7.2/util/qemu-thread-posix.c:505 > #10 0x00007ffff5793134 in start_thread (arg=<optimized out>) > #11 0x00007ffff58137dc in clone3 () >
I debugged it manually, and found the problem occurs in tlb_index() with mmu_idx=5. For v7.2, the maximum mmu index supported by i386 is 4 (since NB_MMU_MODES = 5 defined in target/i386/cpu-param.h). On Michael's 7.2-i386-mmu-idx tree, the commit 9fc3a7828d25 ("target/i386: use separate MMU indexes for 32-bit accesses") introduced more indexes without relaxing the NB_MMU_MODES for i386. Before this fix, probe_access_internal() just got the wrong mmu_idx as 4, and it's not out of bounds. After this fix, the right mmu_idx=5 is truly out of bounds. On the master branch, there's no such issue since the commits ffd824f3f32d ("include/exec: Set default NB_MMU_MODES to 16") and 6787318a5d86 ("target/i386: Remove NB_MMU_MODES define") relaxed upper limit of MMU index for i386. So these 2 commits should also be picked (with these 2 commits, tests in "make check" passed). However, the cleanup for NB_MMU_MODES is a big series (total 23 patches, from the commit ffd824f3f32d ("include/exec: Set default NB_MMU_MODES to 16") to 00da6b49a227 ("include/exec: Remove guards around NB_MMU_MODES")), maybe changes in other arches should be picked together? In addition, I think maybe we could add assert() in tlb_index() and tlb_entry() to ensure the mmu_idx not exceed the limit of NB_MMU_MODES. But I'm a little unsure if this will hurt performance, because these 2 helpers look like the hotspot functions. -Zhao