On Fri, Feb 02, 2024 at 04:33:20PM +0000, Peter Maydell wrote:
> On Fri, 2 Feb 2024 at 16:26, Jonathan Cameron
> <jonathan.came...@huawei.com> wrote:
> > #7  0x0000555555ab1929 in bql_lock_impl (file=0x555556049122
> > "../../accel/tcg/cputlb.c", line=2033) at ../../system/cpus.c:524
> > #8  bql_lock_impl (file=file@entry=0x555556049122
> > "../../accel/tcg/cputlb.c", line=line@entry=2033) at ../../system/cpus.c:520
> > #9  0x0000555555c9f7d6 in do_ld_mmio_beN (cpu=0x5555578e0cb0,
> > full=0x7ffe88012950, ret_be=ret_be@entry=0, addr=19595792376,
> > size=size@entry=8, mmu_idx=4, type=MMU_DATA_LOAD, ra=0) at
> > ../../accel/tcg/cputlb.c:2033
> > #10 0x0000555555ca0fbd in do_ld_8 (cpu=cpu@entry=0x5555578e0cb0,
> > p=p@entry=0x7ffff4efd1d0, mmu_idx=<optimized out>,
> > type=type@entry=MMU_DATA_LOAD, memop=<optimized out>, ra=ra@entry=0) at
> > ../../accel/tcg/cputlb.c:2356
> > #11 0x0000555555ca341f in do_ld8_mmu (cpu=cpu@entry=0x5555578e0cb0,
> > addr=addr@entry=19595792376, oi=oi@entry=52, ra=0, ra@entry=52,
> > access_type=access_type@entry=MMU_DATA_LOAD) at
> > ../../accel/tcg/cputlb.c:2439
> > #12 0x0000555555ca5f59 in cpu_ldq_mmu (ra=52, oi=52, addr=19595792376,
> > env=0x5555578e3470) at ../../accel/tcg/ldst_common.c.inc:169
> > #13 cpu_ldq_le_mmuidx_ra (env=0x5555578e3470, addr=19595792376,
> > mmu_idx=<optimized out>, ra=ra@entry=0) at
> > ../../accel/tcg/ldst_common.c.inc:301
> > #14 0x0000555555b4b5fc in ptw_ldq (ra=0, in=0x7ffff4efd320) at
> > ../../target/i386/tcg/sysemu/excp_helper.c:98
> > #15 ptw_ldq (ra=0, in=0x7ffff4efd320) at
> > ../../target/i386/tcg/sysemu/excp_helper.c:93
> > #16 mmu_translate (env=env@entry=0x5555578e3470, in=0x7ffff4efd3e0,
> > out=0x7ffff4efd3b0, err=err@entry=0x7ffff4efd3c0, ra=ra@entry=0) at
> > ../../target/i386/tcg/sysemu/excp_helper.c:174
> > #17 0x0000555555b4c4b3 in get_physical_address (ra=0, err=0x7ffff4efd3c0,
> > out=0x7ffff4efd3b0, mmu_idx=0, access_type=MMU_DATA_LOAD,
> > addr=18446741874686299840, env=0x5555578e3470) at
> > ../../target/i386/tcg/sysemu/excp_helper.c:580
> > #18 x86_cpu_tlb_fill (cs=0x5555578e0cb0, addr=18446741874686299840,
> > size=<optimized out>, access_type=MMU_DATA_LOAD, mmu_idx=0,
> > probe=<optimized out>, retaddr=0) at
> > ../../target/i386/tcg/sysemu/excp_helper.c:606
> > #19 0x0000555555ca0ee9 in tlb_fill (retaddr=0, mmu_idx=0,
> > access_type=MMU_DATA_LOAD, size=<optimized out>, addr=18446741874686299840,
> > cpu=0x7ffff4efd540) at ../../accel/tcg/cputlb.c:1315
> > #20 mmu_lookup1 (cpu=cpu@entry=0x5555578e0cb0,
> > data=data@entry=0x7ffff4efd540, mmu_idx=0,
> > access_type=access_type@entry=MMU_DATA_LOAD, ra=ra@entry=0) at
> > ../../accel/tcg/cputlb.c:1713
>
> Here we are trying to take an interrupt. This isn't related to the
> other can_do_io stuff, it's happening because do_ld_mmio_beN assumes
> it's called with the BQL not held, but in fact there are some
> situations where we call into the memory subsystem and we do
> already have the BQL.
>
> -- PMM

It's bugs all the way down as usual!
https://xkcd.com/1416/

I'll dig in a little next week to see if there's an easy fix. We can see
the return address is already 0 going into mmu_translate, so it does
look unrelated to the patch I threw out - but probably still has to do
with things being on IO.

~Gregory
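
For anyone following along, the failure mode described in the quoted text (a path that takes a non-recursive lock unconditionally while some callers already hold it) can be modelled in a few lines. This is only a toy sketch, not QEMU code and not a proposed fix: every toy_* name is hypothetical, the "lock" is a plain single-threaded flag standing in for the BQL, and whether a conditional acquire is the right answer for do_ld_mmio_beN is exactly what still needs digging into.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, single-threaded stand-in for a non-recursive global lock.
 * All names here are hypothetical; this is NOT QEMU's BQL. */
static bool toy_lock_held;
static int toy_lock_acquisitions;

static bool toy_locked(void)
{
    return toy_lock_held;
}

static void toy_lock(void)
{
    /* Taking a non-recursive lock twice is the bug being modelled:
     * a real mutex would deadlock or abort at this point. */
    assert(!toy_lock_held);
    toy_lock_held = true;
    toy_lock_acquisitions++;
}

static void toy_unlock(void)
{
    assert(toy_lock_held);
    toy_lock_held = false;
}

/* Hypothetical device access that must run with the lock held. */
static unsigned toy_device_read(void)
{
    assert(toy_locked());
    return 0x2a;
}

/* MMIO-style load that copes with both kinds of caller: it takes the
 * lock only when the caller does not already hold it, and releases
 * only what it took. */
static unsigned toy_mmio_load(void)
{
    bool need_lock = !toy_locked();
    unsigned val;

    if (need_lock) {
        toy_lock();
    }
    val = toy_device_read();
    if (need_lock) {
        toy_unlock();
    }
    return val;
}
```

Calling toy_mmio_load() with the lock free takes and drops it; calling it between toy_lock() and toy_unlock() leaves the caller's lock state untouched. Replacing the conditional with a bare toy_lock() trips the assertion in the second case, which is the analogue of what frames #7 to #9 show.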