On Fri, Feb 02, 2024 at 04:33:20PM +0000, Peter Maydell wrote:
> On Fri, 2 Feb 2024 at 16:26, Jonathan Cameron
> <jonathan.came...@huawei.com> wrote:
> > #7  0x0000555555ab1929 in bql_lock_impl (file=0x555556049122 
> > "../../accel/tcg/cputlb.c", line=2033) at ../../system/cpus.c:524
> > #8  bql_lock_impl (file=file@entry=0x555556049122 
> > "../../accel/tcg/cputlb.c", line=line@entry=2033) at ../../system/cpus.c:520
> > #9  0x0000555555c9f7d6 in do_ld_mmio_beN (cpu=0x5555578e0cb0, 
> > full=0x7ffe88012950, ret_be=ret_be@entry=0, addr=19595792376, 
> > size=size@entry=8, mmu_idx=4, type=MMU_DATA_LOAD, ra=0) at 
> > ../../accel/tcg/cputlb.c:2033
> > #10 0x0000555555ca0fbd in do_ld_8 (cpu=cpu@entry=0x5555578e0cb0, 
> > p=p@entry=0x7ffff4efd1d0, mmu_idx=<optimized out>, 
> > type=type@entry=MMU_DATA_LOAD, memop=<optimized out>, ra=ra@entry=0) at 
> > ../../accel/tcg/cputlb.c:2356
> > #11 0x0000555555ca341f in do_ld8_mmu (cpu=cpu@entry=0x5555578e0cb0, 
> > addr=addr@entry=19595792376, oi=oi@entry=52, ra=0, ra@entry=52, 
> > access_type=access_type@entry=MMU_DATA_LOAD) at 
> > ../../accel/tcg/cputlb.c:2439
> > #12 0x0000555555ca5f59 in cpu_ldq_mmu (ra=52, oi=52, addr=19595792376, 
> > env=0x5555578e3470) at ../../accel/tcg/ldst_common.c.inc:169
> > #13 cpu_ldq_le_mmuidx_ra (env=0x5555578e3470, addr=19595792376, 
> > mmu_idx=<optimized out>, ra=ra@entry=0) at 
> > ../../accel/tcg/ldst_common.c.inc:301
> > #14 0x0000555555b4b5fc in ptw_ldq (ra=0, in=0x7ffff4efd320) at 
> > ../../target/i386/tcg/sysemu/excp_helper.c:98
> > #15 ptw_ldq (ra=0, in=0x7ffff4efd320) at 
> > ../../target/i386/tcg/sysemu/excp_helper.c:93
> > #16 mmu_translate (env=env@entry=0x5555578e3470, in=0x7ffff4efd3e0, 
> > out=0x7ffff4efd3b0, err=err@entry=0x7ffff4efd3c0, ra=ra@entry=0) at 
> > ../../target/i386/tcg/sysemu/excp_helper.c:174
> > #17 0x0000555555b4c4b3 in get_physical_address (ra=0, err=0x7ffff4efd3c0, 
> > out=0x7ffff4efd3b0, mmu_idx=0, access_type=MMU_DATA_LOAD, 
> > addr=18446741874686299840, env=0x5555578e3470) at 
> > ../../target/i386/tcg/sysemu/excp_helper.c:580
> > #18 x86_cpu_tlb_fill (cs=0x5555578e0cb0, addr=18446741874686299840, 
> > size=<optimized out>, access_type=MMU_DATA_LOAD, mmu_idx=0, 
> > probe=<optimized out>, retaddr=0) at 
> > ../../target/i386/tcg/sysemu/excp_helper.c:606
> > #19 0x0000555555ca0ee9 in tlb_fill (retaddr=0, mmu_idx=0, 
> > access_type=MMU_DATA_LOAD, size=<optimized out>, addr=18446741874686299840, 
> > cpu=0x7ffff4efd540) at ../../accel/tcg/cputlb.c:1315
> > #20 mmu_lookup1 (cpu=cpu@entry=0x5555578e0cb0, 
> > data=data@entry=0x7ffff4efd540, mmu_idx=0, 
> > access_type=access_type@entry=MMU_DATA_LOAD, ra=ra@entry=0) at 
> > ../../accel/tcg/cputlb.c:1713
> 
> Here we are trying to take an interrupt. This isn't related to the
> other can_do_io stuff, it's happening because do_ld_mmio_beN assumes
> it's called with the BQL not held, but in fact there are some
> situations where we call into the memory subsystem and we do
> already have the BQL.
> 
> -- PMM

It's bugs all the way down as usual!
https://xkcd.com/1416/

I'll dig in a little next week to see if there's an easy fix. We can see
the return address (ra) is already 0 going into mmu_translate, so it does
look unrelated to the patch I threw out - but it probably still has to do
with the page table walk ending up in IO memory (ptw_ldq -> do_ld_mmio_beN).
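
To make the failure mode Peter describes concrete, here is a minimal
standalone sketch of a helper that takes a non-recursive global lock only
when the calling thread does not already hold it. This is a toy model, not
QEMU code and not a proposed patch: big_lock, big_lock_held,
mmio_read_any_context and run_demo are all hypothetical names, and the
per-thread flag is only an assumption about how "do I already hold the
lock?" could be tracked.

```c
/* Toy model, NOT QEMU code: a non-recursive "big lock" plus a per-thread
 * flag recording whether the current thread holds it. All names here are
 * hypothetical and exist only for illustration. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool big_lock_held;

static void big_lock_acquire(void)
{
    /* Taking the lock twice from the same thread is a bug, so trip an
     * assertion instead of deadlocking silently. */
    assert(!big_lock_held);
    pthread_mutex_lock(&big_lock);
    big_lock_held = true;
}

static void big_lock_release(void)
{
    assert(big_lock_held);
    big_lock_held = false;
    pthread_mutex_unlock(&big_lock);
}

/* Stand-in for a device register that must be read under the lock. */
static unsigned long device_reg = 0x1234;

/* Hypothetical MMIO read helper that tolerates both calling contexts:
 * it takes the lock only if this thread does not already hold it, and
 * releases it only if it was the one to take it. */
static unsigned long mmio_read_any_context(void)
{
    bool took_lock = !big_lock_held;
    unsigned long val;

    if (took_lock) {
        big_lock_acquire();
    }
    val = device_reg;
    if (took_lock) {
        big_lock_release();
    }
    return val;
}

/* Returns 0 if the helper works in both contexts and leaves the lock
 * state exactly as it found it, 1 otherwise. */
int run_demo(void)
{
    unsigned long a, b;
    bool still_held;

    /* Context 1: caller does not hold the lock (an ordinary guest load). */
    a = mmio_read_any_context();
    if (big_lock_held) {
        return 1;
    }

    /* Context 2: caller already holds the lock (modelling the interrupt
     * delivery path in the backtrace above). */
    big_lock_acquire();
    b = mmio_read_any_context();
    still_held = big_lock_held;
    big_lock_release();

    return (a == 0x1234 && b == 0x1234 && still_held && !big_lock_held) ? 0 : 1;
}
```

Calling run_demo() exercises both contexts. If mmio_read_any_context()
called big_lock_acquire() unconditionally, the second context would trip
the assert in big_lock_acquire(), which is roughly the shape of what
frames #7-#9 show.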

~Gregory
