Aaron Lindsay <aa...@os.amperecomputing.com> writes:

> On Oct 20 18:54, Alex Bennée wrote:
>> Have you got a test case you are using so I can try and replicate the
>> failure you are seeing? So far by inspection everything looks OK to me.
>
> I took some time today to put together a minimal(ish) reproducer using
> usermode. The source files used are below, I compiled the test binary on an
> AArch64 system using:
>
>   $ gcc -g -o stxp stxp.s stxp.c
>
> Then built the plugin from stxp_plugin.cc, and ran it all like:
>
>   qemu-aarch64 \
>     -cpu cortex-a57 \
>     -D stxp_plugin.log \
>     -d plugin \
>     -plugin 'stxp_plugin.so' \
>     ./stxp
>
> I observe that, for me, the objdump of stxp contains:
>
>   000000000040070c <loop>:
>     40070c: f9800011 prfm pstl1strm, [x0]
>     400710: c87f4410 ldxp x16, x17, [x0]
>     400714: c8300c02 stxp w16, x2, x3, [x0]
>     400718: f1000652 subs x18, x18, #0x1
>     40071c: 54000040 b.eq 400724 <done> // b.none
>     400720: 17fffffb b 40070c <loop>
>
> But the output in stxp_plugin.log looks something like:
>
>   Executing PC: 0x40070c
>   Executing PC: 0x400710
>   PC 0x400710 accessed memory at 0x550080ec70
>   PC 0x400710 accessed memory at 0x550080ec78
>   Executing PC: 0x400714
>   Executing PC: 0x400718
>   Executing PC: 0x40071c
>   Executing PC: 0x400720
>
> From this, I believe the ldxp instruction at PC 0x400710 is reporting two
> memory accesses but the stxp instruction at 0x400714 is not.

This is fascinating but I can't replicate your results.
I get the following pattern:

  Executing PC: 0x400910
  Executing PC: 0x400914
  PC 0x400914 accessed memory at 0x55007fffd0
  PC 0x400914 accessed memory at 0x55007fffd8
  Executing PC: 0x400918
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  Executing PC: 0x40091c
  Executing PC: 0x400920
  Executing PC: 0x400924
  Executing PC: 0x400910
  Executing PC: 0x400914
  PC 0x400914 accessed memory at 0x55007fffd0
  PC 0x400914 accessed memory at 0x55007fffd8
  Executing PC: 0x400918
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  Executing PC: 0x40091c
  Executing PC: 0x400920
  Executing PC: 0x400924
  Executing PC: 0x400910
  Executing PC: 0x400914
  PC 0x400914 accessed memory at 0x55007fffd0
  PC 0x400914 accessed memory at 0x55007fffd8
  Executing PC: 0x400918
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  Executing PC: 0x40091c
  Executing PC: 0x400920
  Executing PC: 0x400924
  Executing PC: 0x400910
  Executing PC: 0x400914
  PC 0x400914 accessed memory at 0x55007fffd0
  PC 0x400914 accessed memory at 0x55007fffd8
  Executing PC: 0x400918
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8
  PC 0x400918 accessed memory at 0x55007fffd0
  PC 0x400918 accessed memory at 0x55007fffd8

It's a bit clearer if you use the contrib/execlog plugin:

  ./qemu-aarch64 -plugin contrib/plugins/libexeclog.so -d plugin ./tests/tcg/aarch64-linux-user/stxp

  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"
  0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]
  0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", load, 0x55007fffd0, load, 0x55007fffd8
  0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", load, 0x55007fffd0, load, 0x55007fffd8, store, 0x55007fffd0, store, 0x55007fffd8
  0, 0x40091c, 0xf1000652, "subs x18, x18, #1"
  0, 0x400920, 0x54000040, "b.eq #0x400928"
  0, 0x400924, 0x17fffffb, "b #0x400910"

Although you can see stxp looks a bit weird on account of the loads it
does during the cmpxchg.

So consider me stumped. The only thing I can think of next is to see how
closely I can replicate your build environment.

--
Alex Bennée
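
P.S. If it helps when comparing traces between our two environments, here
is a rough sketch of a script that tallies executions, loads and stores
per PC from execlog-style output. It is not part of QEMU; the names
parse_line and summarize are made up for illustration, and the line format
is only assumed from the output pasted above (including the prfm lines,
which appear here without a closing quote).

```python
import re
import sys
from collections import Counter

# Assumed line shape, taken from the trace above:
#   cpu, pc, opcode, "disassembly"[, load|store, address]...
HEADER = re.compile(r'^(\d+), (0x[0-9a-fA-F]+), (0x[0-9a-fA-F]+), "')
ACCESS = re.compile(r', (load|store), (0x[0-9a-fA-F]+)')


def parse_line(line):
    """Parse one execlog-style line; return None if it does not match."""
    line = line.strip()
    m = HEADER.match(line)
    if m is None:
        return None
    rest = line[m.end():]
    accesses = [(kind, int(addr, 16)) for kind, addr in ACCESS.findall(rest)]
    first = ACCESS.search(rest)
    disas = rest[:first.start()] if first else rest
    return {
        "cpu": int(m.group(1)),
        "pc": int(m.group(2), 16),
        "opcode": int(m.group(3), 16),
        "disas": disas.rstrip('"'),  # tolerate a missing closing quote
        "accesses": accesses,
    }


def summarize(lines):
    """Map each PC to a Counter of 'exec', 'load' and 'store' events."""
    counts = {}
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        per_pc = counts.setdefault(rec["pc"], Counter())
        per_pc["exec"] += 1
        for kind, _addr in rec["accesses"]:
            per_pc[kind] += 1
    return counts


# Three lines copied from the trace above, used as a small self-check.
SAMPLE = [
    '0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0]',
    '0, 0x400914, 0xc87f4410, "ldxp x16, x17, [x0]", '
    'load, 0x55007fffd0, load, 0x55007fffd8',
    '0, 0x400918, 0xc8300c02, "stxp w16, x2, x3, [x0]", '
    'load, 0x55007fffd0, load, 0x55007fffd8, '
    'store, 0x55007fffd0, store, 0x55007fffd8',
]

if __name__ == "__main__":
    # Read a log from stdin if one is piped in, else use the sample.
    lines = SAMPLE if sys.stdin.isatty() else sys.stdin
    for pc, c in sorted(summarize(lines).items()):
        print(f"{pc:#x}: exec={c['exec']} load={c['load']} store={c['store']}")
```

A stxp PC that shows executions but zero stores would match the behaviour
Aaron describes; two loads and two stores per execution matches what I see.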