On 1/29/23 22:41, LIU Zhiwei wrote:
On 2023/1/30 13:43, Richard Henderson wrote:
On 1/29/23 16:03, LIU Zhiwei wrote:
Thanks. It's a bug. We should load all memory addresses to local TCG temps
first.
Do you think we should probe all the memory addresses for the store pair instructions?
If so, can we avoid the use of a helper function?
Depends on what the hardware does. Even with a trap in the middle the stores are
restartable, since no register state changes.
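As a rough illustration of why a store pair can simply be re-executed, here is a toy C model. It is not QEMU code: StoreCPU, st_store, stp_pair and fault_index are names invented for this sketch, memory is a small word array, and other observers of memory are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ST_NREGS 4
#define ST_MEM_WORDS 8

/* Hypothetical toy machine: a few registers and word-addressed memory. */
typedef struct {
    uint64_t reg[ST_NREGS];
    uint64_t mem[ST_MEM_WORDS];
    int64_t fault_index;          /* word index that traps, or -1 for none */
} StoreCPU;

/* Store one word; returns false to model a trap. */
static bool st_store(StoreCPU *cpu, uint64_t index, uint64_t val)
{
    if (index >= ST_MEM_WORDS || (int64_t)index == cpu->fault_index) {
        return false;
    }
    cpu->mem[index] = val;
    return true;
}

/* Store pair: two plain stores, no register is ever written. */
static bool stp_pair(StoreCPU *cpu, int rs1, int rs2, int rbase)
{
    assert(rs1 >= 0 && rs1 < ST_NREGS);
    assert(rs2 >= 0 && rs2 < ST_NREGS);
    assert(rbase >= 0 && rbase < ST_NREGS);

    uint64_t base = cpu->reg[rbase];

    if (!st_store(cpu, base, cpu->reg[rs1])) {
        return false;
    }
    /* A trap here leaves the first store done, but all inputs intact. */
    if (!st_store(cpu, base + 1, cpu->reg[rs2])) {
        return false;
    }
    return true;
}
```

If the second store traps and the instruction is re-executed, the first store is repeated with the same address and data, so the final memory contents are the same as for an untrapped execution.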
I refer to the specification of LDP and STP on AARCH64. The specification allows
"any access performed before the exception was taken is repeated".
In detail,
"If, according to these rules, an instruction is executed as a sequence of
accesses, exceptions, including interrupts, can be taken during that sequence,
regardless of the memory type being accessed. If any of these exceptions are
returned from using their preferred return address, the instruction that
generated the sequence of accesses is re-executed, and so any access performed
before the exception was taken is repeated. See also Taking an interrupt
during a multi-access load or store on page D1-4664."
However, I see that QEMU implements LDP and STP differently. LDP writes the first
register only once it is sure that the second access will not trap.
So I have two questions here.
1) One about the QEMU implementation of LDP. Can we implement LDP as two loads
directly into the cpu registers instead of into local TCG temps?
For the Thead specification, where rd1 != rs1 (and you enforce it), then yes, I suppose
you could load directly to the cpu registers, because on restart rs1 would be unmodified.
For AArch64, which you quote above, there is no constraint that the destinations do not
overlap the address register, so we must implement "LDP r0, r1, [r0]" as a load into temps.
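A toy C model of the difference, in case it helps. This is not QEMU or TCG code: ToyCPU, toy_load, ldp_direct, ldp_temps and fault_index are all made up for this sketch, and memory is just a small word array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NREGS 4
#define MEM_WORDS 8

/* Hypothetical toy machine: a few registers and word-addressed memory. */
typedef struct {
    uint64_t reg[NREGS];
    uint64_t mem[MEM_WORDS];
    int64_t fault_index;          /* word index that traps, or -1 for none */
} ToyCPU;

/* Load one word; returns false to model a trap. */
static bool toy_load(ToyCPU *cpu, uint64_t index, uint64_t *out)
{
    if (index >= MEM_WORDS || (int64_t)index == cpu->fault_index) {
        return false;
    }
    *out = cpu->mem[index];
    return true;
}

/* Writes rd1 before the second access: only safe when rd1 != rs. */
static bool ldp_direct(ToyCPU *cpu, int rd1, int rd2, int rs)
{
    assert(rd1 >= 0 && rd1 < NREGS && rd2 >= 0 && rd2 < NREGS);
    assert(rs >= 0 && rs < NREGS);

    uint64_t base = cpu->reg[rs];
    uint64_t v;

    if (!toy_load(cpu, base, &v)) {
        return false;
    }
    cpu->reg[rd1] = v;            /* clobbers the base if rd1 == rs */
    if (!toy_load(cpu, base + 1, &v)) {
        return false;             /* trap with rd1 already modified */
    }
    cpu->reg[rd2] = v;
    return true;
}

/* Loads into temporaries and commits only after both accesses succeed. */
static bool ldp_temps(ToyCPU *cpu, int rd1, int rd2, int rs)
{
    assert(rd1 >= 0 && rd1 < NREGS && rd2 >= 0 && rd2 < NREGS);
    assert(rs >= 0 && rs < NREGS);

    uint64_t base = cpu->reg[rs];
    uint64_t t1, t2;

    if (!toy_load(cpu, base, &t1) || !toy_load(cpu, base + 1, &t2)) {
        return false;             /* trap with no register state changed */
    }
    cpu->reg[rd1] = t1;
    cpu->reg[rd2] = t2;
    return true;
}
```

With rd1 == rs and a trap on the second access, ldp_direct leaves the base register overwritten, so re-executing the instruction would use the wrong address. ldp_temps leaves every register untouched on a trap and can be retried as is.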
2) One about the comment. Why do register state changes make the instruction
non-restartable? Do you mean that if the first register changes, it may affect the
calculation of the address after the trap?
Yes, that's what I mean about non-restartable -- if any of the input registers are changed
before the trap is recognized.
Yes. Consider what happens when the insn is encoded with .long. Does the hardware trap
as an illegal instruction? Is the behavior simply unspecified? The manual could be
improved to specify, akin to the Arm terms: UNDEFINED, CONSTRAINED UNPREDICTABLE,
IMPLEMENTATION DEFINED, etc.
Thanks, I will fix the manual.
Excellent, thanks.
r~