arm: Correct LDRD atomicity and fault behaviour

Peter Maydell Fri, 28 Feb 2025 01:54:12 -0800

On Fri, 28 Feb 2025 at 00:18, Richard Henderson
<richard.hender...@linaro.org> wrote:
>
> On 2/27/25 09:58, Peter Maydell wrote:
> > On Thu, 27 Feb 2025 at 17:41, Richard Henderson
> > <richard.hender...@linaro.org> wrote:
> >>
> >> On 2/27/25 06:27, Peter Maydell wrote:
> >>> +static void do_ldrd_load(DisasContext *s, TCGv_i32 addr, int rt, int rt2)
> >>> +{
> >>> +    /*
> >>> +     * LDRD is required to be an atomic 64-bit access if the
> >>> +     * address is 8-aligned, two atomic 32-bit accesses if
> >>> +     * it's only 4-aligned, and to give an alignemnt fault
> >>> +     * if it's not 4-aligned.
> >>> +     * Rt is always the word from the lower address, and Rt2 the
> >>> +     * data from the higher address, regardless of endianness.
> >>> +     * So (like gen_load_exclusive) we avoid gen_aa32_ld_i64()
> >>> +     * so we don't get its SCTLR_B check, and instead do a 64-bit access
> >>> +     * using MO_BE if appropriate and then split the two halves.
> >>> +     *
> >>> +     * This also gives us the correct behaviour of not updating
> >>> +     * rt if the load of rt2 faults; this is required for cases
> >>> +     * like "ldrd r2, r3, [r2]" where rt is also the base register.
> >>> +     */
> >>> +    int mem_idx = get_mem_index(s);
> >>> +    MemOp opc = MO_64 | MO_ALIGN_4 | MO_ATOM_SUBALIGN | s->be_data;
> >>
> >> The 64-bit atomicity begins with armv7 + LPAE, and not present for any 
> >> m-profile.
> >> Worth checking ARM_FEATURE_LPAE, or at least adding to the comment?
> >>
> >> Getting 2 x 4-byte atomicity, but not require 8-byte atomicity, would use
> >> MO_ATOM_IFALIGN_PAIR.
> >
> > Definitely worth a comment at minimum. Do we generate better
> > code for MO_ATOM_IFALIGN_PAIR ? (If not, then providing higher
> > atomicity than the architecture mandates seems harmless.)
>
> We could, but currently do not, generate better code for IFALIGN_PAIR for 
> MO_64.
> Currently the only place we take special care is for MO_128.


OK, in that case I'll just add to the comment:

   * For M-profile, and for A-profile before LPAE, the 64-bit
   * atomicity is not required. We could model that using
   * the looser MO_ATOM_IFALIGN_PAIR, but providing a higher
   * level of atomicity than required is harmless (we would not
   * currently generate better code for IFALIGN_PAIR here).

> > For the comment in memop.h that currently reads
> >       * MO_ATOM_SUBALIGN: the operation is single-copy atomic by parts
> >       *    by the alignment.  E.g. if the address is 0 mod 4, then each
> >       *    4-byte subobject is single-copy atomic.
> >       *    This is the atomicity e.g. of IBM Power.
> >
> > maybe we could expand the e.g:
> >
> >    E.g if an 8-byte value is accessed at an address which is 0 mod 8,
> >    then the whole 8-byte access is single-copy atomic; otherwise,
> >    if it is accessed at 0 mod 4 then each 4-byte subobject is
> >    single-copy atomic; otherwise if it is accessed at 0 mod 2
> >    then the four 2-byte subobjects are single-copy atomic.
> >
> > ?
>
> Yes, that's correct.
>
> > I wasn't sure when reading what we currently have whether
> > it provided the 8-byte-aligned guarantee, rather than merely
> > the 4-byte-aligned one.
>
> I was trying to highlight the difference between SUBALIGN and IFALIGN, and 
> perhaps didn't
> do adequate job of it.

I reviewed the comment patch, and presumably wasn't confused
at the time because I'd just read through the corresponding
code changes :-)

I'll send in a patch adjusting the comment to expand the example.

thanks
-- PMM

Re: [PATCH 1/3] target/arm: Correct LDRD atomicity and fault behaviour

Reply via email to