On Fri, 28 Feb 2025 at 00:18, Richard Henderson <richard.hender...@linaro.org> wrote: > > On 2/27/25 09:58, Peter Maydell wrote: > > On Thu, 27 Feb 2025 at 17:41, Richard Henderson > > <richard.hender...@linaro.org> wrote: > >> > >> On 2/27/25 06:27, Peter Maydell wrote: > >>> +static void do_ldrd_load(DisasContext *s, TCGv_i32 addr, int rt, int rt2) > >>> +{ > >>> + /* > >>> + * LDRD is required to be an atomic 64-bit access if the > >>> + * address is 8-aligned, two atomic 32-bit accesses if > >>> + * it's only 4-aligned, and to give an alignemnt fault > >>> + * if it's not 4-aligned. > >>> + * Rt is always the word from the lower address, and Rt2 the > >>> + * data from the higher address, regardless of endianness. > >>> + * So (like gen_load_exclusive) we avoid gen_aa32_ld_i64() > >>> + * so we don't get its SCTLR_B check, and instead do a 64-bit access > >>> + * using MO_BE if appropriate and then split the two halves. > >>> + * > >>> + * This also gives us the correct behaviour of not updating > >>> + * rt if the load of rt2 faults; this is required for cases > >>> + * like "ldrd r2, r3, [r2]" where rt is also the base register. > >>> + */ > >>> + int mem_idx = get_mem_index(s); > >>> + MemOp opc = MO_64 | MO_ALIGN_4 | MO_ATOM_SUBALIGN | s->be_data; > >> > >> The 64-bit atomicity begins with armv7 + LPAE, and not present for any > >> m-profile. > >> Worth checking ARM_FEATURE_LPAE, or at least adding to the comment? > >> > >> Getting 2 x 4-byte atomicity, but not require 8-byte atomicity, would use > >> MO_ATOM_IFALIGN_PAIR. > > > > Definitely worth a comment at minimum. Do we generate better > > code for MO_ATOM_IFALIGN_PAIR ? (If not, then providing higher > > atomicity than the architecture mandates seems harmless.) > > We could, but currently do not, generate better code for IFALIGN_PAIR for > MO_64. > Currently the only place we take special care is for MO_128.
OK, in that case I'll just add to the comment: * For M-profile, and for A-profile before LPAE, the 64-bit * atomicity is not required. We could model that using * the looser MO_ATOM_IFALIGN_PAIR, but providing a higher * level of atomicity than required is harmless (we would not * currently generate better code for IFALIGN_PAIR here). > > For the comment in memop.h that currently reads > > * MO_ATOM_SUBALIGN: the operation is single-copy atomic by parts > > * by the alignment. E.g. if the address is 0 mod 4, then each > > * 4-byte subobject is single-copy atomic. > > * This is the atomicity e.g. of IBM Power. > > > > maybe we could expand the e.g: > > > > E.g if an 8-byte value is accessed at an address which is 0 mod 8, > > then the whole 8-byte access is single-copy atomic; otherwise, > > if it is accessed at 0 mod 4 then each 4-byte subobject is > > single-copy atomic; otherwise if it is accessed at 0 mod 2 > > then the four 2-byte subobjects are single-copy atomic. > > > > ? > > Yes, that's correct. > > > I wasn't sure when reading what we currently have whether > > it provided the 8-byte-aligned guarantee, rather than merely > > the 4-byte-aligned one. > > I was trying to highlight the difference between SUBALIGN and IFALIGN, and > perhaps didn't > do adequate job of it. I reviewed the comment patch, and presumably wasn't confused at the time because I'd just read through the corresponding code changes :-) I'll send in a patch adjusting the comment to expand the example. thanks -- PMM