> Yeah, was just going to reply to your earlier message saying
> something similar. (cond_exec ...) might be a stretch for something
> like a predicated addition, since on SVE the set is unconditional:
> it's always a full register write regardless of the predicate.
> Having the predication on the rhs seems more accurate there.
>
> But we lack a good way of representing predicated stores. Currently
> SVE uses a read-modify-write of memory, but of course that isn't
> accurate, since rmw would fault on unmapped addresses. A per-lane
> cond_exec set could be good for that.

Could we re-use MEM_NOTRAP_P here?  As far as I remember it isn't really
pervasive and is only used in a few places, though.  It's probably also not
fully accurate, as we can still trap on active lanes?

> The same problem occurs for predicated loads. (vec_predicate mem ...)
> would in principle work there, but hiding a mem would be a big change,
> and would raise the question of where the MEM_ATTRs would go. Urgh...

I'd prefer having the vec_predicate/... on the RHS, as in the
read-modify-write approach:

  (set (mem:V4SI ...)
       (vec_predicate:V4SI "STORE_EXPR"
         (reg:V4SI ...)    # reg to store
         (mem:V4SI addr)   # merge/old value, could also hold MEM_ATTRs
                           # like MEM_NOTRAP_P.
         (mask) (length) ...))
But we would need something like LOAD_EXPR and STORE_EXPR if we moved the
operation inside the vec_predicate? :/
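
To make the faulting difference between the two store models concrete, here
is a minimal scalar C sketch.  It is not GCC code: LANES, lanes_touched,
store_rmw and store_predicated are hypothetical names made up for
illustration, and the counter only stands in for "this lane's address was
accessed".

```c
#include <stdbool.h>
#include <stdint.h>

#define LANES 4

/* Hypothetical counter of memory lanes accessed, for illustration only.  */
int lanes_touched;

/* Read-modify-write model: every lane of MEM is loaded before the blend,
   so an inactive lane that sits on an unmapped page would still fault.  */
void
store_rmw (int32_t *mem, const int32_t *src, const bool *mask)
{
  int32_t old[LANES];

  for (int i = 0; i < LANES; i++)
    {
      old[i] = mem[i];		/* Unconditional access to lane I.  */
      lanes_touched++;
    }
  for (int i = 0; i < LANES; i++)
    mem[i] = mask[i] ? src[i] : old[i];
}

/* Per-lane model: only active lanes access memory, so inactive lanes can
   never fault.  Active lanes can of course still trap.  */
void
store_predicated (int32_t *mem, const int32_t *src, const bool *mask)
{
  for (int i = 0; i < LANES; i++)
    if (mask[i])
      {
	mem[i] = src[i];
	lanes_touched++;
      }
}
```

Both functions leave the same values in memory when every lane is mapped;
they differ only in which lanes get accessed, which is the part a
read-modify-write representation describes inaccurately.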
> The code could be stored in the "u2" field of the rtx. And putting
> the [A B] first would fit better with existing assumptions, since IIRC
> XVEC (x, n) for n > 0 doesn't occur outside of build-time generators.
>
> This again avoids contextual interpretation. A predicated plus is not
> equivalent to taking an existing unpredicated plus and predicating it,
> and vice versa.
How important are these properties? I would hope the first one isn't too big
of a deal?
--
Regards
Robin