On 28/10/2025 13:29, Richard Biener wrote:
> Isn't SLP_TREE_CAN_USE_PARTIAL_VECTORS_P redundant given
> SLP_TREE_CAN_USE_MASK_P || SLP_TREE_CAN_USE_LEN_P should be exactly this?
> SLP_TREE_CAN_USE_PARTIAL_VECTORS_P might be sth for an SLP instance
> (or a subgraph with multiple entries (instances)) if we want to have
> consistent len vs. mask use? (but I see no particular reason to force
> consistency)

SLP_TREE_CAN_USE_PARTIAL_VECTORS_P is initialised to true and may
subsequently be set to false via vect_cannot_use_partial_vectors.
vect_analyze_stmt uses the value of SLP_TREE_CAN_USE_PARTIAL_VECTORS_P
to decide whether to return a failure result in cases where
tail-predication is required. If SLP_TREE_CAN_USE_MASK_P ||
SLP_TREE_CAN_USE_LEN_P were used for that purpose instead, one of two
things would have to hold. Either neither SLP_TREE_CAN_USE_MASK_P nor
SLP_TREE_CAN_USE_LEN_P could be set to true in cases where
vect_cannot_use_partial_vectors might subsequently be called (which
seems impossible because vect_load_lanes_supported can be called with
different values of 'count', and we cannot predict those values), or
vect_cannot_use_partial_vectors would have to set both
SLP_TREE_CAN_USE_MASK_P and SLP_TREE_CAN_USE_LEN_P to false.

Setting both flags to false in vect_cannot_use_partial_vectors would be
trivial, but SLP_TREE_CAN_USE_PARTIAL_VECTORS_P has another purpose: it
gives the return value of vect_can_use_partial_vectors_p. How did you
envisage the return value of vect_can_use_partial_vectors_p being
decided for BB SLP? If it always returns true then I think that the
vectoriser might carry on trying to use partial vectors when it should
have already given up; if it returns SLP_TREE_CAN_USE_MASK_P ||
SLP_TREE_CAN_USE_LEN_P then that would prevent
check_load_store_for_partial_vectors from being called for the first time.
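
To illustrate the second problem, here is a toy model rather than the
actual GCC code. All of the names in it are hypothetical stand-ins, and
it assumes that the mask and len flags start out false and are only set
by the first load/store check. Under those assumptions, a query derived
from the mask and len flags never lets that first check run:

```cpp
// Toy model only: none of these names are the real GCC macros or functions.
struct node_flags {
  bool can_use_partial = true;  // stands in for SLP_TREE_CAN_USE_PARTIAL_VECTORS_P
  bool can_use_mask = false;    // stands in for SLP_TREE_CAN_USE_MASK_P (assumed initially false)
  bool can_use_len = false;     // stands in for SLP_TREE_CAN_USE_LEN_P (assumed initially false)
};

// Query backed by the dedicated flag.
bool query_with_three_flags(const node_flags &n) { return n.can_use_partial; }

// Query derived from the other two flags.
bool query_with_two_flags(const node_flags &n) {
  return n.can_use_mask || n.can_use_len;
}

// Stand-in for the first load/store check, which is what picks a style.
// It always picks masking here, purely for illustration.
void first_check(node_flags &n) { n.can_use_mask = true; }

// Run the guarded check once and report whether it was reached.
bool analyse(node_flags &n, bool (*query)(const node_flags &)) {
  if (!query(n))
    return false;  // gave up before any style could be chosen
  first_check(n);
  return true;
}
```

With query_with_three_flags, analyse reaches first_check on a freshly
initialised node; with query_with_two_flags it returns false and both
style flags stay false.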

Together, the three flags allow the following states to be represented:

1. Might be able to operate on partial vectors, but don't yet know
   whether we would use len or mask.
2. Might be able to operate on partial vectors with len.
3. Might be able to operate on partial vectors with mask.
4. (Invalid) Might be able to operate on partial vectors with both len
   and mask.
5. Cannot operate on partial vectors.
6. (Strictly redundant) Cannot operate on partial vectors although we
   previously thought we might be able to use len.
7. (Strictly redundant) Cannot operate on partial vectors although we
   previously thought we might be able to use mask.
8. (Invalid) Cannot operate on partial vectors although we previously
   thought we might be able to use both len and mask.

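It would be impossible to encode the four states that are neither
invalid nor redundant using only SLP_TREE_CAN_USE_MASK_P and
SLP_TREE_CAN_USE_LEN_P: setting both is invalid, which leaves only three
usable combinations. In any case, my goal was to keep the new logic for
BB SLP as close as possible to the existing logic for loop
vectorisation.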
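
The counting can be checked mechanically. The sketch below is only an
illustration of the eight-item list above; the enum and function names
are hypothetical and do not exist in GCC. It classifies every
combination of the three flags and also counts how many combinations of
the mask and len flags alone remain usable once "both set" is excluded:

```cpp
// Toy classification only; these names are hypothetical, not GCC code.
enum class state_kind { valid, redundant, invalid };

// Classify one combination of the three flags, following the list above.
constexpr state_kind classify(bool partial, bool len, bool mask) {
  if (len && mask)
    return state_kind::invalid;    // states 4 and 8
  if (!partial && (len || mask))
    return state_kind::redundant;  // states 6 and 7
  return state_kind::valid;        // states 1, 2, 3 and 5
}

// Count the three-flag combinations of a given kind (2^3 encodings).
constexpr int count_states(state_kind kind) {
  int n = 0;
  for (int bits = 0; bits < 8; ++bits)
    if (classify((bits & 4) != 0, (bits & 2) != 0, (bits & 1) != 0) == kind)
      ++n;
  return n;
}

// Count the len/mask combinations left when "both set" is excluded.
constexpr int usable_two_flag_encodings() {
  int n = 0;
  for (int bits = 0; bits < 4; ++bits) {
    const bool len = (bits & 2) != 0;
    const bool mask = (bits & 1) != 0;
    if (!(len && mask))
      ++n;
  }
  return n;
}

static_assert(count_states(state_kind::valid) == 4, "states 1, 2, 3 and 5");
static_assert(count_states(state_kind::redundant) == 2, "states 6 and 7");
static_assert(count_states(state_kind::invalid) == 2, "states 4 and 8");
static_assert(usable_two_flag_encodings() == 3, "one short of four states");
```

The static_asserts need C++14 or later because of the loops in the
constexpr functions.
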
--
Christopher Bazley
Staff Software Engineer, GNU Tools Team.
Arm Ltd, 110 Fulbourn Road, Cambridge, CB1 9NJ, UK.
http://www.arm.com/