> On 18 Sep 2024, at 20:33, Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Jennifer Schmitz <jschm...@nvidia.com> writes:
>> From 05e010a4ad5ef8df082b3e03b253aad85e2a270c Mon Sep 17 00:00:00 2001
>> From: Jennifer Schmitz <jschm...@nvidia.com>
>> Date: Tue, 17 Sep 2024 00:15:38 -0700
>> Subject: [PATCH] SVE intrinsics: Fold svmul with all-zero operands to zero
>> vector
>>
>> As recently implemented for svdiv, this patch folds svmul to a zero
>> vector if one of the operands is a zero vector. This transformation is
>> applied if at least one of the following conditions is met:
>> - the first operand is all zeros or
>> - the second operand is all zeros, and the predicate is ptrue or the
>> predication is _x or _z.
>>
>> In contrast to constant folding, which was implemented in a previous
>> patch, this transformation is applied as soon as one of the operands is
>> a zero vector, while the other operand can be a variable.
>>
>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>>
>> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
>
> OK, thanks.
>
> If you're planning any more work in this area, I think the next logical
> step would be to extend the current folds to all predication types,
> before going on to support other mul/div cases or other operations.
>
> In principle, the mul and div cases correspond to:
>
>  if (integer_zerop (op1) || integer_zerop (op2))
>    return f.fold_active_lanes_to (build_zero_cst (TREE_TYPE (f.lhs)));
>
> It would then be up to fold_active_lanes_to(X) to work out how to apply
> predication to X. The general case would be:
>
> - For x predication and unpredicated operations, fold to X.
>
> - For m and z, calculate a vector that supplies the values of inactive
>   lanes (the first vector argument for m and a zero vector for z).
>
>   - If X is equal to the inactive lanes vector, fold directly to X.
>
>   - Otherwise fold to VEC_COND_EXPR <pg, X, inactive>

Dear Richard,
I pushed it to trunk as 08aba2dd8c9390b6131cca0aac069f97eeddc9d2.
Thank you also for the good suggestion, I will do that.
Over the last few days, I have been working on a patch that folds
multiplication by powers of 2 to left shifts (svlsl), similar to what
was done for division. As I see it, that is independent of what you
proposed, because it changes the function type. Can I submit it for
review before starting on the patch you suggested?
Best,
Jennifer

>
> Richard
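
For readers following the thread, the predication rules Richard lists can be modelled lane by lane. The following is only a sketch under stated assumptions, not GCC code: the fixed lane count, the Vec and Pred types, the Predication enum and the vec_cond helper are all hypothetical, and fold_active_lanes_to here is a stand-alone function operating on plain arrays, whereas the one proposed above would be a member of the folder object operating on trees.

```cpp
#include <array>
#include <cstddef>

// Assumption: a fixed lane count stands in for a scalable SVE vector.
constexpr std::size_t kLanes = 4;
using Vec = std::array<int, kLanes>;
using Pred = std::array<bool, kLanes>;

// Hypothetical tag for the _x, _m and _z predication suffixes.
enum class Predication { X, M, Z };

// Lane-wise select, standing in for VEC_COND_EXPR <pg, x, inactive>.
Vec vec_cond(const Pred& pg, const Vec& x, const Vec& inactive) {
    Vec out{};
    for (std::size_t i = 0; i < kLanes; ++i)
        out[i] = pg[i] ? x[i] : inactive[i];
    return out;
}

// Model of the rules from the mail: x is the value wanted in the active
// lanes, first_arg is the first vector argument of the intrinsic.
Vec fold_active_lanes_to(Predication kind, const Pred& pg,
                         const Vec& x, const Vec& first_arg) {
    // _x predication (and unpredicated operations): fold to x.
    if (kind == Predication::X)
        return x;

    // Inactive lanes come from the first argument for _m, zero for _z.
    Vec inactive{};
    if (kind == Predication::M)
        inactive = first_arg;

    // If x already equals the inactive-lanes vector, no select is needed.
    if (x == inactive)
        return x;

    return vec_cond(pg, x, inactive);
}
```

With x set to the zero vector, this model matches the conditions in the patch description: _x and _z always give zero, and _m gives zero outright when the first operand is zero, while a zero second operand under _m leaves the inactive lanes of the first operand in place.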