> On 18 Sep 2024, at 20:33, Richard Sandiford <richard.sandif...@arm.com> wrote:
> 
> Jennifer Schmitz <jschm...@nvidia.com> writes:
>> From 05e010a4ad5ef8df082b3e03b253aad85e2a270c Mon Sep 17 00:00:00 2001
>> From: Jennifer Schmitz <jschm...@nvidia.com>
>> Date: Tue, 17 Sep 2024 00:15:38 -0700
>> Subject: [PATCH] SVE intrinsics: Fold svmul with all-zero operands to zero
>> vector
>> 
>> As recently implemented for svdiv, this patch folds svmul to a zero
>> vector if one of the operands is a zero vector. This transformation is
>> applied if at least one of the following conditions is met:
>> - the first operand is all zeros or
>> - the second operand is all zeros, and the predicate is ptrue or the
>> predication is _x or _z.
>> 
>> In contrast to constant folding, which was implemented in a previous
>> patch, this transformation is applied as soon as one of the operands is
>> a zero vector, while the other operand can be a variable.
>> 
>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>> 
>> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
> 
> OK, thanks.
> 
> If you're planning any more work in this area, I think the next logical
> step would be to extend the current folds to all predication types,
> before going on to support other mul/div cases or other operations.
> 
> In principle, the mul and div cases correspond to:
> 
>  if (integer_zerop (op1) || integer_zerop (op2))
>    return f.fold_active_lanes_to (build_zero_cst (TREE_TYPE (f.lhs)));
> 
> It would then be up to fold_active_lanes_to(X) to work out how to apply
> predication to X.  The general case would be:
> 
>  - For x predication and unpredicated operations, fold to X.
> 
>  - For m and z, calculate a vector that supplies the values of inactive
>    lanes (the first vector argument for m and a zero vector for z).
> 
>    - If X is equal to the inactive lanes vector, fold directly to X.
> 
>    - Otherwise fold to VEC_COND_EXPR <pg, X, inactive>
Dear Richard,
I pushed it to trunk as 08aba2dd8c9390b6131cca0aac069f97eeddc9d2.
Thank you also for the good suggestion; I will do that. Over the last few
days, I have been working on a patch that folds multiplication by powers of 2
to left shifts (svlsl), similar to the one for division. As I see it, that is
independent of what you proposed, because it changes the function type. Can I
submit it for review before starting on the patch you suggested?
Best, Jennifer
> 
> Richard
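
The predication scheme quoted above can be modelled outside GCC as a rough
sketch. Everything in the block below is hypothetical and for illustration
only: the names Pred, Folded, fold_active_lanes_to_model and fold_mul_model
are not GCC identifiers, vectors are plain std::vector values rather than
trees, and "X is equal to the inactive lanes vector" is approximated by a
lane-by-lane value comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model types; none of these names exist in GCC.
enum class Pred { None, X, M, Z };

using Vec = std::vector<std::int32_t>;
using Mask = std::vector<bool>;

// Outcome of the modelled fold.
struct Folded {
  bool is_select;  // true stands for VEC_COND_EXPR <pg, X, inactive>
  Vec value;       // lane values of the folded result
};

// Model of the proposed fold_active_lanes_to (X): active lanes take X,
// and the predication type decides what the inactive lanes hold.
Folded fold_active_lanes_to_model(Pred pred, const Mask &pg,
                                  const Vec &op1, const Vec &x) {
  assert(pg.size() == x.size() && op1.size() == x.size());

  // x predication and unpredicated operations: fold to X.
  if (pred == Pred::None || pred == Pred::X)
    return {false, x};

  // m takes inactive lanes from the first vector argument, z from zero.
  Vec inactive = (pred == Pred::M) ? op1 : Vec(x.size(), 0);

  // X already matches the inactive lanes: fold directly to X.
  if (inactive == x)
    return {false, x};

  // Otherwise select per lane between X and the inactive value.
  Vec out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    out[i] = pg[i] ? x[i] : inactive[i];
  return {true, out};
}

bool all_zero(const Vec &v) {
  for (std::int32_t e : v)
    if (e != 0)
      return false;
  return true;
}

// Model of the mul case: if either operand is all zeros, the active lanes
// fold to zero. Returns false when the fold does not apply.
bool fold_mul_model(Pred pred, const Mask &pg, const Vec &op1,
                    const Vec &op2, Folded &result) {
  if (!all_zero(op1) && !all_zero(op2))
    return false;
  result = fold_active_lanes_to_model(pred, pg, op1, Vec(op1.size(), 0));
  return true;
}
```

In this model, a zero first operand folds directly to zero for every
predication type, and a zero second operand folds directly to zero for _x
and _z. That matches the conditions listed in the patch description. The
remaining case, a zero second operand with _m predication and a partial
predicate, ends up as the per-lane select.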

