> On 12 Nov 2024, at 6:19 PM, Richard Biener <rguent...@suse.de> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> On Mon, 11 Nov 2024, Soumya AR wrote:
> 
>> Hi Richard,
>> 
>>> On 7 Nov 2024, at 6:10 PM, Richard Biener <rguent...@suse.de> wrote:
>>> 
>>> External email: Use caution opening links or attachments
>>> 
>>> 
>>> On Tue, 5 Nov 2024, Soumya AR wrote:
>>> 
>>>> 
>>>> 
>>>>> On 29 Oct 2024, at 7:16 PM, Richard Biener <rguent...@suse.de> wrote:
>>>>> 
>>>>> External email: Use caution opening links or attachments
>>>>> 
>>>>> 
>>>>> On Mon, 28 Oct 2024, Soumya AR wrote:
>>>>> 
>>>>>> This patch transforms the following POW calls to equivalent LDEXP calls, 
>>>>>> as
>>>>>> discussed in PR57492:
>>>>>> 
>>>>>> powi (2.0, i) -> ldexp (1.0, i)
>>>>>> 
>>>>>> a * powi (2.0, i) -> ldexp (a, i)
>>>>>> 
>>>>>> 2.0 * powi (2.0, i) -> ldexp (1.0, i + 1)
>>>>>> 
>>>>>> pow (powof2, i) -> ldexp (1.0, i * log2 (powof2))
>>>>>> 
>>>>>> powof2 * pow (2, i) -> ldexp (1.0, i + log2 (powof2))
>>>>> 
>>>>> For the multiplication cases why not handle powof2 * ldexp (1., i)
>>>>> to ldexp (1., i + log2 (powof2)) and a * ldexp (1., i) -> ldexp (a, i)
>>>>> instead?  exp2 * ldexp (1., i) is another candidate.
>>>>> 
>>>>> So please split out the multiplication parts.
>>>>> 
>>>>> + /* Simplify pow (powof2, i) to ldexp (1, i * log2 (powof2)). */
>>>>> 
>>>>> the below pattern handles POWI, not POW.
>>>>> 
>>>>> + (simplify
>>>>> +  (POWI REAL_CST@0 @1)
>>>>> +  (with { HOST_WIDE_INT tmp = 0;
>>>>> +         tree integer_arg1 = NULL_TREE; }
>>>>> +  (if (integer_valued_real_p (@0)
>>>>> +       && real_isinteger (&TREE_REAL_CST (@0), &tmp)
>>>>> +       && integer_pow2p (integer_arg1 = build_int_cst (integer_type_node,
>>>>> tmp)))
>>>>> 
>>>>> && tmp > 0
>>>>> && pow2p_hwi (tmp)
>>>>> 
>>>>> +    (LDEXP { build_one_cst (type); }
>>>>> +          (mult @1 { build_int_cst (integer_type_node,
>>>>> +                     tree_log2 (integer_arg1)); })))))
>>>>> 
>>>>> build_int_cst (integer_type_node, exact_log2 (tmp))
>>>>> 
>>>>> + /* Simplify powi (2.0, i) to ldexp (1, i). */
>>>>> + (simplify
>>>>> +  (POWI REAL_CST@0 @1)
>>>>> +  (if (real_equal (TREE_REAL_CST_PTR (@0), &dconst2))
>>>>> +   (LDEXP { build_one_cst (type); } @1)))
>>>>> +
>>>>> 
>>>>> You'll have a duplicate pattern here, instead merge them.  2.0
>>>>> is power-of-two so I wonder why the pattern is needed.
>>>> 
>>>> Thanks for the feedback!
>>>> 
>>>> I've merged the extra case that was specifically checking for 2.0.
>>>> 
>>>> Like you suggested, I also added two ldexp specific transforms:
>>>> 
>>>>       • powof2 * ldexp (x, i) -> ldexp (x, i + log2 (powof2))
>>>>       • a * ldexp(1., i) -> ldexp (a, i)
>>>> 
>>>> Regarding your suggestion to also fold exp2, a conversion like
>>>> exp2 (x) -> ldexp (1., x) or exp2 (x) * ldexp (y, i) -> ldexp (y, i + x) 
>>>> is a
>>>> bit tricky because we'd have to cast it to an integer before passing it to
>>>> ldexp.
>>>> 
>>>> real_to_integer only works for constants, which isn't helpful here as exp2 
>>>> (CST)
>>>> becomes a power of 2 anyway and matches with the above patterns.
>>>> 
>>>> We'll have to explicitly convert it for non constants and I'm not sure if 
>>>> that
>>>> is worth it for this patch.
>>>> 
>>>> Let me know what you think.
>>> 
>>> + (simplify
>>> +  (mult:c REAL_CST@0 (POWI REAL_CST@1 @2))
>>> +  (with { HOST_WIDE_INT tmp = 0;
>>> +         tree integer_arg1 = NULL_TREE; }
>>> +  (if (integer_valued_real_p (@0)
>>> +       && real_equal (TREE_REAL_CST_PTR (@1), &dconst2)
>>> +       && real_isinteger (&TREE_REAL_CST (@0), &tmp)
>>> +       && integer_pow2p (integer_arg1 = build_int_cst (integer_type_node,
>>> tmp)))
>>> 
>>> as said you can use tmp > 0 && pow2p_hwi (tmp) instead of
>>> integer_pow2p and build_int_cst.  This applies to all patterns.
>> 
>> Thanks for the feedback! Updated the patch with this modification.
>> 
>>> +  (if (integer_valued_real_p (@0)
>>> +       && real_isinteger (&TREE_REAL_CST (@0), &tmp)
>>> 
>>> also is a redundant check, real_isinteger gives you the answer already.
>>> 
>>> +          (mult @1 {build_int_cst (integer_type_node,
>>> +                    tree_log2 (integer_arg1)); })))))
>>> 
>>> and use exact_log2 (tmp) instead of tree_log2 (integer_arg1).
>> 
>> Fixed.
>> 
>>> 
>>> + /* Simplify a * powi (2.0, i) to ldexp (a, i). */
>>> + (simplify
>>> +  (mult:c @0 (POWI REAL_CST@1 @2))
>>> +  (if (real_equal (TREE_REAL_CST_PTR (@1), &dconst2))
>>> +   (LDEXP @0 @2)))
>>> ...
>>> + /* Simplify powi (2.0, i) to ldexp (1, i). */
>>> + (simplify
>>> +  (POWI REAL_CST@0 @1)
>>> +  (if (real_equal (TREE_REAL_CST_PTR (@0), &dconst2))
>>> +   (LDEXP { build_one_cst (type); } @1)))
>>> 
>>> you can expect the first to never hit as we should replace
>>> a * powi (2.0, i) with a * ldexp (1, i) first.  So you can
>>> drop the first pattern.
>> 
>> Makes sense, apologies for the oversight. Fixed this.
> 
> + /* Simplify powi (2.0, i) to ldexp (1, i). */
> + (simplify
> +  (POWI REAL_CST@0 @1)
> +  (if (real_equal (TREE_REAL_CST_PTR (@0), &dconst2))
> +   (LDEXP { build_one_cst (type); } @1)))
> 
> why do you need this?  The  powi (powof2, i) to ldexp (1, i * log2
> (powof2)) pattern should handle this already?
> 
> + /* Simplify powof2 * powi (2, i) to ldexp (1, i + log2 (powof2)). */
> + (simplify
> +  (mult:c REAL_CST@0 (POWI REAL_CST@1 @2))
> +  (with { HOST_WIDE_INT tmp = 0; }
> +   (if (real_isinteger (&TREE_REAL_CST (@0), &tmp)
> +       && tmp > 0 && pow2p_hwi (tmp))
> +    (LDEXP { build_one_cst (type); }
> +       (plus {build_int_cst (integer_type_node, exact_log2 (tmp)); }
> @2)))))
> +
> + /* Simplify powi (powof2, i) to ldexp (1, i * log2 (powof2)). */
> + (simplify
> +  (POWI REAL_CST@0 @1)
> +  (with { HOST_WIDE_INT tmp = 0; }
> +   (if (real_isinteger (&TREE_REAL_CST (@0), &tmp)
> +       && tmp > 0 && pow2p_hwi (tmp))
> +    (LDEXP { build_one_cst (type); }
> +       (mult @1 {build_int_cst (integer_type_node,
> +            exact_log2 (tmp)); })))))
> 
> with the second pattern you shouldn't see powof2 * powi (2, i)
> since powi (2, i) will be simplified to ldexp already, so the
> powof2 * powi (2, i) simplification should not be necessary.
> 
> Otherwise looks good now.

Removed both redundant cases as suggested.

Best,
Soumya

> 
> Richard.
> 
>> Best,
>> Soumya
>> 
>>> Richard.
>>> 
>>> 
>>>> Best,
>>>> Soumya
>>>> 
>>>> 
>>>> 
>>>>> Richard.
>>>>> 
>>>>>> 
>>>>>> This is especially helpful for SVE architectures as LDEXP calls can be
>>>>>> implemented using the FSCALE instruction, as seen in the following patch:
>>>>>> https://gcc.gnu.org/pipermail/gcc-patches/2024-September/664160.html
>>>>>> 
>>>>>> SPEC2017 was run with this patch, while there are no noticeable 
>>>>>> improvements,
>>>>>> there are no non-noise regressions either.
>>>>>> 
>>>>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no 
>>>>>> regression.
>>>>>> OK for mainline?
>>>>>> 
>>>>>> Signed-off-by: Soumya AR <soum...@nvidia.com>
>>>>>> 
>>>>>> gcc/ChangeLog:
>>>>>>    PR target/57492
>>>>>>    * match.pd: Added patterns to fold certain calls to pow to ldexp.
>>>>>> 
>>>>>> gcc/testsuite/ChangeLog:
>>>>>>    PR target/57492
>>>>>>    * gcc.dg/tree-ssa/pow-to-ldexp.c: New test.
>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> Richard Biener <rguent...@suse.de>
>>>>> SUSE Software Solutions Germany GmbH,
>>>>> Frankenstrasse 146, 90461 Nuernberg, Germany;
>>>>> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Richard Biener <rguent...@suse.de>
>>> SUSE Software Solutions Germany GmbH,
>>> Frankenstrasse 146, 90461 Nuernberg, Germany;
>>> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
>> 
>> 
>> 
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Attachment: v5-0001-Match-Fold-pow-calls-to-ldexp-when-possible-PR57492.patch
Description: v5-0001-Match-Fold-pow-calls-to-ldexp-when-possible-PR57492.patch

Reply via email to