of memory ops through TARGET_SCHED_ADJUST_PRIORITY, but it was ineffective.
I’m a bit at a loss as to what’s likely going on with the RA at this point. Any
pointers?
Thank you,
--
Evandro Menezes
> On May 16, 2023, at 03:36, Kyrylo Tkachov
> wrote:
>
> Hi Evandro,
>
> I think that was more down to my rushed model rather than anything else
> though.
>
> Thanks,
> Kyrill
>
> From: Evandro Menezes
> Sent: Monday, May 15, 2023 9:13 PM
> To: Kyrylo Tkachov
> Cc: Richard Sandiford ; Evandro Menezes via
> Gcc-patches ; evandro+.
mention with regards to
granularity?
Yes, my intent for this patch is to enable modeling the SVE instructions on N1.
The patch that implements it brings some performance improvements, but the
results are mostly flat, as expected.
Thank you,
--
Evandro Menezes
> On May 15, 2023, at 04:49, Kyr
instructions in its group.
Do you have specific instances in mind?
Thank you,
--
Evandro Menezes
> On May 15, 2023, at 04:00, Richard Sandiford
> wrote:
>
> Evandro Menezes via Gcc-patches writes:
>> This patch adds the attribute `type` to most SVE1 instructions, a
This patch adds the attribute `type` to most SVE1 instructions, as in the other
instructions.
--
Evandro Menezes
0002-aarch64-Add-SVE-instruction-types.patch
Description: Binary data
This patch adds the scheduling model for Neoverse V1, based on the information
from the “Arm Neoverse V1 Software Optimization Guide” and on static and
dynamic analysis of internal and public benchmarks. Results are forthcoming.
--
Evandro Menezes
0001-aarch64-Add-scheduling-model-for
Sorry, but it seems that, before sending, the email client is stripping leading
spaces. I’m attaching the file here.
--
Evandro Menezes ◊ evan...@yahoo.com ◊ Austin, TX
Άγιος ο Θεός ⁂ ܩܕܝܫܐ ܐܢ̱ܬ ܠܐ ܡܝܘܬܐ ⁂ Sanctus Deus
> On Apr 24, 2023, at 17:48, Evandro Menezes
> wrote:
Hi, Tamar.
Does this work?
Thank you,
--
Evandro Menezes ◊ evan...@yahoo.com ◊ Austin, TX
Άγιος ο Θεός ⁂ ܩܕܝܫܐ ܐܢ̱ܬ ܠܐ ܡܝܘܬܐ ⁂ Sanctus Deus
> On Apr 24, 2023, at 12:37, Tamar Christina
> wrote:
>
> Hi Evandro,
>
> I wanted to give this patch a try, but the
This patch adds the cost model for Neoverse N1, based on the information from
the “Arm Neoverse N1 Software Optimization Guide”.
--
Evandro Menezes
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def
This patch adds the scheduling model for Neoverse N1, based on the information
from the “Arm Neoverse N1 Software Optimization Guide”.
--
Evandro Menezes
gcc/ChangeLog:
* config/aarch64/aarch64-core
Hi, Kyrylo.
> On Apr 11, 2023, at 04:41, Kyrylo Tkachov
> wrote:
>
>> -Original Message-
>> From: Gcc-patches > bounces+kyrylo.tkachov=arm@gcc.gnu.org
>> <mailto:bounces+kyrylo.tkachov=arm@gcc.gnu.org>> On Behalf Of Evandro
>
This patch adds the cost and scheduling models for Neoverse N1, based on the
information from the “Arm Neoverse N1 Software Optimization Guide”.
--
Evandro Menezes ◊ evan...@yahoo.com
[PATCH] aarch64: Add the cost and scheduling models for Neoverse N1
gcc/ChangeLog:
* config/aa
stp x2, x3, [x0]
ret
whereas with this patch we generate:
bar:
ldp x2, x3, [x1, 8]
stp x2, x3, [x0, 8]
ret
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
LGTM
--
Evandro Menezes
-systems.com; Evandro Menezes
Subject: [PATCH][AArch64] Increase code alignment
Increase loop alignment on Cortex cores to 8 and set function alignment to
16. This makes things consistent across big.LITTLE cores, improves
performance of benchmarks with tight loops and reduces performance
larger benchmarks tonight, but I'm leaning towards having it as
a
> target specific extra tuning option.
The results are in and -frename-registers is not a good idea for Exynos M1.
Thank you,
--
Evandro Menezes Austin, TX
significant
improvements for me to be comfortable with -frename-registers being a
generic default for AArch64.
I'll run some larger benchmarks tonight, but I'm leaning towards having it
as a target specific extra tuning option.
Thank you,
--
Evandro Menezes
On 06/14/16 03:28, Christophe Lyon wrote:
On 13 June 2016 at 21:06, Evandro Menezes wrote:
On 06/13/16 05:15, James Greenhalgh wrote:
Thanks for your patience on this patch series.
Just checked the series in.
If I'm not mistaken, it looks like you forgot to update the ChangeLog
fil
On 06/13/16 05:15, James Greenhalgh wrote:
Thanks for your patience on this patch series.
Just checked the series in.
Thank y'all for your assistance and patience.
Cheers,
--
Evandro Menezes
On 06/03/16 17:22, Evandro Menezes wrote:
On 06/03/16 05:51, Wilco Dijkstra wrote:
It looks like almost all AArch64 cores agree on alignment of 16 for
function, and 8 for loops and branches, so we should change
-mcpu=generic as well if there is no disagreement - feedback welcome.
I'll see
most comfortable with,
but I also wonder if the -falign-labels shouldn't also be a parameter in
tune_params.
Thoughts?
--
Evandro Menezes
Rebasing the patch...
--
Evandro Menezes
From d791090aae6a29fa94d8fc10894ee1053b05bcc2 Mon Sep 17 00:00:00 2001
From: Evandro Menezes
Date: Mon, 4 Apr 2016 14:02:24 -0500
Subject: [PATCH 3/3] [AArch64] Emit division using the Newton series
2016-04-04 Evandro Menezes
Wi
On 06/01/16 03:35, James Greenhalgh wrote:
On Fri, May 27, 2016 at 05:57:23PM -0500, Evandro Menezes wrote:
From 86d7690632d03ec85fd69bfaef8e89c0542518ad Mon Sep 17 00:00:00 2001
From: Evandro Menezes
Date: Thu, 3 Mar 2016 18:13:46 -0600
Subject: [PATCH 1/3] [AArch64] Add more choices for the
On 06/01/16 04:00, James Greenhalgh wrote:
On Fri, May 27, 2016 at 05:57:26PM -0500, Evandro Menezes wrote:
2016-04-04 Evandro Menezes
Wilco Dijkstra
gcc/
* config/aarch64/aarch64-protos.h
(aarch64_emit_approx_rsqrt): Replace with new function
On 06/03/16 07:56, Wilco Dijkstra wrote:
This patch cleans up the -mpc-relative-loads option processing. Rename to
avoid the
"no*" name and confusing !no* expressions. Fix the option processing code to
implement
-mno-pc-relative-loads rather than ignore it.
OK for commit?
LGTM
On 06/02/16 09:54, Kyrill Tkachov wrote:
The Qualcomm QDF24xx processor is now supported via the
Shouldn't this read "The Qualcomm QDF24xx processors are now supported
via the"?
Not that I have a strong opinion about it, but, otherwise, OK.
--
Evandro Menezes
On 05/31/16 04:27, James Greenhalgh wrote:
On Fri, May 27, 2016 at 05:57:30PM -0500, Evandro Menezes wrote:
On 05/25/16 11:16, James Greenhalgh wrote:
On Wed, Apr 27, 2016 at 04:15:53PM -0500, Evandro Menezes wrote:
gcc/
* config/aarch64/aarch64-protos.h
(tune_params
Kyrylo Tkachov
* config/aarch64/aarch64.c (aarch_macro_fusion_pair_p): Use
aarch64_fusion_enabled_p to check for fusion capabilities.
LGTM
--
Evandro Menezes
On 05/25/16 11:16, James Greenhalgh wrote:
On Wed, Apr 27, 2016 at 04:15:53PM -0500, Evandro Menezes wrote:
gcc/
* config/aarch64/aarch64-protos.h
(tune_params): Add new member "approx_div_modes".
(aarch64_emit_approx_div): Declare new function.
On 05/25/16 10:52, James Greenhalgh wrote:
On Wed, Apr 27, 2016 at 04:15:45PM -0500, Evandro Menezes wrote:
gcc/
* config/aarch64/aarch64-protos.h
(aarch64_emit_approx_rsqrt): Replace with new function
"aarch64_emit_approx_sqrt".
(tune_params):
On 05/25/16 05:15, James Greenhalgh wrote:
On Wed, Apr 27, 2016 at 04:13:33PM -0500, Evandro Menezes wrote:
gcc/
* config/aarch64/aarch64-protos.h
(AARCH64_APPROX_MODE): New macro.
(AARCH64_APPROX_{NONE,SP,DP,DFORM,QFORM,SCALAR,VECTOR,ALL}):
Likewise
On 05/23/16 15:32, Evandro Menezes wrote:
I'm fine with this patch, as it achieves in part what I intended
before: going beyond the default_case_values_threshold, too
conservative for Exynos M1. My concern is particularly what happens
to in-order targets, like the ubiquitous A53.
ue to code alignment or some other secondary effect.
I always thought that this patch, which lays out the branch tree more
optimally, deserved to be revisited:
https://gcc.gnu.org/ml/gcc-patches/2008-04/msg02197.html
Cheers,
--
Evandro Menezes
On 04/27/16 16:13, Evandro Menezes wrote:
This patch suite increases the granularity of target selections of
approximate FP operations and adds the options of emitting approximate
square root and division.
The full suite is contained in the emails tagged:
1.
[PATCH 1/3][AArch64] Add more
n.
Cheers,
--
Evandro Menezes
Define new function.
* config/aarch64/aarch64.md ("div3"): New expansion.
* config/aarch64/aarch64-simd.md ("div3"): Likewise.
* config/aarch64/aarch64.opt (-mlow-precision-div): Add new option.
* doc/invoke.texi (-mlow-precision-div): Describe
n and insn definitions.
* config/aarch64/aarch64.md: Likewise.
* config/aarch64/aarch64.opt
(mlow-precision-sqrt): Add new option description.
* doc/invoke.texi (mlow-precision-sqrt): Likewise.
--
Evandro Menezes
From 753115a8691afd7aed4a510d9e9cb0a8e859acf4 Mon Sep 1
(aarch64_optab_supported_p): New argument for the mode.
* doc/invoke.texi (-mlow-precision-recip-sqrt): Reword description.
--
Evandro Menezes
From 2cb6c0f35bbdc3b4cc6f88c61a50f3fbb168ec99 Mon Sep 17 00:00:00 2001
From: Evandro Menezes
Date: Thu, 3 Mar 2016 18:13:46 -0600
Subjec
approximation
2.
[PATCH 2/3][AArch64] Emit square root using the Newton series
3.
[PATCH 3/3][AArch64] Emit division using the Newton series
Thank you,
--
Evandro Menezes
On 04/26/16 08:25, Wilco Dijkstra wrote:
Evandro Menezes wrote:
On 03/10/16 10:37, James Greenhalgh wrote:
Thanks for sticking with it. This is OK for GCC 7 when development
opens.
Remember to mention the most recent changes in your Changelog entry
(Remove "fp" attribute from *mov
On 04/27/16 09:10, Kyrill Tkachov wrote:
2016-04-27 Kyrylo Tkachov
* config/aarch64/aarch64.md (ashl3, SHORT modes):
Use const_int_operand for operand 2 predicate. Simplify expand code
as a result.
LGTM
--
Evandro Menezes
On 04/27/16 09:23, James Greenhalgh wrote:
On Tue, Apr 12, 2016 at 01:14:51PM -0500, Evandro Menezes wrote:
On 04/05/16 17:30, Evandro Menezes wrote:
On 04/05/16 13:37, Wilco Dijkstra wrote:
I can't get any of these to work... Not only do I get a large
number of collisions and duplicated
ut so are users to use it through the
command line option -mlow-precision-div.
--
Evandro Menezes
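For readers following the -mlow-precision-div discussion, here is a minimal scalar sketch in C of division by Newton iteration on the reciprocal. It is an illustration only and is not taken from the patch: the names `recip_estimate`, `recip_step` and `approx_divf` are hypothetical, and the integer trick used for the first guess is an assumed stand-in for the hardware estimate instruction (FRECPE). The refinement step uses the same arithmetic as FRECPS, 2 - a * b.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for FRECPE: a rough first guess at 1/d.
   The integer trick is an assumption for illustration only and is
   meant for positive, normal d of moderate magnitude.  */
static float
recip_estimate (float d)
{
  uint32_t bits;
  float r;

  memcpy (&bits, &d, sizeof bits);
  bits = 0x7ef311c7u - bits;
  memcpy (&r, &bits, sizeof r);
  return r;
}

/* Same arithmetic as FRECPS: 2 - a * b.  */
static float
recip_step (float a, float b)
{
  return 2.0f - a * b;
}

/* Approximate n / d as n * (1/d), refining the reciprocal with
   STEPS Newton iterations: r' = r * (2 - d * r).  */
float
approx_divf (float n, float d, int steps)
{
  float r;
  int i;

  assert (d > 0.0f && steps >= 0);
  r = recip_estimate (d);
  for (i = 0; i < steps; i++)
    r = r * recip_step (d, r);
  return n * r;
}
```

Each iteration roughly squares the relative error of the reciprocal, so the number of steps needed depends on the quality of the first guess and on the target precision. That may be one reason the trade-off differs between SF and DF.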
On 04/26/16 11:14, Wilco Dijkstra wrote:
Evandro Menezes wrote:
True, but the results when running on A53 could be quite different.
GCC is ~1.2% faster on Cortex-A53 built for generic, but there is no
difference in perlbench.
Looks good, then. Fine by me.
Thanks for your patience
On 04/25/16 14:58, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I agree with your assessment, but I'm more curious to understand how
this change affects code built with the default -mcpu=generic when run
on both A53 and A57, the typical configuration of big.LITTLE machines.
I wouldn
On 04/25/16 14:21, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I assume that you mean that such improvements are true for
-mcpu=generic, yes? On which target, A53 or A57 or other?
It's true for any CPU setting. The SPEC results are for Cortex-A57
however I wrote a microbenchmark that
On 03/10/16 10:37, James Greenhalgh wrote:
On Thu, Mar 10, 2016 at 10:32:15AM -0600, Evandro Menezes wrote:
I agree to postpone until GCC 7.
[AArch64] Replace insn to zero up SIMD registers
gcc/
* config/aarch64/aarch64.md
(*movhf_aarch64): Add "mo
On 04/22/16 10:35, Wilco Dijkstra wrote:
OK for trunk?
LGTM
--
Evandro Menezes
On 04/21/16 03:15, Kyrill Tkachov wrote:
Ok to commit?
LGTM
--
Evandro Menezes
assume that you mean that such improvements are true for
-mcpu=generic, yes? On which target, A53 or A57 or other?
Otherwise, it seems to be a sensible change, but I'm trying to
understand how generally beneficial it is.
Thank you,
--
Evandro Menezes
> On 04/04/16 11:13, Evandro Menezes wrote:
> > On 04/01/16 18:08, Wilco Dijkstra wrote:
> >> Evandro Menezes wrote:
> >>> I hope that this gets in the ballpark of what's been discussed
> >>> previously.
> >> Yes that's very close to wh
> On 04/05/16 17:30, Evandro Menezes wrote:
> > On 04/05/16 13:37, Wilco Dijkstra wrote:
> >> I can't get any of these to work... Not only do I get a large number
> >> of collisions and duplicated code between these patches, when I try
> >> to resolve the
> On 04/04/16 14:06, Evandro Menezes wrote:
> > On 04/01/16 17:52, Evandro Menezes wrote:
> >> On 04/01/16 17:45, Wilco Dijkstra wrote:
> >>> Evandro Menezes wrote:
> >>>
> >>>> However, I don't think that there's the need to h
On 04/05/16 17:30, Evandro Menezes wrote:
On 04/05/16 13:37, Wilco Dijkstra wrote:
I can't get any of these to work... Not only do I get a large number
of collisions and duplicated
code between these patches, when I try to resolve them, all I get is
crashes whenever I try
to use sqrt
On 04/04/16 11:13, Evandro Menezes wrote:
On 04/01/16 18:08, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I hope that this gets in the ballpark of what's been discussed
previously.
Yes that's very close to what I had in mind. A minor issue is that
the vector
modes cannot work as
On 04/04/16 14:06, Evandro Menezes wrote:
On 04/01/16 17:52, Evandro Menezes wrote:
On 04/01/16 17:45, Wilco Dijkstra wrote:
Evandro Menezes wrote:
However, I don't think that there's the need to handle any special
case
for division. The only case when the approximation di
ave a patchset that applies
cleanly so I can
try all approximation routines?
Hi, Wilco.
The original patches should be independent of each other, so indeed they
duplicate code.
This patch suite should be suitable for testing.
HTH
--
Evandro Menezes
From cbc2b62f7df5c3e2fef2a24157b1bdd1a6de191b
On 04/01/16 17:52, Evandro Menezes wrote:
On 04/01/16 17:45, Wilco Dijkstra wrote:
Evandro Menezes wrote:
However, I don't think that there's the need to handle any special case
for division. The only case when the approximation differs from
division is when the numerator is infini
On 04/01/16 17:45, Evandro Menezes wrote:
On 03/24/16 14:11, Evandro Menezes wrote:
On 03/17/16 17:46, Evandro Menezes wrote:
This patch refactors the function to emit the reciprocal square root
approximation to also emit the square root approximation.
This version of the patch cleans up the
On 04/01/16 18:08, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I hope that this gets in the ballpark of what's been discussed previously.
Yes that's very close to what I had in mind. A minor issue is that the vector
modes cannot work as they start at MAX_MODE_FLOAT (whi
On 04/01/16 17:45, Wilco Dijkstra wrote:
Evandro Menezes wrote:
However, I don't think that there's the need to handle any special case
for division. The only case when the approximation differs from
division is when the numerator is infinity and the denominator, zero,
when the app
On 03/24/16 14:11, Evandro Menezes wrote:
On 03/17/16 17:46, Evandro Menezes wrote:
This patch refactors the function to emit the reciprocal square root
approximation to also emit the square root approximation.
This version of the patch cleans up the changes to the MD files and
fixes some bugs
On 04/01/16 16:22, Wilco Dijkstra wrote:
Evandro Menezes wrote:
The division variant should use the same latency reduction trick I mentioned
for sqrt.
I don't think that it applies here, since it doesn't have to deal with
special cases.
No it applies as it's exactly the same
On 03/31/16 04:52, James Greenhalgh wrote:
On Wed, Mar 30, 2016 at 11:18:27AM -0500, Evandro Menezes wrote:
Add scalar 0.0 to the aarch64_simd_reg_or_zero predicate.
2016-03-30 Evandro Menezes
* gcc/config/aarch64/predicates.md
(aarch64_simd_reg_or_zero predicate
On 04/01/16 08:58, Wilco Dijkstra wrote:
Evandro Menezes wrote:
On 03/23/16 11:24, Evandro Menezes wrote:
On 03/17/16 15:09, Evandro Menezes wrote:
This patch implements FP division by an approximation using the Newton
series.
With this patch, DF division is sped up by over 100% and SF
ument for the mode.
This patch allows a target to choose the mode of this operation when it
is beneficial to use the approximate version.
I hope that this gets in the ballpark of what's been discussed previously.
Thank you,
--
Evandro Menezes
From 17ac33719bae8966a481cc833c9ac06
On 04/01/16 09:06, James Greenhalgh wrote:
On Fri, Apr 01, 2016 at 02:47:05PM +0100, Wilco Dijkstra wrote:
Evandro Menezes wrote:
Ping^1
I haven't seen a newer version that incorporates my feedback. To recap what
I'd like to see is a more general way to select approximations based
On 04/01/16 08:47, Wilco Dijkstra wrote:
Evandro Menezes wrote:
Ping^1
I haven't seen a newer version that incorporates my feedback. To recap what
I'd like to see is a more general way to select approximations based on mode.
I don't believe that looking at the inner mode works
On 03/16/16 14:48, Evandro Menezes wrote:
On 02/03/16 13:46, Evandro Menezes wrote:
On 01/08/16 16:55, Evandro Menezes wrote:
On 12/16/2015 02:11 PM, Evandro Menezes wrote:
On 12/16/2015 05:24 AM, Richard Earnshaw (lists) wrote:
On 15/12/15 23:34, Evandro Menezes wrote:
On 12/14/2015 05:26
On 03/18/16 18:00, Evandro Menezes wrote:
On 03/18/16 17:20, Wilco Dijkstra wrote:
Evandro Menezes wrote:
On 03/18/16 10:21, Wilco Dijkstra wrote:
Hi Evandro,
For example, though this approximation improves the performance
noticeably for DF on A57, for SF, not so much, if at all.
I
On 03/23/16 11:24, Evandro Menezes wrote:
On 03/17/16 15:09, Evandro Menezes wrote:
This patch implements FP division by an approximation using the Newton
series.
With this patch, DF division is sped up by over 100% and SF division,
zilch, both on A57 and on M1.
gcc
Add scalar 0.0 to the aarch64_simd_reg_or_zero predicate.
2016-03-30 Evandro Menezes
* gcc/config/aarch64/predicates.md
(aarch64_simd_reg_or_zero predicate): Add the "const_double"
constraint.
It seems to me that the aarch64_simd_reg_or_zero should also
On 03/17/16 17:46, Evandro Menezes wrote:
This patch refactors the function to emit the reciprocal square root
approximation to also emit the square root approximation.
2016-03-23 Evandro Menezes
Wilco Dijkstra
gcc/
* config/aarch64/aarch64-tuning
On 03/17/16 15:09, Evandro Menezes wrote:
This patch implements FP division by an approximation using the Newton
series.
With this patch, DF division is sped up by over 100% and SF division,
zilch, both on A57 and on M1.
gcc/
* config/aarch64/aarch64-tuning-flags.def
Emit division using the Newton series
2016-03-17 Evandro Menezes
gcc/
* config/aarch64/aarch64-tuning-flags.def
(AARCH64_EXTRA_TUNE_APPROX_DIV_{SF,DF}): New tuning macros.
* config/aarch64/aarch64-protos.h
On 03/17/16 09:55, James Greenhalgh wrote:
On Wed, Mar 16, 2016 at 02:45:37PM -0500, Evandro Menezes wrote:
On 03/08/16 16:08, Evandro Menezes wrote:
On 02/16/16 14:56, Evandro Menezes wrote:
On 12/08/15 15:35, Evandro Menezes wrote:
Emit square root using the Newton series
2015-12-03
, not so much, if at all.
Feedback appreciated.
Thank you,
--
Evandro Menezes
From 95581aefcf324233c3603f4d8232ee18c5836f8a Mon Sep 17 00:00:00 2001
From: Evandro Menezes
Date: Thu, 17 Mar 2016 17:00:03 -0500
Subject: [PATCH] Add precision choices for the reciprocal square root
approximat
2016-03-16 Evandro Menezes
Wilco Dijkstra
gcc/
* config/aarch64/aarch64-tuning-flags.def
(AARCH64_EXTRA_TUNE_APPROX_SQRT_{SF,DF}): New tuning macros.
* config/aarch64/aarch64-protos.h
(aarch64_emit_approx_rsqrt): Replace with
, not so much, if at all.
Feedback appreciated.
Thank you,
--
Evandro Menezes
On 03/08/16 16:08, Evandro Menezes wrote:
On 02/16/16 14:56, Evandro Menezes wrote:
On 12/08/15 15:35, Evandro Menezes wrote:
Emit square root using the Newton series
2015-12-03 Evandro Menezes
gcc/
* config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt):
Declare new
tion patch makes the decision in the md
file which
does not seem a good idea).
I agree. Will modify it.
Thank you,
--
Evandro Menezes
Tweak the pipeline model for Exynos M1
* gcc/config/aarch64/aarch64.c
(exynosm1_tunings): Enable the weak prefetching model.
Committed as r234307.
--
Evandro Menezes
From a75d875a3c64180c9d6c368e2d87036d70f66036 Mon Sep 17 00:00:00 2001
From: evandro
D
On 03/10/16 19:06, Wilco Dijkstra wrote:
Evandro Menezes wrote:
That's what I had in mind too, but around the approximation for x^-1/2
and using masks for vector cases thusly:
fcmne v3.4s, v0.4s, #0.0
frsqrte v1.4s, v0.4s
fmul v2.4s, v1.4s,
On 03/18/16 17:20, Wilco Dijkstra wrote:
Evandro Menezes wrote:
On 03/18/16 10:21, Wilco Dijkstra wrote:
Hi Evandro,
For example, though this approximation improves the performance
noticeably for DF on A57, for SF, not so much, if at all.
I'm still skeptical that you ever can ge
On 02/03/16 13:46, Evandro Menezes wrote:
On 01/08/16 16:55, Evandro Menezes wrote:
On 12/16/2015 02:11 PM, Evandro Menezes wrote:
On 12/16/2015 05:24 AM, Richard Earnshaw (lists) wrote:
On 15/12/15 23:34, Evandro Menezes wrote:
On 12/14/2015 05:26 AM, James Greenhalgh wrote:
On Thu, Dec 03
On 03/10/16 19:06, Wilco Dijkstra wrote:
Evandro Menezes wrote:
That's what I had in mind too, but around the approximation for x^-1/2
and using masks for vector cases thusly:
fcmne v3.4s, v0.4s, #0.0
frsqrte v1.4s, v0.4s
fmul v2.4s, v1.4s,
fmul v2.4s, v1.4s, v1.4s
frsqrts v2.4s, v0.4s, v2.4s
fmul v1.4s, v1.4s, v2.4s
and v1.4s, v3.4s
fmul v0.4s, v1.4s, v0.4s
Thanks,
--
Evandro Menezes
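The vector sequence quoted above can be read as one Newton step on an estimate of x^-1/2, a mask that zeroes the estimate where x is 0, and a final multiply by x to obtain the square root. A minimal scalar sketch in C follows, under stated assumptions: the names `rsqrt_estimate`, `rsqrt_step` and `approx_sqrtf` are hypothetical, and the integer trick used for the first guess is only a stand-in for FRSQRTE, not what the hardware or the patch does. The refinement step uses the same arithmetic as FRSQRTS, (3 - a * b) / 2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for FRSQRTE: a rough first guess at 1/sqrt(x).
   The integer trick is an assumption for illustration only.  */
static float
rsqrt_estimate (float x)
{
  uint32_t bits;
  float y;

  memcpy (&bits, &x, sizeof bits);
  bits = 0x5f3759dfu - (bits >> 1);
  memcpy (&y, &bits, sizeof y);
  return y;
}

/* Same arithmetic as FRSQRTS: (3 - a * b) / 2.  */
static float
rsqrt_step (float a, float b)
{
  return (3.0f - a * b) * 0.5f;
}

/* Approximate sqrt(x) for x >= 0 as x * (1/sqrt(x)), refining the
   estimate with STEPS Newton iterations.  */
float
approx_sqrtf (float x, int steps)
{
  float e;
  int i;

  assert (x >= 0.0f && steps >= 0);
  e = rsqrt_estimate (x);
  for (i = 0; i < steps; i++)
    e = e * rsqrt_step (x, e * e);
  /* Scalar analogue of the compare-and-AND mask: without it, x == 0
     can turn the final product into NaN instead of 0.  */
  if (x == 0.0f)
    e = 0.0f;
  return e * x;
}
```

The mask matters because the reciprocal square root of zero is not finite, so multiplying it back by x would not give 0. Zeroing the estimate first keeps the result at 0 without a branch in the vector form.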
han it is today.
Thanks for the pointer, Wilco. Will work it in the patch.
--
Evandro Menezes
On 03/10/16 10:27, Evandro Menezes wrote:
On 03/10/16 07:23, James Greenhalgh wrote:
On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote:
On 03/01/16 13:08, Evandro Menezes wrote:
On 03/01/16 13:02, Wilco Dijkstra wrote:
Evandro Menezes wrote:
The meaning of these attributes is
On 03/10/16 07:23, James Greenhalgh wrote:
On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote:
On 03/01/16 13:08, Evandro Menezes wrote:
On 03/01/16 13:02, Wilco Dijkstra wrote:
Evandro Menezes wrote:
The meaning of these attributes is not clear to me. Is there a
reference
On 03/01/16 13:08, Evandro Menezes wrote:
On 03/01/16 13:02, Wilco Dijkstra wrote:
Evandro Menezes wrote:
The meaning of these attributes is not clear to me. Is there a
reference somewhere about which insns are FP or SIMD or neither?
The meaning should be clear, "fp" is a floa
On 03/08/16 16:08, Evandro Menezes wrote:
On 02/16/16 14:56, Evandro Menezes wrote:
On 12/08/15 15:35, Evandro Menezes wrote:
Emit square root using the Newton series
2015-12-03 Evandro Menezes
gcc/
* config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt):
Declare new
On 02/16/16 14:56, Evandro Menezes wrote:
On 12/08/15 15:35, Evandro Menezes wrote:
Emit square root using the Newton series
2015-12-03 Evandro Menezes
gcc/
* config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt):
Declare new
function.
* config
On 03/01/16 13:02, Wilco Dijkstra wrote:
Evandro Menezes wrote:
The meaning of these attributes is not clear to me. Is there a
reference somewhere about which insns are FP or SIMD or neither?
The meaning should be clear, "fp" is a floating point instruction, "simd" a
SI
On 02/29/16 12:07, Wilco Dijkstra wrote:
Evandro Menezes wrote:
Please, verify the new "simd" and "fp" attributes for SF and DF.
Both movsf and movdf should be:
(set_attr "simd" "*,yes,*,*,*,*,*,*,*,*")
(set_attr "fp" "*,*,*,yes,yes
On 02/26/16 17:42, Evandro Menezes wrote:
On 02/26/16 08:59, James Greenhalgh wrote:
On Mon, Feb 22, 2016 at 06:50:44PM -0600, Evandro Menezes wrote:
In preparation for the patch adding the Newton series also for
square root, I'd like to propose this patch changing the name of the
exi
On 02/26/16 08:59, James Greenhalgh wrote:
On Mon, Feb 22, 2016 at 06:50:44PM -0600, Evandro Menezes wrote:
In preparation for the patch adding the Newton series also for
square root, I'd like to propose this patch changing the name of the
existing tuning flag for the reciprocal square
On 02/26/16 06:37, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I have a question though: is it necessary to add the "fp" and "simd"
attributes to both movsf_aarch64 and movdf_aarch64 as well?
You need at least the "simd" attribute, but providing "fp"