On 04/04/16 11:13, Evandro Menezes wrote:
On 04/01/16 18:08, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I hope that this gets in the ballpark of what's been discussed
previously.
Yes that's very close to what I had in mind. A minor issue is that the
vector modes cannot work as they start at MAX_MODE_FLOAT (which is > 32):
+/* Control approximate alternatives to certain FP operators. */
+#define AARCH64_APPROX_MODE(MODE) \
+ ((MIN_MODE_FLOAT <= (MODE) && (MODE) <= MAX_MODE_FLOAT) \
+ ? (1 << ((MODE) - MIN_MODE_FLOAT)) \
+ : (MIN_MODE_VECTOR_FLOAT <= (MODE) && (MODE) <= MAX_MODE_VECTOR_FLOAT) \
+ ? (1 << ((MODE) - MIN_MODE_VECTOR_FLOAT + MAX_MODE_FLOAT + 1)) \
+ : (0))
That should be:
+ ? (1 << ((MODE) - MIN_MODE_VECTOR_FLOAT + MAX_MODE_FLOAT - MIN_MODE_FLOAT + 1)) \
It would be worth testing all the obvious cases to be sure they work.
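
A minimal sketch of such a check, for illustration only. The mode numbers
below are hypothetical placeholders, not the values of GCC's real
enum machine_mode, and the helper names approx_bit_original,
approx_bit_corrected and approx_mask are made up for this example. The
helpers return the bit index instead of shifting directly, so that the
out-of-range case can be inspected without an undefined shift:

```c
#include <assert.h>

/* Hypothetical mode numbering (assumption, NOT GCC's real values).  */
enum
{
  MIN_MODE_FLOAT = 40,
  MAX_MODE_FLOAT = 44,
  MIN_MODE_VECTOR_FLOAT = 70,
  MAX_MODE_VECTOR_FLOAT = 80
};

/* Bit index as computed by the macro in the patch, or -1 for other modes.  */
static int
approx_bit_original (int mode)
{
  if (MIN_MODE_FLOAT <= mode && mode <= MAX_MODE_FLOAT)
    return mode - MIN_MODE_FLOAT;
  if (MIN_MODE_VECTOR_FLOAT <= mode && mode <= MAX_MODE_VECTOR_FLOAT)
    return mode - MIN_MODE_VECTOR_FLOAT + MAX_MODE_FLOAT + 1;
  return -1;
}

/* Bit index with the suggested correction: the vector modes follow
   directly after the scalar float modes.  */
static int
approx_bit_corrected (int mode)
{
  if (MIN_MODE_FLOAT <= mode && mode <= MAX_MODE_FLOAT)
    return mode - MIN_MODE_FLOAT;
  if (MIN_MODE_VECTOR_FLOAT <= mode && mode <= MAX_MODE_VECTOR_FLOAT)
    return mode - MIN_MODE_VECTOR_FLOAT + MAX_MODE_FLOAT - MIN_MODE_FLOAT + 1;
  return -1;
}

/* Mask for a mode using the corrected index; 0 for non-FP modes.  */
static unsigned int
approx_mask (int mode)
{
  int bit = approx_bit_corrected (mode);
  return bit < 0 ? 0u : 1u << bit;
}
```

With these placeholder numbers, the original expression puts the first
vector mode at bit 45, which does not fit in a 32-bit mask, while the
corrected one puts it at bit 5, right after the five scalar modes.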
Also I don't think it is a good idea to enable all modes on Exynos-M1 and
XGene-1 - I haven't seen any evidence that shows it gives a speedup on
real code for all modes (or at least on a good micro benchmark like the
unit vector test I suggested - a simple throughput test does not count!).
This approximation does benefit M1 in general across several
benchmarks. As for my choice for XGene-1, it preserves the original
setting. I believe that, with this more granular option, developers
can fine-tune their targets.
The issue is it hides performance gains from an improved divider/sqrt on
new revisions or microarchitectures. That means you should only enable
cases where there is evidence of a major speedup that cannot be matched
by a future improved divider/sqrt.
I did notice that, in some benchmarks with heavy use of multiplication or
multiply-accumulation, the series may be detrimental, since it
increases the competition for the unit(s) that perform such operations.
But those micro-architectures that get a better unit for division or
sqrt() are free to add their own tuning parameters. Granted, I assume
that running legacy code is a non-issue in only a few markets.
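
To illustrate why the series competes for the multiply unit(s), here is a
rough sketch, assuming that "the series" refers to Newton-Raphson
refinement of a reciprocal square root estimate. The function names
rsqrt_step and rsqrt_refine are hypothetical, and the initial estimate is
simply passed in rather than taken from a hardware estimate instruction:

```c
#include <assert.h>

/* One Newton-Raphson step for y ~ 1/sqrt(x):
     y' = y * (1.5 - 0.5 * x * y * y)
   As written, this costs four multiplications and one subtraction.  */
static double
rsqrt_step (double x, double y)
{
  return y * (1.5 - 0.5 * x * y * y);
}

/* Refine a caller-supplied estimate with a fixed number of steps.
   Every extra step adds more multiplications on the same unit(s)
   that the surrounding code may already be saturating.  */
static double
rsqrt_refine (double x, double estimate, int steps)
{
  double y = estimate;
  for (int i = 0; i < steps; i++)
    y = rsqrt_step (x, y);
  return y;
}
```

For example, rsqrt_refine (4.0, 0.4, 3) moves the rough estimate 0.4
to within about 2e-5 of the exact value 0.5, at the cost of a dozen
multiplications that a hardware sqrt or divide would not issue.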
Ping^1
--
Evandro Menezes