On 18/05/16 01:58, Joseph Myers wrote:
On Tue, 17 May 2016, Matthew Wahab wrote:
As with the VFP FP16 arithmetic instructions, operations on __fp16
values are done by conversion to single-precision. Any new optimization
supported by the instruction descriptions can only apply to code
generated using intrinsics added in this patch series.
As with the scalar instructions, I think it is legitimate in most cases to
optimize arithmetic via single precision to work directly on __fp16 values
(and this would be natural for vectorization of __fp16 arithmetic).
Hi Joseph,
Currently, for vector types like V4HF there is no type promotion; on ARM
the operation survives until it reaches the vector lowering pass, where it
is split into scalar HF operations.  Those HF operations are then widened
into SF operations during RTL expansion, because we don't have scalar HF
support in the standard patterns.
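For example, a minimal sketch of what that lowering effectively computes
today for a four-lane HF addition (assuming __fp16 is enabled, e.g. with
-mfp16-format=ieee):

  /* Illustrative only: the V4HF add is split into four scalar operations,
     and each lane is widened to SF, added, and narrowed back, since there
     is no scalar addhf3 pattern today.  */
  void
  add_v4hf_lowered (__fp16 *r, const __fp16 *a, const __fp16 *b)
  {
    for (int i = 0; i < 4; i++)
      r[i] = (__fp16) ((float) a[i] + (float) b[i]);
  }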
Then:

* if we add scalar HF mode to the standard patterns, vector HF operations
  will be turned into scalar HF operations instead of scalar SF operations;
* if we add vector HF modes to the standard patterns, vector HF operations
  will generate vector HF instructions directly (both options are sketched
  below).
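Rough sketch of the two options.  The pattern names follow GCC's add<mode>3
convention, and the feature macros and the vadd_f16 intrinsic are my guesses
at the ACLE spellings this series uses, so treat the exact names as
assumptions:

  #ifdef __ARM_FEATURE_FP16_SCALAR_ARITHMETIC
  /* Option 1: with a scalar addhf3 pattern the V4HF op is still split by
     vector lowering, but each lane can become a direct HF add with no
     widening to SF.  */
  void
  lane_add (__fp16 *r, const __fp16 *a, const __fp16 *b)
  {
    *r = (__fp16) (*a + *b);
  }
  #endif

  #ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
  #include <arm_neon.h>
  /* Option 2: with a vector addv4hf3 pattern the V4HF op is not split at
     all and expands to a single vector HF add.  */
  float16x4_t
  vec_add (float16x4_t a, float16x4_t b)
  {
    return vadd_f16 (a, b);
  }
  #endif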
Will this still cause a precision inconsistency with older GCC when there
are cascaded vector float operations?
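To make the concern concrete, a minimal example of the kind of difference
I mean (values chosen so the first intermediate lands exactly halfway
between two HF values; assumes __fp16, e.g. -mfp16-format=ieee):

  #include <stdio.h>

  int
  main (void)
  {
    __fp16 a = 1.0f;
    __fp16 b = 0x1p-11f;   /* 2^-11, half an HF ulp at 1.0 */
    __fp16 c = 0x1p-11f;

    /* If cascaded ops keep their intermediate in SF (the widen-to-SF path,
       when the narrow/widen between the two adds gets optimised away):
       1 + 2^-11 + 2^-11 == 1 + 2^-10 exactly, narrowed once at the end.  */
    __fp16 via_sf = (__fp16) ((float) a + (float) b + (float) c);

    /* If each op rounds to HF (what direct HF instructions would do):
       1 + 2^-11 rounds back to 1.0, so the second add is absorbed too.  */
    __fp16 t = (__fp16) ((float) a + (float) b);
    __fp16 via_hf = (__fp16) ((float) t + (float) c);

    printf ("SF intermediate:  %a\n", (double) via_sf);  /* 1 + 2^-10 */
    printf ("HF per-op round:  %a\n", (double) via_hf);  /* 1.0 */
    return 0;
  }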
Thanks
Regards,
Jiong