[PATCH] D53633: [AArch64] Implement FP16FML intrinsics

Ahmed Bougacha via Phabricator via cfe-commits Fri, 15 Feb 2019 14:12:50 -0800

ab added inline comments.


================
Comment at: cfe/trunk/test/CodeGen/aarch64-neon-fp16fml.c:12
+
+float32x2_t test_vfmlal_low_u32(float32x2_t a, float16x4_t b, float16x4_t c) {
+// CHECK-LABEL: define <2 x float> @test_vfmlal_low_u32(<2 x float> %a, <4 x half> %b, <4 x half> %c)
----------------
SjoerdMeijer wrote:
> SjoerdMeijer wrote:
> > ab wrote:
> > > Hey folks, I'm curious: where does the "_u32" suffix come from? Should it 
> > > be _f16?
> > > 
> > > Also, are there any new ACLE/intrinsic list documents? As far as I can 
> > > tell there hasn't been any release since IHI0073B/IHI0053D.
> > 
> > I've checked, and an updated ACLE that includes these FP16FML intrinsics is 
> > coming soon.
> > 
> > > where does the "_u32" suffix come from? Should it be _f16?
> > 
> > Good question. It could probably be _f32 or _f16, but _u32 doesn't seem to 
> > make much sense. Looks like the spec says _u32, and that's also what GCC 
> > has implemented. I think we want to update the spec and fix the name before 
> > the updated spec is available. Will chase this, and let you know once I 
> > know more.
> An update on this: we should change this to _f32 (because the first suffixes 
> were referring to the output type). The ACLE will be updated accordingly, and 
> GCC will also change its current implementation (from _u32 to _f32). Many 
> thanks for raising this issue.
> Is there a volunteer to prepare a patch? Or do you have one already? :-) I 
> could look at it, but that will be towards the end of next week.
> I've checked, and an updated ACLE that includes these FP16FML intrinsics is 
> coming soon.

Great, thanks!

> An update on this: we should change this to _f32 (because the first suffixes 
> were referring to the output type).

Hmm, I was thinking _f16 based on the vmlal intrinsics: they seem to be named 
after the multiplication type rather than that of the accumulator/output.

Either way seems fine to me though, I'll defer to you folks.
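
To make the naming point concrete, here is a minimal scalar sketch in portable C. It is only a model under stated assumptions: the `model_*` types and functions are hypothetical stand-ins, not the real arm_neon.h definitions, half-precision values are represented as `float`, and rounding/fusing behaviour is not modelled. The real `vmlal_s16` takes an `int32x4_t` accumulator and two `int16x4_t` multiplicands, so its suffix names the multiplicand type; the FP16FML low-half form has the same shape with half-precision multiplicands and a single-precision accumulator.

```c
#include <stdint.h>

/* Hypothetical scalar stand-ins for NEON vector types (illustration only). */
typedef struct { int16_t v[4]; } model_int16x4;
typedef struct { int32_t v[4]; } model_int32x4;
typedef struct { float v[4]; } model_half4;   /* assumption: half modelled as float */
typedef struct { float v[2]; } model_float2;

/* Models vmlal_s16: 32-bit accumulator, 16-bit multiplicands.
 * The "_s16" suffix matches the multiplicand type, not the result type.
 * Accumulator overflow is not modelled; keep inputs small. */
static model_int32x4 model_vmlal_s16(model_int32x4 a, model_int16x4 b,
                                     model_int16x4 c) {
    for (int i = 0; i < 4; ++i)
        a.v[i] += (int32_t)b.v[i] * (int32_t)c.v[i];
    return a;
}

/* Models the FP16FML low-half form: only lanes 0 and 1 of b and c are
 * widened, multiplied, and accumulated into the two float lanes of a. */
static model_float2 model_vfmlal_low(model_float2 a, model_half4 b,
                                     model_half4 c) {
    for (int i = 0; i < 2; ++i)
        a.v[i] += b.v[i] * c.v[i];
    return a;
}
```

By analogy with `model_vmlal_s16`, a suffix following the multiplicand type would give _f16, while a suffix following the accumulator/output type would give _f32.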

> The ACLE will be updated accordingly, and also GCC will change its current 
> implementation (from _u32 to _f32). Many thanks for raising this issue.
> Is there a volunteer to prepare a patch? Or do you have one already? :-) I 
> could look at it, but that will be towards the end of next week.

Sure: D58306 (with _f16 though, let me know what you think of vmlal)

Thanks for checking!


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53633/new/

https://reviews.llvm.org/D53633



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
