https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78020
Bug ID: 78020
Summary: [Aarch64, ARM64] vuzp{1,2}q_f64 implementation identical to vzip{1,2}q_f64 in arm_neon.h and probably incorrect
Product: gcc
Version: 6.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: christophe.monat at st dot com
Target Milestone: ---

Created attachment 39829
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39829&action=edit
Suggested patch to fix the zip vs unzp intrinsic issue

I think that vuzp{1,2}q_f64 in gcc/config/aarch64/arm_neon.h are not correctly implemented.

For instance:

  vzip1q_f64 (float64x2_t __a, float64x2_t __b)
  (snippage)
    return __builtin_shuffle (__a, __b, (uint64x2_t) {0, 2});

and:

  vuzp1q_f64 (float64x2_t __a, float64x2_t __b)
  (snippage)
    return __builtin_shuffle (__a, __b, (uint64x2_t) {0, 2});

But according to the "ARM Architecture Reference Manual, ARMv8, for ARMv8-A architecture profile" (I am reading ARM DDI 0487A.i (ID 012816)), the semantics of zip1 and uzp1 differ (C3.5.18 is a convenient starting point to browse the architectural descriptions).

It looks to me that the correct implementation would be:

  vuzp1q_f64 (float64x2_t __a, float64x2_t __b)
  (snippage)
    return __builtin_shuffle (__a, __b, (uint64x2_t) {2, 0});

which would generate:

  zip1 v0.2d, v1.2d, v0.2d

instead of:

  zip1 v0.2d, v0.2d, v1.2d

and has the correct semantics, though it does not use the uzp1 mnemonic. (I expected uzp1 to appear in the generated code, and I scratched my head a bit and drew some diagrams to convince myself that I was hopefully correct.)

I have also noticed that vtrn{1,2}q_f64 are implemented in terms of zip{1,2}, but that seems OK semantically (though I had to check manually against the semantic description, with more drawings, to convince myself).

I have attached a patch suggesting a change in arm_neon.h, in case this analysis is correct.
I note that if https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70369 were completed, issues like this one would be less likely to occur.
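To make the two shuffle masks discussed above easier to compare, here is a minimal sketch in portable C of how a two-operand shuffle selects lanes for a 2-lane vector. It is only a model under stated assumptions: `f64x2_model` and `shuffle2_model` are hypothetical names that do not exist in arm_neon.h, the lane numbering follows the GCC vector-extension convention (indices 0..1 select from the first operand, 2..3 from the second), and the model rejects out-of-range indices rather than reducing them. It says nothing about which instruction the compiler emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for float64x2_t: two double lanes. */
typedef struct { double lane[2]; } f64x2_model;

/*
 * Hypothetical model of a two-operand shuffle on 2-lane vectors.
 * Mask indices 0..1 pick lanes of a, indices 2..3 pick lanes of b.
 * Assumption: indices are already in range 0..3 (checked by assert).
 */
static f64x2_model shuffle2_model(f64x2_model a, f64x2_model b,
                                  const uint64_t mask[2])
{
    f64x2_model r;
    for (int i = 0; i < 2; i++) {
        uint64_t idx = mask[i];
        assert(idx < 4);
        r.lane[i] = (idx < 2) ? a.lane[idx] : b.lane[idx - 2];
    }
    return r;
}
```

With a = {1.0, 2.0} and b = {3.0, 4.0}, this model returns {1.0, 3.0} for the mask {0, 2} and {3.0, 1.0} for the mask {2, 0}. The results can then be checked by hand against the lane diagrams in the architecture manual.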