https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65078
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uros at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, in *.optimized the only change is this difference, repeated 16 times:
-  _62 = __builtin_ia32_vec_ext_v2di (_63, 0);
+  _62 = BIT_FIELD_REF <_63, 64, 0>;
And during expansion, the difference is:
-;; _62 = __builtin_ia32_vec_ext_v2di (_63, 0);
-
-(insn 42 41 43 (set (reg:V2DI 329)
-        (subreg:V2DI (reg:V16QI 138 [ D.4823 ]) 0)) ./include/emmintrin.h:722 -1
-     (nil))
-
-(insn 43 42 44 (set (reg:DI 330)
-        (vec_select:DI (reg:V2DI 329)
-            (parallel [
-                    (const_int 0 [0])
-                ]))) ./include/emmintrin.h:722 -1
-     (nil))
-
-(insn 44 43 0 (set (reg:DI 136 [ D.4825 ])
-        (reg:DI 330)) ./include/emmintrin.h:722 -1
-      (nil))
-
-;; MEM[(long long int *)dest_268] = _62;
-
-(insn 45 44 0 (set (mem:DI (reg/v/f:SI 317 [ dest ]) [3 MEM[(long long int *)dest_268]+0 S8 A64])
-        (reg:DI 136 [ D.4825 ])) ./include/emmintrin.h:722 -1
-      (nil))
+;; MEM[(long long int *)dest_268] = _62;
+ 
+(insn 42 41 43 (set (reg:TI 329)
+        (subreg:TI (reg:V16QI 138 [ D.4825 ]) 0)) ./include/emmintrin.h:722 -1
+      (nil))
+(insn 43 42 0 (set (mem:DI (reg/v/f:SI 317 [ dest ]) [3 MEM[(long long int *)dest_268]+0 S8 A64])
+        (subreg:DI (reg:TI 329) 0)) ./include/emmintrin.h:722 -1
+      (nil))

With the new storel_epi64 we get before RA:
(insn 43 40 44 3 (set (mem:DI (reg/v/f:SI 317 [ dest ]) [3 MEM[(long long int *)dest_268]+0 S8 A64])
        (subreg:DI (reg:V16QI 328) 0)) ./include/emmintrin.h:722 89 {*movdi_internal}
     (expr_list:REG_DEAD (reg:V16QI 328)
        (nil)))
Not surprisingly, the RA reloads this by storing V16QI reg 328 to the stack
and loading a DImode value back, while with the old intrinsic we have before
RA:
(insn 45 43 46 3 (set (mem:DI (reg/v/f:SI 317 [ dest ]) [3 MEM[(long long int *)dest_268]+0 S8 A64])
        (vec_select:DI (subreg:V2DI (reg:V16QI 328) 0)
            (parallel [
                    (const_int 0 [0])
                ]))) ./include/emmintrin.h:722 3660 {*vec_extractv2di_0_sse}
     (expr_list:REG_DEAD (reg:V16QI 328)
        (nil)))
and we don't need to spill.  Now the question is whether we can tell the RA
somehow (secondary reload) that to get a DImode lowpart subreg (and SImode
too?) out of a vector register it can use the *vec_extractv2di_0_sse
instruction.  Or add a !TARGET_64BIT pattern for storing a DImode lowpart
subreg of a vector register (any mode there?) into memory?  Or ensure that the
BIT_FIELD_REF is expanded as the builtin used to be?
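
For reference, a minimal sketch of the kind of source that reaches the
storel_epi64 code in emmintrin.h.  This is not the testcase from the PR; the
function names and the byte-array input are made up for illustration, and it
assumes an SSE2 target (e.g. -m32 -msse2 to mirror the !TARGET_64BIT case
discussed above):

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics, including _mm_storel_epi64 */

/* Hypothetical helper: load 16 bytes as a vector and store only its low
   64 bits to dest.  This is the vector-register-to-DImode-memory store
   that the RTL above is about.  */
void
store_low_qword (long long *dest, const unsigned char *src)
{
  __m128i v = _mm_loadu_si128 ((const __m128i *) src);
  _mm_storel_epi64 ((__m128i *) dest, v);
}

/* Hypothetical self-check: returns the stored low quadword and asserts
   that the neighbouring element was left untouched.  */
long long
demo_low_qword (void)
{
  unsigned char bytes[16];
  long long out[2] = { 0, -1 };
  int i;

  for (i = 0; i < 16; i++)
    bytes[i] = (unsigned char) (i + 1);

  store_low_qword (&out[0], bytes);
  assert (out[1] == -1);  /* only 8 bytes were written */
  return out[0];
}
```

Comparing the output of -fdump-tree-optimized and -fdump-rtl-expand for
store_low_qword between compiler versions should show whether the store is
expanded through a vec_select or through a DImode subreg.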
