[Bug tree-optimization/67577] Trivial float-vectorization foiled by a loop

pinskia at gcc dot gnu.org Thu, 17 Dec 2015 17:20:36 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67577


Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |tree-optimization

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
For aarch64-linux-gnu on the trunk (GCC 6), we are able to produce the
vectorized code correctly:

        adrp    x1, .LANCHOR0
        add     x0, x1, :lo12:.LANCHOR0
        ldr     q0, [x1, #:lo12:.LANCHOR0]
        ldr     q1, [x0, 16]
        ldr     q4, [x0, 64]
        ldr     q3, [x0, 48]
        ldr     s2, [x0, 32]
        fsub    v4.4s, v4.4s, v1.4s
        fsub    v3.4s, v3.4s, v0.4s
        dup     v2.4s, v2.s[0]
        fmla    v1.4s, v2.4s, v4.4s
        fmla    v0.4s, v2.4s, v3.4s
        str     q1, [x0, 96]
        str     q0, [x0, 80]
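
Reading the assembly back, the two fsub/fmla pairs appear to compute
out = a + t * (b - a) over eight floats, four lanes at a time, with the
scalar t broadcast by dup. The C sketch below is a reconstruction from
the instructions above, not the testcase attached to the bug; the
function name lerp8, its parameters, and the use of restrict are
assumptions made for illustration.

```c
/* Hypothetical reconstruction of the computation in the assembly above.
   Eight floats are processed, which matches two 4-lane (v.4s) operations. */
void lerp8(float *restrict out, const float *restrict a,
           const float *restrict b, float t)
{
    for (int i = 0; i < 8; i++) {
        /* fsub: b[i] - a[i]; fmla: a[i] + t * (that difference) */
        out[i] = a[i] + t * (b[i] - a[i]);
    }
}
```

Whether a given compiler emits the same sequence for this sketch depends
on the target, the optimization level, and how the data is declared; the
assembly above addresses everything relative to a single anchor symbol,
which suggests static objects rather than pointer arguments.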
