The following test case gets miscompiled and fails when built with "-O -maltivec -mabi=altivec -include altivec.h" on GCC 4.1:
extern void abort (void); void test (vector float *p, int n) { int i; for (i = 0; i < n; i++) p[i] = vec_abs (p[i]); } int main (void) { vector float p = (vector float){ 0.5, 0.5, 0.5, 0.5 }; vector float q = p; test (&p, 1); if (memcmp (&p, &q, sizeof (p))) abort (); return 0; } The reason for this appears to be an abuse of RTL semantics by the "altivec_vspltisw_v4sf" pattern: (define_insn "altivec_vspltisw_v4sf" [(set (match_operand:V4SF 0 "register_operand" "=v") (vec_duplicate:V4SF (float:SF (match_operand:QI 1 "s5bit_cint_operand" "i"))))] "TARGET_ALTIVEC" "vspltisw %0,%1" [(set_attr "type" "vecperm")]) What this instruction does is to load an immediate *integer* value into a vector register, which happens to be re-interpreted as a floating point value (without changing the bit pattern). What the RTL associated with the pattern *says*, however, is to load the integer, *converted to floating point*, into the register, which is a quite different semantics. Now, since the pattern is only explicitly generated from within other expanders inside altivec.md (apparently), all of which expect the semantics of the actual Altivec instruction, not the semantics as literally specified in the RTL, those misinterpretations generally cancel each other and generated code behaves as expected. However, as soon as the middle-end gets an opportunity to run the RTL through the simplifier, everything breaks. This happens in particular when the load is being hoisted out of a loop due to being loop-invariant, as in the above test case: the vec_abs pattern expands via this expander ;; Generate ;; vspltisw SCRATCH1,-1 ;; vslw SCRATCH2,SCRATCH1,SCRATCH1 ;; vandc %0,%1,SCRATCH2 (define_expand "absv4sf2" [(set (match_dup 2) (vec_duplicate:V4SF (float:SF (const_int -1)))) (set (match_dup 3) (unspec:V4SF [(match_dup 2) (match_dup 2)] UNSPEC_VSLW)) (set (match_operand:V4SF 0 "register_operand" "=v") (and:V4SF (not:V4SF (match_dup 3)) (match_operand:V4SF 1 "register_operand" "v")))] "TARGET_ALTIVEC" { operands[2] = gen_reg_rtx (V4SFmode); operands[3] = gen_reg_rtx (V4SFmode); }) and the first two insns are in fact loop invariant. The problem in this particular test case is a regression in GCC 4.1, introduced by this patch: [PATCH] Improve scheduling of Altivec absolute value patterns http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01195.html -(define_insn "absv4sf2" - [(set (match_operand:V4SF 0 "register_operand" "=v") - (abs:V4SF (match_operand:V4SF 1 "register_operand" "v"))) - (clobber (match_scratch:V4SF 2 "=&v")) - (clobber (match_scratch:V4SF 3 "=&v"))] +;; Generate +;; vspltisw SCRATCH1,-1 +;; vslw SCRATCH2,SCRATCH1,SCRATCH1 +;; vandc %0,%1,SCRATCH2 +(define_expand "absv4sf2" + [(set (match_dup 2) + (vec_duplicate:V4SF (float:SF (const_int -1)))) + (set (match_dup 3) + (unspec:V4SF [(match_dup 2) (match_dup 2)] UNSPEC_VSLW)) + (set (match_operand:V4SF 0 "register_operand" "=v") + (and:V4SF (not:V4SF (match_dup 3)) + (match_operand:V4SF 1 "register_operand" "v")))] "TARGET_ALTIVEC" - "vspltisw %2,-1\;vslw %3,%2,%2\;vandc %0,%1,%3" - [(set_attr "type" "vecsimple") - (set_attr "length" "12")]) +{ + operands[2] = gen_reg_rtx (V4SFmode); + operands[3] = gen_reg_rtx (V4SFmode); +}) However, the underlying abuse of RTL semantics when describing the vspltisw instruction in V4SFmode apparently pre-dates this patch. The easiest way to fix this would appear to use an UNSPEC to describe the insn semantics. Any better idea? -- Summary: Miscompile of Altivec vec_abs (float) inside loop Product: gcc Version: 4.1.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: uweigand at gcc dot gnu dot org GCC target triplet: powerpc*-*-* http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27842