This unspec_volatile tells the compiler that the instruction does not touch any registers, so IRA and reload are free to insert their instructions on either side of it. Lying to reload is bad news.

On May 29, 2010, at 8:26 AM, "H.J. Lu" <hjl.to...@gmail.com> wrote:

On Fri, May 28, 2010 at 9:08 PM, Vladimir N. Makarov
<vmaka...@redhat.com> wrote:
On 05/28/2010 12:38 PM, H.J. Lu wrote:

Hi,

I want to generate vzeroupper when I know the upper 128 bits aren't
used. I can't find a way to mark a pattern which zeros the upper
128 bits. So I added

;; Clear the upper 128bits of AVX registers, equivalent to a NOP.
;; This should be used only when the upper 128bits are unused.
(define_insn "avx_vzeroupper_nop"
  [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)]
  "TARGET_AVX"
  "vzeroupper"
  [(set_attr "type" "sse")
   (set_attr "modrm" "0")
   (set_attr "memory" "none")
   (set_attr "prefix" "vex")
   (set_attr "mode" "OI")])

For this simple code,

---
typedef float __m256 __attribute__ ((__vector_size__ (32),
         __may_alias__));

extern __m256 x, z;
extern void bar2 (void);

int
foo (__m256 y)
{
  bar2 ();
  z = y;
  return 0;
}
---

before IRA,

(insn 2 4 3 2 x.i:9 (set (reg/v:V8SF 59 [ y ])
        (reg:V8SF 21 xmm0 [ y ])) 1036 {*avx_movv8sf_internal}
    (expr_list:REG_DEAD (reg:V8SF 21 xmm0 [ y ])
        (nil)))

(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)

(insn 6 3 7 2 x.i:10 (unspec_volatile [
            (const_int 0 [0])
        ] 17) 1960 {avx_vzeroupper_nop} (nil))

(call_insn 7 6 8 2 x.i:10 (call (mem:QI (symbol_ref:DI ("bar2") [flags 0x41]<function_decl 0x7ffa930ecd00 bar2>) [0 S1 A8])
        (const_int 0 [0])) 599 {*call_0} (nil)
    (nil))


after IRA,

(insn 6 3 20 2 x.i:10 (unspec_volatile [
            (const_int 0 [0])
        ] 17) 1960 {avx_vzeroupper_nop} (nil))

(insn 20 6 7 2 x.i:10 (set (mem/c:V8SF (reg/f:DI 7 sp) [3 S32 A256])
        (reg:V8SF 21 xmm0)) 1036 {*avx_movv8sf_internal} (nil))

(call_insn 7 20 21 2 x.i:10 (call (mem:QI (symbol_ref:DI ("bar2") [flags 0x41]<function_decl 0x7ffa930ecd00 bar2>) [0 S1 A8])
        (const_int 0 [0])) 599 {*call_0} (nil)
    (nil))

Since vzeroupper changes the upper 128 bits of ymm0, the value saved on
the stack is wrong. Is there a way to tell IRA not to move an instruction?
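
To make the ordering problem concrete, here is a toy model in C. It is only a sketch and not GCC code: the type ymm_t and the functions model_vzeroupper, model_spill and the two *_preserves helpers are hypothetical names invented for illustration. It assumes a 256-bit register can be modelled as eight 32-bit lanes and that vzeroupper keeps lanes 0-3 and zeros lanes 4-7.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of one 256-bit AVX register as eight 32-bit lanes.
   Lanes 0-3 are the low (xmm) half, lanes 4-7 the upper half. */
typedef struct { uint32_t lane[8]; } ymm_t;

/* Model of vzeroupper on a single register: keep the low 128 bits,
   zero the upper 128 bits. */
static void model_vzeroupper(ymm_t *r)
{
    assert(r != NULL);
    memset(&r->lane[4], 0, 4 * sizeof r->lane[0]);
}

/* Model of the 256-bit caller-save store (insn 20 in the dump). */
static void model_spill(ymm_t *slot, const ymm_t *r)
{
    assert(slot != NULL && r != NULL);
    *slot = *r;
}

/* Order seen in the dump after IRA: vzeroupper (insn 6), then the
   spill (insn 20).  Returns 1 if the slot still holds the original y. */
static int spill_after_vzeroupper_preserves(ymm_t y)
{
    ymm_t reg = y, slot;
    model_vzeroupper(&reg);
    model_spill(&slot, &reg);
    return memcmp(&slot, &y, sizeof y) == 0;
}

/* The opposite order: spill first, then vzeroupper. */
static int spill_before_vzeroupper_preserves(ymm_t y)
{
    ymm_t reg = y, slot;
    model_spill(&slot, &reg);
    model_vzeroupper(&reg);
    return memcmp(&slot, &y, sizeof y) == 0;
}
```

In this model, spilling after vzeroupper preserves y only when its upper four lanes were already zero, which is exactly the "upper 128 bits are unused" condition that does not hold for y in foo.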



I think IRA itself is not responsible for this. It is probably reload or, more accurately, caller-saves.c. The result of reload is in the IRA dump, so it
only looks as if IRA did it.

Insn 20 was probably generated by caller-saves.c. I cannot find any special treatment in caller-saves.c of unspec_volatile as changing all registers, so I think the problem is there. But this is only speculation; some investigation is needed to be sure (e.g. where insn 20 is generated).



XMM0 is caller-saved. Insn 20 came from insn 2, which saves XMM0
onto the stack. IRA/reload wants to do that right before the call at -O2.
I opened a bug:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44323

--
H.J.
