The RDHWR instruction is used to support TLS on Linux/MIPS. For now it is always emulated by the kernel (in the Reserved Instruction exception handler), so the instruction is quite expensive.
If I compile this code with gcc 4.1.1 (-O2):

    extern __thread int x;

    int foo(int arg)
    {
            if (arg)
                    return x;
            return 0;
    }

I get this output:

    foo:
            .frame  $sp,0,$31       # vars= 0, regs= 0/0, args= 0, gp= 0
            .mask   0x00000000,0
            .fmask  0x00000000,0
            .set    noreorder
            .cpload $25
            .set    nomacro

            lw      $2,%gottprel(x)($28)
            .set    push
            .set    mips32r2
            rdhwr   $3,$29
            .set    pop
            addu    $2,$2,$3
            beq     $4,$0,$L4
            move    $3,$0

            lw      $3,0($2)
    $L4:
            j       $31
            move    $2,$3

The RDHWR is executed _before_ the "arg" value is evaluated. In the arg == 0 case, the RDHWR serves no purpose and is pure overhead. Without -O2, the RDHWR is executed _after_ the evaluation, so it is gcc's optimizer that reorders the RDHWR instruction.

I want the RDHWR to be executed only when it is really required. Is it possible to prevent such reordering for RDHWR?

This is a very common case in libc. On returning from a system call, libc checks the result and writes to "errno", which is thread-local. On a successful syscall we do not need to access "errno" at all, but gcc sometimes puts the RDHWR _before_ the check of the syscall result. This adds an extra exception to the syscall in the usual (successful) case.

I do not know much about gcc's internals. Any help is welcome.

Thank you.

---
Atsushi Nemoto
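
For illustration only, here is a rough sketch of the libc-style pattern described above, written so that the thread-local access sits in a separate non-inlined helper. It is a possible source-level workaround, not an answer to the compiler question, and I have not checked the code gcc 4.1.1 generates for it on MIPS. The names my_errno, fake_raw_syscall, set_errno_and_fail and wrapped_syscall are all hypothetical and do not come from any real libc.

```c
/* Hypothetical stand-in for the thread-local errno. */
static __thread int my_errno;

/* Hypothetical stand-in for a raw syscall: a negative return value
 * encodes an error code, anything else is a successful result. */
static long fake_raw_syscall(long arg)
{
    return arg >= 0 ? arg : -22;
}

/* Error path only. Because this helper is not inlined, the TLS access
 * (and so the RDHWR on MIPS) lives inside this function body and
 * cannot be hoisted into the caller's success path. */
static __attribute__((noinline)) long set_errno_and_fail(long err)
{
    my_errno = (int)err;
    return -1;
}

/* Success path: no thread-local variable is referenced here. */
long wrapped_syscall(long arg)
{
    long ret = fake_raw_syscall(arg);

    if (ret < 0)
        return set_errno_and_fail(-ret);
    return ret;
}
```

With this shape, wrapped_syscall() itself never references a thread-local variable, so a successful call should not need the thread pointer at all. The cost is an extra function call on the error path.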