On Tue, Nov 19, 2013 at 03:29:12PM +0000, Mathieu Desnoyers wrote:
> Hi,
>
> I got a bug report on ARM which appears to be caused by an aggressive
> gcc optimisation starting from gcc 4.8.x, due to the lack of constraints
> on the current_thread_info() inline assembly. The only logical
> explanation I see so far for his issue is that the read of the
> preempt_count within might_sleep() is reordered with preempt_enable()
> or preempt_disable(). AFAIU, this kind of issue also applies to other
> architectures.
>
> First thing, preempt enable/disable only contains barrier() between
> the inc/dec and the inside of the critical section, not the outside.
> Therefore, we are relying on program order to ensure that the
> preempt_count() read in might_sleep() is not reordered with the
> preempt count inc/dec.
>
> However, looking at ARM arch/arm/include/asm/thread_info.h:
>
> static inline struct thread_info *current_thread_info(void)
> {
> 	register unsigned long sp asm ("sp");
> 	return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
> }
>
> The inline assembly has no clobber and is not volatile. (This is also
> true for all other architectures I've looked at so far, which includes
> x86 and powerpc.)
>
> As long as someone does:
>
>   struct thread_info *myinfo = current_thread_info();
>
>   load from myinfo->preempt_count;
>   store to myinfo->preempt_count;
>
> program order should be preserved, because the read and the write are
> done wrt the same base. However, what we have here in the case of
> might_sleep() followed by preempt_enable() is:
>
>   load from current_thread_info()->preempt_count;
>   store to current_thread_info()->preempt_count;
>
> Since each current_thread_info() is a different asm ("sp") without
> clobber nor volatile, AFAIU, the compiler is within its rights to
> reorder them.
>
> One possible solution might be to add a "memory" clobber and volatile
> to this inline asm, but I fear it would put way too many constraints
> on the compiler's optimizations (too heavyweight).
>
> Have any of you experienced this issue? Any thoughts on the matter?
>
> I'm providing the original bug-report email exchange below. I had
> confirmation that adding barrier() outside
> preempt_enable()/preempt_disable() does indeed work around the issue,
> but I fear that this work-around would leave the current_thread_info()
> semantic gap in place: people expect program order to be respected.
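To make the failure mode concrete, here is a minimal, non-kernel sketch of
the pattern described above; it assumes ARM, and the THREAD_SIZE value, the
preempt_count field and the function name are made up purely for
illustration:

/*
 * Minimal, non-kernel sketch of the pattern described above (ARM
 * assumed; THREAD_SIZE and the preempt_count field are made up here
 * purely for illustration).
 */
#define THREAD_SIZE 8192

struct thread_info {
	int preempt_count;
};

static inline struct thread_info *current_thread_info(void)
{
	/* no clobber, not volatile: nothing ties two expansions together */
	register unsigned long sp asm ("sp");
	return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
}

void might_sleep_then_preempt_enable(void)
{
	/* the preempt_count read might_sleep() does */
	int count = current_thread_info()->preempt_count;

	/* the preempt_count dec preempt_enable() does */
	current_thread_info()->preempt_count--;

	/*
	 * The load and the store go through pointers produced by two
	 * independent asm ("sp") register reads, so, per the argument
	 * above, nothing visibly obliges gcc to keep them in program
	 * order, unlike the myinfo case where both go through one base.
	 */
	(void)count;
}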
This all brings to mind commit 509eb76ebf977 ("ARM: 7747/1: pcpu: ensure
__my_cpu_offset cannot be re-ordered across barrier()"). GCC seems overly
happy with reordering asm() thingies.

Ideally we'd be able to tell gcc that:

  register unsigned long *sp asm ("sp")

not only acts like a variable but should be treated like a variable, and
thus such reshuffling should be limited so as to preserve program order.

There doesn't seem to be any __attribute__() that can do this, and
currently our only recourse seems to be to add either volatile, which
seems overly strict as Mathieu said, or a fake input like Will did.
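For what it's worth, here is a rough, ARM-only sketch of what that
fake-input trick could look like if applied to current_thread_info()
itself. As I understand the trick from that commit, the dummy "Q" (*sp)
memory input keeps the asm from being moved across barrier()'s "memory"
clobber while still letting gcc cache the result in between. Illustrative
only, not a proposed patch:

/* ARM-only illustration, reusing the fake-input idea from 509eb76ebf977. */
static inline struct thread_info *current_thread_info(void)
{
	unsigned long sp_val;
	register unsigned long *sp asm ("sp");

	/*
	 * The "Q" (*sp) operand is a dummy memory input: the asm now
	 * appears to read memory, so it cannot be reordered across a
	 * "memory" clobber (i.e. across barrier()), yet gcc may still
	 * CSE the result between barriers, unlike volatile + "memory".
	 */
	asm ("mov %0, sp" : "=r" (sp_val) : "Q" (*sp));

	return (struct thread_info *)(sp_val & ~(THREAD_SIZE - 1));
}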