On Thu, May 12, 2011 at 06:11:59PM +0200, Piotr Wyderski wrote:
> Unfortunately, on x86/x64 both are compiled in a rather poor way:
>
> __sync_increment:
>
>     lock addl $0x01,(ptr)
>
> which is longer than:
>
>     lock incl (ptr)

GCC already generates lock incl (ptr); it just depends on which CPU
you optimize for.

  /* X86_TUNE_USE_INCDEC */
  ~(m_PENT4 | m_NOCONA | m_CORE2I7 | m_GENERIC | m_ATOM),

So, if you say -mtune=bdver1 or -mtune=k8, it will generate incl.
If addl is better for the CPU you tune for (e.g. on Atom incl is very
bad compared to addl $1), it will generate addl instead.

	Jakub
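
To check this yourself, a minimal sketch along the following lines should do. The function name atomic_increment is made up for illustration and is not from the original report; the only real interface used is the __sync_fetch_and_add builtin. Because the fetched value is discarded, GCC is free to emit a single lock-prefixed instruction, and which one you get should follow the tuning flag as described above. Exact output may differ between GCC versions.

```c
/* Hypothetical test case, not taken from the original mail.
   Compile to assembly and compare the two outputs, e.g.:
     gcc -O2 -S -mtune=k8   test.c   (incl is expected here)
     gcc -O2 -S -mtune=atom test.c   (addl $1 is expected here)  */

/* Atomically add 1 to *ptr.  The old value returned by the builtin
   is ignored, so no xadd is needed to recover it.  */
void
atomic_increment (int *ptr)
{
  __sync_fetch_and_add (ptr, 1);
}
```

Diffing the two .s files shows only the instruction selection; it says nothing about which form is faster on a given CPU.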