[EMAIL PROTECTED] writes:

> void longcpy(long* _dst, long* _src, unsigned _numwords)
> {
>     asm volatile (
>         "cld    \n\t"
>         "rep    \n\t"
>         "movsl  \n\t"
>         // Outputs
>         :
>         // Inputs
>         : "S" (_src), "D" (_dst), "c" (_numwords)
>         // Clobbers
>         : "cc", "memory"
>     );
> }
>
> My interpretation of the problem:
>
> _dst, _src and _numwords will get clobbered, but I didn't
> care. Now if the compiler inlines the function, and later re-uses the
> register-cached values assuming them to be intact, then it all goes
> horribly wrong.
>
> But, if I specify the outputs like this:
>
>     // Outputs
>     : "=&S" (_src), "=&D" (_dst), "=&c" (_numwords)
>
> then the compiler is warned that the registers are clobbered and now
> contain some (undefined and unused) return values, and won't expect
> _src, _dst and _numwords to be intact in esi, edi, ecx.

Probably better to say that these are read-write operands, using the
'+' constraint.

> Now everything works fine at -O3. However, I really don't understand
> the '&' early clobber constraint modifier. What use is it?

It is needed for assembly code which has both outputs and inputs, and
which includes more than one instruction, such that at least one of
the outputs is generated by an instruction which runs before another
instruction which requires one of the inputs. The '&' constraint
tells gcc that some of the output operands are produced before some of
the input operands are used. gcc will then avoid allocating the input
and output operands to the same register.

Ian
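
For illustration only, here is a minimal sketch of the two points above,
assuming GCC or Clang on x86 or x86-64. The names wordcpy and add_early
are hypothetical and do not come from the original post. The copy
routine uses uint32_t and size_t rather than long and unsigned, because
movsl moves 4 bytes at a time while long is 8 bytes on LP64 targets, and
because rep uses the full count register in 64-bit mode. On other
targets the sketch falls back to plain C.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy n 32-bit words from src to dst (hypothetical helper).
 * "+S", "+D" and "+c" mark the three operands as read-write, so the
 * compiler knows esi/edi/ecx no longer hold the original values. */
static inline void wordcpy(uint32_t *dst, const uint32_t *src, size_t n)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile (
        "cld\n\t"
        "rep movsl"
        : "+S" (src), "+D" (dst), "+c" (n)  /* read-write operands */
        :                                   /* no separate inputs */
        : "cc", "memory"
    );
#else
    while (n--)                             /* portable fallback */
        *dst++ = *src++;
#endif
}

/* Return a + b (hypothetical helper) using two instructions.
 * The output %0 is written by the first instruction, before the
 * input %2 is read by the second, so %0 needs the '&' modifier.
 * Without it the compiler may place out and b in the same register,
 * and the movl would overwrite b before the addl reads it. */
static inline unsigned add_early(unsigned a, unsigned b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned out;
    __asm__ (
        "movl %1, %0\n\t"   /* out = a   (output written early) */
        "addl %2, %0"       /* out += b  (input read afterwards) */
        : "=&r" (out)
        : "r" (a), "r" (b)
        : "cc"
    );
    return out;
#else
    return a + b;           /* portable fallback */
#endif
}
```

Because wordcpy modifies only its own copies of the pointers and the
count, a caller's variables are unaffected even after inlining; the '+'
constraints simply stop the compiler from assuming the registers still
hold the values it loaded before the asm.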