[EMAIL PROTECTED] writes:

> void longcpy(long* _dst, long* _src, unsigned _numwords)
> {
>     asm volatile (
>         "cld         \n\t"
>         "rep         \n\t"
>         "movsl       \n\t"
>       // Outputs
>         :
>       // Inputs
>         : "S" (_src), "D" (_dst), "c" (_numwords)
>       // Clobbers
>         : "cc", "memory"
>         );
> }
> 
> My interpretation of the problem:
> 
> _dst, _src and _numwords will get clobbered, but I didn't
> care. Now if the compiler inlines the function, and later re-uses the
> register-cached values assuming them to be intact, then it all goes
> horribly wrong.
> 
> But, if I specify the outputs like this:
> 
>       // Outputs
>         : "=&S" (_src), "=&D" (_dst), "=&c" (_numwords)
> 
> then the compiler is warned that the registers are clobbered and now
> contain some (undefined and unused) return values, and won't expect
> _src, _dst and _numwords to be intact in esi, edi, ecx.

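It is probably better to declare these as read-write operands, using the
'+' constraint modifier.  Each operand then appears once, in the output
list, and gcc knows that the asm both reads and modifies it.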
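
As a rough sketch of what that could look like (this is not code from the
original post): the function name longcpy_rw, the use of uint32_t and
size_t instead of long and unsigned, and the plain C fallback for non-x86
targets are all assumptions made so that the example stays correct on both
32-bit and 64-bit x86.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant of the quoted longcpy.  uint32_t is used because
 * movsl always copies 4 bytes, while long is 8 bytes on x86-64. */
static void longcpy_rw(uint32_t *dst, const uint32_t *src, size_t numwords)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile (
        "cld        \n\t"
        "rep movsl  \n\t"
      /* Outputs: '+' marks each operand as both read and written, so gcc
       * will not assume the registers still hold the original values. */
        : "+S" (src), "+D" (dst), "+c" (numwords)
      /* Inputs: none, the read-write operands above cover them. */
        :
      /* Clobbers */
        : "cc", "memory"
        );
#else
    /* Assumed fallback so the sketch still builds on other targets. */
    while (numwords--)
        *dst++ = *src++;
#endif
}
```

After the call, the local copies of src, dst and numwords hold whatever
the instruction left in the registers; the caller's own variables are
unaffected because the arguments are passed by value.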

> Now everything works fine at -O3. However, I really don't understand
> the '&' early clobber constraint modifier. What use is it?

It is needed for assembly code which has both outputs and inputs, and
which includes more than one instruction, such that at least one of
the outputs is generated by an instruction which runs before another
instruction which requires one of the inputs.  The '&' constraint
tells gcc that some of the output operands are produced before some of
the input operands are used.  gcc will then avoid allocating the input
and output operands to the same register.
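
A small illustration of that situation, as a sketch only: the function
add_twice below is made up for this example, and the non-x86 fallback is
an assumption added for portability.  The first instruction writes the
output while the input b is still needed by the later instructions, so
the output must not share a register with b.

```c
/* Hypothetical example: computes a + b + b. */
static unsigned add_twice(unsigned a, unsigned b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned result;
    __asm__ (
        "movl %1, %0 \n\t"   /* result is written here ...            */
        "addl %2, %0 \n\t"   /* ... but b is only read from here on   */
        "addl %2, %0 \n\t"
      /* '&' tells gcc that result is written before all inputs have
       * been consumed, so it gets a register distinct from a and b. */
        : "=&r" (result)
        : "r" (a), "r" (b)
        : "cc"
        );
    return result;
#else
    /* Assumed fallback so the sketch still builds on other targets. */
    return a + b + b;
#endif
}
```

With a plain "=r" instead of "=&r", gcc would be free to put result in
the register holding b, and the initial movl would then destroy b before
the addl instructions read it.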

Ian
