Etienne Lorrain writes:
 > > The correct version is I think,
 > >
 > > void longcpy(long* _dst, long* _src, unsigned _numwords)
 > > {
 > >     asm volatile (
 > >         "cld    \n\t"
 > >         "rep    \n\t"
 > >         "movsl  \n\t"
 > >         // Outputs (read/write)
 > >         : "=S" (_src), "=D" (_dst), "=c" (_numwords)
 > >         // Inputs - specify same registers as outputs
 > >         : "0" (_src), "1" (_dst), "2" (_numwords)
 > >         // Clobbers: direction flag, so "cc", and "memory"
 > >         : "cc", "memory"
 > >     );
 > > }
 >
 > I did not re-check with GCC-4.1.1, but I noticed problems with this
 > kind of "memory" clobber: when the source you are copying from is
 > not in memory but (is a structure) in the stack. I have to say that
 > I tend to use a form without "volatile" after the asm (one of the
 > results has to be used then).
 >
 > The usual symptom is that the memcopy is done, but the *content* of the
 > source structure is not updated *before* the memcopy: nothing in your
 > asm says that the content of your pointer has to be up-to-date.
 >
 > The "memory" says that main memory will be changed, not that it will be
 > used, and if you are memcopy-ing from a structure in stack - for instance
 > a structure which fits in a register - you may have problems.

Why, exactly? The structure has its address taken, and therefore at
the point at which the asm is invoked it'll be in memory.

For sure, if you fail to use volatile the compiler may decide that the
asm can be deleted. So use volatile: that's what it's for.

If the compiler fails to write a struct to memory before the asm is
executed, that's a bug in the compiler.

 > I did not really experiment with __builtin_memcpy(), is it treated
 > specially or like a standard function call?

It's treated specially.

Andrew.
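
For illustration, here is a minimal sketch of the case under discussion:
copying out of a small structure that lives on the stack. It is not the
code from the thread. The names wordcpy, struct pair and copy_local_pair
are made up for this example; it uses "+" (read/write) constraints in
place of the matched "0"/"1"/"2" inputs, takes uint32_t words because
movsl moves four bytes at a time, and falls back to __builtin_memcpy on
non-x86 targets. As far as I understand the GCC documentation, the
"memory" clobber covers reads as well as writes, so the stores to the
source structure should be completed before the asm runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy `numwords` 32-bit words from src to dst. */
void wordcpy(uint32_t *dst, const uint32_t *src, size_t numwords)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile (
        "cld\n\t"
        "rep movsl"
        /* Read/write operands: the instruction advances all three. */
        : "+S" (src), "+D" (dst), "+c" (numwords)
        : /* no separate inputs */
        /* "cc" for the flags; "memory" because the asm reads and
           writes memory that is not named by any operand. */
        : "cc", "memory"
    );
#else
    /* Assumption: on other targets, just let the compiler do it. */
    __builtin_memcpy(dst, src, numwords * sizeof *dst);
#endif
}

/* A structure small enough to fit in a register. */
struct pair { uint32_t a, b; };

/* Fill a local structure, then copy it with wordcpy(). */
struct pair copy_local_pair(uint32_t a, uint32_t b)
{
    struct pair src;
    struct pair dst = { 0, 0 };

    assert(sizeof src % sizeof(uint32_t) == 0);

    src.a = a;   /* these stores must reach memory before the asm */
    src.b = b;
    wordcpy((uint32_t *)&dst, (const uint32_t *)&src,
            sizeof src / sizeof(uint32_t));
    return dst;
}
```

If the copy ever returned stale contents here with "volatile" and the
"memory" clobber in place, that would point at a compiler bug rather
than at the asm.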