On 8/31/07, Adam Nemet <[EMAIL PROTECTED]> wrote:
> "Matt Lee" <[EMAIL PROTECTED]> writes:
>
> > I am seeing poor scheduling in Dhrystone where a memcpy call is
> > expanded inline.
> >
> > memcpy (&dst, &src, 16) ==>
> >
> > load 1, rA + 4
> > store 1, rB + 4
> > load 2, rA + 8
> > store 2, rB + 8
> > ...
>
> Are you sure that there are no dependencies due to aliasing here?  The
> only similar thing that Dhrystone has to what you quote is between a
> pointer and a global variable and in fact there is an aliasing
> conflict there.
>
> If that is the case you can define a movmem pattern where you first
> load everything in one chunk and then store it later.  See MIPS's
> movmemsi pattern and the function mips_block_move_straight.
>
> Adam
>
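
For concreteness, the two orderings under discussion can be sketched in C. This is only an illustration, not compiler output: the names copy_interleaved, copy_grouped and check_schedules are hypothetical, and a C compiler is free to reschedule either body. The point is the dependency structure. In the interleaved form every load after the first follows a store that might alias it, while in the grouped form no load follows any store.

```c
#include <assert.h>
#include <string.h>

struct foo { int a[4]; };

/* Interleaved order, as in the quoted load/store listing: each store
   sits between two loads.  Unless p and q are known not to overlap,
   the load of a[i+1] cannot be moved above the store to a[i].  */
void copy_interleaved(struct foo *p, const struct foo *q)
{
    int t;
    t = q->a[0]; p->a[0] = t;
    t = q->a[1]; p->a[1] = t;
    t = q->a[2]; p->a[2] = t;
    t = q->a[3]; p->a[3] = t;
}

/* Grouped order, as in the "load everything in one chunk and then
   store it later" suggestion: all loads are issued before the first
   store, so no load has to wait on a possibly aliasing store.  */
void copy_grouped(struct foo *p, const struct foo *q)
{
    int t0 = q->a[0];
    int t1 = q->a[1];
    int t2 = q->a[2];
    int t3 = q->a[3];
    p->a[0] = t0;
    p->a[1] = t1;
    p->a[2] = t2;
    p->a[3] = t3;
}

/* Hypothetical self-check: for non-overlapping objects both orders
   produce the same copy.  */
void check_schedules(void)
{
    struct foo src = { { 1, 2, 3, 4 } };
    struct foo d1, d2;

    copy_interleaved(&d1, &src);
    copy_grouped(&d2, &src);
    assert(memcmp(&d1, &d2, sizeof d1) == 0);
    assert(memcmp(&d1, &src, sizeof src) == 0);
}
```

The grouped form needs one temporary per word copied, so it trades register pressure for freedom from load-after-store ordering constraints.
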

Adam, you are right.  There is an aliasing conflict in my test case.
However, I get the same effect when I use the restrict keyword on the
pointers.  Here is an even more reduced test case that shows the same
problem.

    #include <string.h>

    struct foo
    {
      int a[4];
    };

    void func (struct foo * restrict p, struct foo * restrict q)
    {
      memcpy (p, q, sizeof (struct foo));
    }

Perhaps restrict doesn't work.  In any case, I am trying to optimize
the case where there is clearly no aliasing.

Your suggestion regarding movmemsi is interesting.  I have not used
this pattern before and assumed that it was required only when
something special must be done to do block moves.  In my architecture,
a block move is not special and is equivalent to a series of loads and
stores.  Why do I need this pattern, and why/how does the aliasing
conflict become a non-issue when defining this pattern?  I apologize if
I am missing something basic here, but the GCC documentation regarding
this pattern doesn't tell me much about why it is required.

--
thanks,
Matt