Just for the record: I finally found the issue here. It's a combination of alignment and cache thrashing. When the grids are placed in aligned memory (e.g. via posix_memalign()) at a suitable offset within that memory, the effect goes away. So it's a processor effect, not a compiler issue. :-)
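A minimal sketch of such an allocation, for illustration only. The 4096-byte alignment, the 64-byte cache line size, the 16-line offset and the helper names alloc_grid() and page_offset() are assumptions, not values from the original programs; which offset actually removes the effect depends on the processor's cache geometry.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define GRID_WIDTH  1024
#define GRID_HEIGHT 1024
#define ALIGNMENT   4096          /* assumption: page-sized alignment */
#define CACHE_LINE  64            /* assumption: 64-byte cache lines */

typedef double row_t[GRID_WIDTH];

/* Hypothetical helper: allocates an ALIGNMENT-aligned block and returns a
 * grid that starts `offset` bytes into it. The block's start is stored in
 * *base; pass that pointer (not the returned grid) to free(). */
static row_t *alloc_grid(size_t offset, void **base)
{
    size_t bytes = sizeof(row_t) * GRID_HEIGHT;

    /* keep every double naturally aligned */
    assert(offset % sizeof(double) == 0);

    *base = NULL;
    if (posix_memalign(base, ALIGNMENT, bytes + offset) != 0)
        return NULL;
    return (row_t *)((char *)*base + offset);
}

/* Hypothetical helper: position of an address within its `period`-byte
 * window. Two grids with the same value start at the same position
 * relative to that period. */
static size_t page_offset(const void *p, size_t period)
{
    return (size_t)((uintptr_t)p % period);
}
```

In test(), gridOld and gridNew could then be obtained with something like alloc_grid(0, &baseOld) and alloc_grid(16 * CACHE_LINE, &baseNew), and page_offset() can be printed next to printAddress() to check that the two grids no longer start at the same position within a page.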

Best
-Andreas

On 16:19 Thu 09 Jul, Andreas Schäfer wrote:
> Hey guys,
>
> I noticed a strange performance hit in one of our stencil codes,
> causing it to run twice as long.
>
> To nail down the error, I reduced our code to the two attached demo
> programs. Basically they take two matrices and average each matrix
> element with its four direct neighbors. Depending on how these
> matrices are allocated, the performance hit occurs -- or does not.
>
> Here is the diff of the two files:
> @@ -17,8 +17,7 @@
>
>  void test(double (*grid)[GRID_WIDTH])
>  {
> -    double (*gridOld)[GRID_WIDTH] =
> -        malloc(GRID_WIDTH * GRID_HEIGHT * sizeof(double));
> +    double (*gridOld)[GRID_WIDTH] = gridOldArray;
>      double (*gridNew)[GRID_WIDTH] = gridNewArray;
>      printAddress(&gridNew[0][0]);
>      printAddress(&gridOld[0][0]);
>
> where gridOldArray is a statically allocated array. Depending on the
> machine's processor the performance hit varies from negligible to
> dramatic:
>
>
> Processor           GCC Version  Time(slow)  Time(fast)  Performance Hit
> ------------------  -----------  ----------  ----------  ---------------
> Core 2 Quad Q9550   4.3.3        12.19s       5.11s      138%
> Athlon 64 X2 3800+  4.3.3         7.34s       6.61s       11%
> Opteron 2378        4.3.2         6.13s       5.60s        9%
> Opteron 2352        4.3.3         8.16s       7.96s        2%
> Xeon 3.00GHz        4.3.3        18.98s      14.67s       29%
>
> Apparently Intel systems are more susceptible to this effect.
>
> Can anyone reproduce these results?
> And could anyone explain why this happens?
>
> Thanks in advance
> -Andreas
>
>
> --
> ============================================
> Andreas Schäfer
> Cluster and Metacomputing Working Group
> Friedrich-Schiller-Universität Jena, Germany
> 0049/3641-9-46376
> PGP/GPG key via keyserver
> I'm a bright... http://www.the-brights.net
> ============================================
>
> (\___/)
> (+'.'+)
> (")_(")
> This is Bunny. Copy and paste Bunny into your
> signature to help him gain world domination!

> #define GRID_WIDTH 1024
> #define GRID_HEIGHT 1024
> #define MAX_STEPS 1024
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> double grid[GRID_HEIGHT][GRID_WIDTH];
> double gridNewArray[GRID_HEIGHT][GRID_WIDTH];
> double gridOldArray[GRID_HEIGHT][GRID_WIDTH];
>
> void printAddress(void *p)
> {
>     printf("address %p\n", p);
> }
>
> void test(double (*grid)[GRID_WIDTH])
> {
>     double (*gridOld)[GRID_WIDTH] = gridOldArray;
>     double (*gridNew)[GRID_WIDTH] = gridNewArray;
>     printAddress(&gridNew[0][0]);
>     printAddress(&gridOld[0][0]);
>
>     // copy initial state
>     for (int y = 0; y < GRID_HEIGHT; ++y) {
>         memcpy(&gridOld[y][0], &grid[y][0], GRID_WIDTH * sizeof(double));
>         memset(&gridNew[y][0], 0, GRID_WIDTH * sizeof(double));
>     }
>
>     // update matrices
>     for (int step = 0; step < MAX_STEPS; ++step) {
>         for (int y = 1; y < GRID_HEIGHT-1; ++y)
>             for (int x = 1; x < GRID_WIDTH-1; ++x)
>                 gridNew[y][x] =
>                     (gridOld[y-1][x  ] +
>                      gridOld[y  ][x-1] +
>                      gridOld[y  ][x  ] +
>                      gridOld[y  ][x+1] +
>                      gridOld[y+1][x  ]) * 0.2;
>         double (*tmp)[GRID_WIDTH] = gridOld;
>         gridOld = gridNew;
>         gridNew = tmp;
>     }
>
>     // copy result back
>     for (int y = 0; y < GRID_HEIGHT; ++y)
>         memcpy(&grid[y][0], &gridOld[y][0], GRID_WIDTH * sizeof(double));
> }
>
> void setupGrid()
> {
>     for (int y = 0; y < GRID_HEIGHT; ++y)
>         for (int x = 0; x < GRID_WIDTH; ++x)
>             grid[y][x] = 0;
>
>     for (int y = 10; y < 20; ++y)
>         for (int x = 10; x < 20; ++x)
>             grid[y][x] = 1;
> }
>
> int main(int argc, char** argv)
> {
>     setupGrid();
>     test(grid);
>     printf("res: %f\n", grid[10][10]); // prevent dead code elimination
>     return 0;
> }

--
==========================================================
Andreas Schäfer
HPC and Grid Computing
Chair of Computer Science 3
Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
+49 9131 85-27910
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net
==========================================================

(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your
signature to help him gain world domination!