Hi All,

I discovered some strange behavior in the SPEC CPU2006 436.cactusADM
benchmark. Its performance depends on the length of the
$LD_LIBRARY_PATH variable.
The benchmark was compiled with "-O3 -funroll-loops -ffast-math
-march=core-avx2 -mtune=core-avx2" using gcc version 4.8.0 20130218.
I used Intel Software Development Emulator 5.38.0 to run AVX2 code:

$ export LD_LIBRARY_PATH=''
$ /sde-bdw-external-5.38.0-2013-01-03-lin/sde -icount -- ./cactusADM
benchADM.par > benchADM.out 2> benchADM.err
ICOUNT: 2593690591

$ export LD_LIBRARY_PATH='aaaaaaaaaaaaaaaaaaaa'
$ /sde-bdw-external-5.38.0-2013-01-03-lin/sde -icount -- ./cactusADM
benchADM.par > benchADM.out 2> benchADM.err
ICOUNT: 2100709724

In the second case about 19% fewer instructions are executed (put the
other way, the first run executes about 23% more than the second)!

The difference is caused by code in the benchmark that checks the
alignment of some pointer. It seems that the environment variables
(including $LD_LIBRARY_PATH) are placed in the app’s memory, so their
total length shifts the addresses that follow. This changes the
alignment and therefore the execution path taken.
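
To make the effect easier to reproduce outside the benchmark, here is a
minimal sketch in C. It is not taken from cactusADM: the helper name
misalignment(), the ALIGN_DEMO_MAIN macro and the on_stack buffer are all
made up for illustration, and the 32-byte boundary is my assumption
(the width of one AVX2 vector). It only reports how far a stack buffer
sits from a 32-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bytes by which p overshoots the previous multiple of `boundary`.
   `boundary` must be a power of two; 32 is assumed here because it
   is the size of one 256-bit AVX2 vector. */
static size_t misalignment(const void *p, size_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (size_t)((uintptr_t)p & (uintptr_t)(boundary - 1));
}

#ifdef ALIGN_DEMO_MAIN
int main(void)
{
    /* Hypothetical stand-in for a stack buffer; not from cactusADM. */
    double on_stack[4];

    printf("stack buffer at %p, offset within 32 bytes: %zu\n",
           (void *)on_stack, misalignment(on_stack, 32));
    return 0;
}
#endif
```

Building this with -DALIGN_DEMO_MAIN and running it with different
$LD_LIBRARY_PATH lengths should show whether the reported offset moves
with the environment size. On systems with address space layout
randomization the offset may also change from run to run regardless of
the environment, so that needs to be taken into account when comparing.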

Is this a known issue? Is there anything GCC can do to make the
alignment more consistent?

Thanks,
Ilya
