Hi,
I think I know what went wrong. The workload per thread was so small
(reading the CPU cycle counter and that was it) that the first threads had
already finished while the tasks were still being distributed.
Because the threads were not bound to cores, some cores were used several
times and could find the code already in their caches. The < 10000 cycle
results shown for OMP most likely come from these cases.
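The effect described above can be reproduced in a rough way outside the
benchmark. The following is only a sketch in Python, not the code of Parallel
or Parallel_OMP: time.perf_counter_ns() stands in for the CPU cycle counter,
and the names run_pass and finished_before_last_dispatch are made up for this
illustration. It starts a few threads whose only work is one clock read and
counts how many were already done when the last one was dispatched.

```python
import threading
import time

def run_pass(n_threads):
    """Start n_threads trivial workers; return (started_at, finished_at) in ns."""
    started_at = [0] * n_threads    # when the main thread dispatched worker i
    finished_at = [0] * n_threads   # when worker i had done its "work"

    def worker(i):
        # The whole workload: read a clock once (stand-in for the cycle counter).
        finished_at[i] = time.perf_counter_ns()

    threads = []
    for i in range(n_threads):
        t = threading.Thread(target=worker, args=(i,))
        started_at[i] = time.perf_counter_ns()
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return started_at, finished_at

def finished_before_last_dispatch(started_at, finished_at):
    """Count workers that were already done when the last worker was dispatched."""
    last_dispatch = started_at[-1]
    return sum(1 for f in finished_at[:-1] if f < last_dispatch)

started, finished = run_pass(8)
early = finished_before_last_dispatch(started, finished)
print(f"{early} of {len(started) - 1} earlier workers finished "
      f"before the last one was dispatched")
```

The count varies from run to run and from machine to machine. Whenever it is
above zero, threads were free to land on a core that had already run the same
code.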
I will update the benchmark to do some real work; there is no point in
repeating the measurements in a loop before that.
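Independent of the heavier workload, the lack of core binding could be
addressed by pinning each thread to one core. The following Python sketch shows
one way to do that on Linux; it is an assumption about a possible fix, not what
the benchmark does, and the names pin_and_report and run_pinned are made up for
this illustration. os.sched_setaffinity is only available on some platforms,
hence the hasattr() guard.

```python
import os
import threading

def pin_and_report(core, result, index):
    """Bind the calling thread to one core and record the resulting affinity."""
    # On Linux, pid 0 means "the calling thread" for these two calls.
    os.sched_setaffinity(0, {core})
    result[index] = os.sched_getaffinity(0)

def run_pinned(n_threads):
    """Start n_threads threads, each pinned to one of the allowed cores."""
    # Use only cores this process may run on; wrap around if there are
    # fewer allowed cores than threads.
    allowed = sorted(os.sched_getaffinity(0))
    cores = [allowed[i % len(allowed)] for i in range(n_threads)]
    result = [None] * n_threads
    threads = [threading.Thread(target=pin_and_report, args=(core, result, i))
               for i, core in enumerate(cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cores, result

if hasattr(os, "sched_setaffinity"):
    cores, result = run_pinned(4)
    for core, mask in zip(cores, result):
        print(f"thread asked for core {core}, affinity is now {mask}")
else:
    print("sched_setaffinity is not available on this platform")
```

With every thread bound to its own core, a pass can no longer benefit from a
core that has just executed the same code for another thread, provided there
are at least as many cores as threads.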
/// Jürgen
On 05/11/2014 05:02 PM, Juergen Sauermann wrote:
Hi Elias,
thanks, already interesting. If you could loop around the core count:
    for ((i=1; $i<=80; ++i)); do
        ./Parallel $i
        ./Parallel_OMP $i
    done
then I could understand the data better. I am also not sure whether
something is wrong with the benchmark program. On my new 4-core machine
with OMP I get fluctuations from:
eedjsa@server65 ~/apl-1.3/tools $ ./Parallel_OMP 4
Pass 0: 4 cores/threads, 8229949 cycles total
Pass 1: 4 cores/threads, 8262 cycles total
Pass 2: 4 cores/threads, 4035 cycles total
Pass 3: 4 cores/threads, 4126 cycles total
Pass 4: 4 cores/threads, 4179 cycles total
to:
eedjsa@server65 ~/apl-1.3/tools $ ./Parallel_OMP 4
Pass 0: 4 cores/threads, 11368032 cycles total
Pass 1: 4 cores/threads, 4042228 cycles total
Pass 2: 4 cores/threads, 7251419 cycles total
Pass 3: 4 cores/threads, 3846 cycles total
Pass 4: 4 cores/threads, 2725 cycles total
The fluctuations with the manual parallel for are smaller:
Pass 0: 4 cores/threads, 87225 cycles total
Pass 1: 4 cores/threads, 245046 cycles total
Pass 2: 4 cores/threads, 84632 cycles total
Pass 3: 4 cores/threads, 63619 cycles total
Pass 4: 4 cores/threads, 93437 cycles total
but still considerable. The picture so far suggests that OMP fluctuates
much more (in the start-up + sync time) than manual, with the highest OMP
start-up above manual and the lowest far below. One change on my TODO list
is to use futexes instead of mutexes (like OMP does), probably not an issue
under Solaris since futexes are Linux-specific.
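To put numbers on the fluctuations, the ratio between the slowest and the
fastest pass can be computed from the three runs quoted above. This is just
arithmetic on those figures, written as a short Python sketch; the helper name
spread is made up for this illustration.

```python
# Pass times in cycles, copied from the three runs quoted above.
omp_run_1 = [8229949, 8262, 4035, 4126, 4179]
omp_run_2 = [11368032, 4042228, 7251419, 3846, 2725]
manual = [87225, 245046, 84632, 63619, 93437]

def spread(cycles):
    """Ratio of the slowest pass to the fastest pass."""
    return max(cycles) / min(cycles)

for name, cycles in (("OMP run 1", omp_run_1),
                     ("OMP run 2", omp_run_2),
                     ("manual", manual)):
    print(f"{name:10s} min={min(cycles):>9d} max={max(cycles):>9d} "
          f"max/min={spread(cycles):8.1f}")
```

The two OMP runs have a max/min ratio of roughly 2000 and 4000, while the
manual run stays within a factor of about 4. The largest OMP value (11368032)
is above the largest manual value (245046), and the smallest OMP value (2725)
is far below the smallest manual value (63619).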
/// Jürgen
On 05/11/2014 04:23 AM, Elias Mårtenson wrote:
Here are the files that I promised earlier.
Regards,
Elias