Hello,

I have instrumented my Fortran code with "timers" in the following way:

==================================================

! Outer timer: wraps both timed regions below
start_0 = MPI_Wtime()

! Timer 1: time spent in foo()
start_1 = MPI_Wtime()
call foo()
end_1 = MPI_Wtime()
write(*,*) "timer1 = ",end_1-start_1

! Timer 2: time spent in bar()
start_2 = MPI_Wtime()
call bar()
end_2 = MPI_Wtime()
write(*,*) "timer2 = ",end_2-start_2

end_0 = MPI_Wtime()
write(*,*) "timer0 = ",end_0-start_0

==================================================

When I run my code on a "small" number of processors, I find that
timer0 = timer1 + timer2 to very good precision (the difference is less than 1%).
However, as I increase the number of processors, this no longer holds: the
discrepancy can reach 10%, 20%, or even more.
The more processors I use, the larger the discrepancy.

Obviously, my code is much bigger than the simple example above, but the 
principle is exactly the same.

Does anyone have an idea what could cause this?

Thanks!
G.

PS:
Of course, each processor writes its timers to its own individual file: the
discrepancy is nearly the same on every processor.
