This behavior happens on every call (the first and all subsequent ones).
Here is my code (simplified):
================================================================
start_time = MPI_Wtime()
call mpi_ext_barrier()
new_time = MPI_Wtime()-start_time
write(local_time,'(F9.1)') new_time*1.0e9_WP/(36.0_WP*36.0_WP*36.0_WP)
call print_message("CAST GHOST DATA2 LOOP 1 barrier "//trim(local_time),0)

do conn_index_id=1, Nconn(conn_type_id)
   ! loop over the linked list of data blocks
   this_data => block%data
   do while (associated(this_data))
      call MPI_IRECV(...)
      call MPI_ISEND(...)
      this_data => this_data%next
   enddo
enddo
start_time = MPI_Wtime()
call mpi_ext_barrier()
new_time = MPI_Wtime()-start_time
write(local_time,'(F9.1)') new_time*1.0e9_WP/(36.0_WP*36.0_WP*36.0_WP)
call print_message("CAST GHOST DATA2 LOOP 2 barrier "//trim(local_time),0)
done=.false.
counter = 0
do while (.not.done)
   do ireq=1,nreq
      if (recv_req(ireq)/=MPI_REQUEST_NULL) then
         call MPI_TEST(recv_req(ireq),found,mystatus,icommerr)
         if (found) then
            call ....  ! handle the completed receive (elided)
            counter=counter+1
         endif
      endif
   enddo
   if (counter==nreq) then
      done=.true.
   endif
enddo
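
(Aside: a minimal sketch of an equivalent completion loop, not the original code. MPI_TEST sets a completed request to MPI_REQUEST_NULL, so the polling loop above can also be written with MPI_WAITANY, which blocks until some request finishes instead of spinning. The names recv_req, nreq, mystatus, and icommerr come from the snippet; handle_request is a hypothetical stand-in for the elided per-request handler.)

   counter = 0
   do while (counter < nreq)
      ! blocks until one of the pending receives completes;
      ! ireq returns the index of the completed request
      call MPI_WAITANY(nreq, recv_req, ireq, mystatus, icommerr)
      if (ireq /= MPI_UNDEFINED) then
         call handle_request(ireq)  ! hypothetical handler for that receive
         counter = counter + 1
      endif
   enddo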
================================================================
The first call to the barrier works perfectly fine, but the second one gives
the strange behavior...
Ghislain.
On 8 Sept. 2011, at 16:53, Eugene Loh wrote:
> On 9/8/2011 7:42 AM, Ghislain Lartigue wrote:
>> I will check that, but as I said in my first email, this strange behaviour
>> happens in only one place in my code.
> Is the strange behavior on the first time, or much later on? (You seem to
> imply later on, but I thought I'd ask.)
>
> I agree the behavior is noteworthy, but it's plausible and there's not enough
> information to explain it based solely on what you've said.
>
> Here is one scenario. I don't know if it applies to you since I know very
> little about what you're doing. I think with VampirTrace, you can collect
> performance data into large buffers. Occasionally, the buffers need to be
> flushed to disk. VampirTrace will wait for a good opportunity to do so --
> e.g., a global barrier. So, you execute lots of barriers, but suddenly you
> hit one where VT wants to flush to disk. This takes a long time and everyone
> in the barrier spends a long time in the barrier. Then, execution resumes
> and barrier performance looks again like what it used to look like.
>
> Again, there are various scenarios to explain what you see. More information
> would be needed to decide which applies to you.
> _______________________________________________
> users mailing list
> [email protected]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>