This behavior happens on every call (the first and all subsequent ones).
Here is my code (simplified):
================================================================
start_time = MPI_Wtime()
call mpi_ext_barrier()
new_time = MPI_Wtime()-start_time
write(local_time,'(F9.1)') new_time*1.0e9_WP/(36.0_WP*36.0_WP*36.0_WP)
call print_message("CAST GHOST DATA2 LOOP 1 barrier "//trim(local_time),0)

do conn_index_id=1, Nconn(conn_type_id)  ! conn_type_id is set in an outer loop (elided here)
   ! loop over the data blocks and post the non-blocking exchanges
   this_data => block%data
   do while (associated(this_data))
      call MPI_IRECV(...)
      call MPI_ISEND(...)
      this_data => this_data%next
   enddo
enddo

start_time = MPI_Wtime()
call mpi_ext_barrier()
new_time = MPI_Wtime()-start_time
write(local_time,'(F9.1)') new_time*1.0e9_WP/(36.0_WP*36.0_WP*36.0_WP)
call print_message("CAST GHOST DATA2 LOOP 2 barrier "//trim(local_time),0)

! poll the receive requests until all of them have completed
done = .false.
counter = 0
do while (.not.done)
   do ireq=1,nreq
      if (recv_req(ireq)/=MPI_REQUEST_NULL) then
         call MPI_TEST(recv_req(ireq),found,mystatus,icommerr)
         if (found) then
            call ....
            counter = counter+1
         endif
      endif
   enddo
   if (counter==nreq) done = .true.
enddo
================================================================
The first call to the barrier works perfectly fine, but the second one shows the strange behaviour...

Ghislain.

On 8 Sep 2011, at 16:53, Eugene Loh wrote:

> On 9/8/2011 7:42 AM, Ghislain Lartigue wrote:
>> I will check that, but as I said in my first email, this strange behaviour
>> happens only in one place in my code.
> Does the strange behavior show up the first time through, or only much later
> on? (You seem to imply later on, but I thought I'd ask.)
>
> I agree the behavior is noteworthy, but it is plausible, and there is not
> enough information to explain it based solely on what you have said.
>
> Here is one scenario. I don't know if it applies to you, since I know very
> little about what you're doing. With VampirTrace, you collect performance
> data into large buffers. Occasionally, those buffers need to be flushed to
> disk. VampirTrace waits for a good opportunity to do so -- e.g., a global
> barrier. So you execute lots of barriers, and then suddenly you hit one where
> VT wants to flush to disk. That flush takes a long time, and everyone in that
> barrier ends up spending a long time in it. Then execution resumes, and
> barrier performance goes back to what it looked like before.
>
> Again, there are various scenarios that could explain what you see. More
> information would be needed to decide which one applies to you.
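
A side note on the completion loop in the snippet above: the do while/MPI_TEST
construct busy-polls until all receives finish. Below is a minimal sketch of
the same logic using MPI_WAITANY, which blocks until some request completes and
resets that entry to MPI_REQUEST_NULL itself. The names recv_req and nreq
follow the snippet; the program scaffolding and nreq_max are assumptions made
for illustration only.
================================================================
program waitany_sketch
   use mpi
   implicit none
   integer, parameter :: nreq_max = 64        ! assumed capacity
   integer :: recv_req(nreq_max)
   integer :: mystatus(MPI_STATUS_SIZE)
   integer :: nreq, ireq, icommerr

   call MPI_INIT(icommerr)

   nreq = 0
   recv_req(:) = MPI_REQUEST_NULL
   ! ... post nreq MPI_IRECVs here, one request per recv_req entry ...

   do
      ! Blocks until one outstanding receive finishes; ireq is its index.
      ! Completed entries are set to MPI_REQUEST_NULL by MPI itself.
      call MPI_WAITANY(nreq, recv_req, ireq, mystatus, icommerr)
      if (ireq == MPI_UNDEFINED) exit   ! every request has completed
      ! ... process the data belonging to request ireq here ...
   enddo

   call MPI_FINALIZE(icommerr)
end program waitany_sketch
================================================================
If nothing has to be done per individual request, a single MPI_WAITALL over
recv_req is simpler still.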
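
Relatedly, one way to tell "the barrier itself is slow" apart from "one rank
arrives late" (for example, because it is busy flushing a trace buffer, as in
the scenario quoted above) is to compare how long each rank spends inside the
barrier. A minimal, self-contained sketch with all names assumed:
================================================================
program barrier_timing
   use mpi
   implicit none
   integer :: rank, icommerr
   double precision :: t0, t_in_barrier, t_max, t_min

   call MPI_INIT(icommerr)
   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, icommerr)

   ! Time only the barrier on each rank: ranks that arrive early wait
   ! a long time, the late arriver spends almost no time inside.
   t0 = MPI_WTIME()
   call MPI_BARRIER(MPI_COMM_WORLD, icommerr)
   t_in_barrier = MPI_WTIME() - t0

   call MPI_REDUCE(t_in_barrier, t_max, 1, MPI_DOUBLE_PRECISION, &
                   MPI_MAX, 0, MPI_COMM_WORLD, icommerr)
   call MPI_REDUCE(t_in_barrier, t_min, 1, MPI_DOUBLE_PRECISION, &
                   MPI_MIN, 0, MPI_COMM_WORLD, icommerr)
   if (rank == 0) print '(a,2es12.3)', 'barrier time max/min (s): ', t_max, t_min

   call MPI_FINALIZE(icommerr)
end program barrier_timing
================================================================
A large max/min spread means most ranks were waiting for a straggler, which
matches the flush scenario; a uniformly large time on all ranks would point at
the barrier itself.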