Hi all,

I'm trying to run my code on a cluster here with InfiniBand. It is
written in Fortran 95/2003 and uses MPI-IO for output. I'm using
Open MPI 1.5.5. It has been running fine, but for one particular
configuration, using all of the cluster's cores (128, spread over 4
boxes with 4 octo-core Opterons each), it hangs while calling MPI-IO.

So what I am asking for is help in debugging this. This is the relevant
part of the code:
    CALL MPI_File_set_view(fh, disp, etype, filetype, &
                           TRIM(datarep), MPI_INFO_NULL, ierr)
    IF(DEBGON)THEN
       IF(master)THEN
          WRITE(logfl,'(/,"DBG: WriteMPI_IO going to write file.")')
          FLUSH(logfl)
       ENDIF
       CALL MPI_Barrier(world, ierr)
    ENDIF
    CALL MPI_File_write_at_all(fh, offset, arr, dim, &
                               etype, status, ierr)
It hangs just after the flush, so apparently in the
MPI_File_write_at_all call.
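
A stand-alone sketch of the same call sequence may help narrow this
down. It is not the real code: the program name, the file name
hang_check.dat, the per-rank count n, the contiguous double precision
view and the 'native' data representation are all assumptions made for
illustration. In the snippet above only the master logs, and a barrier
sits between the flush and the write, so a message from every rank
before and after the collective would show whether all ranks actually
reach MPI_File_write_at_all and which ones return from it.

```fortran
! Hypothetical reproducer: every name and size here is an assumption,
! not taken from the real application.
PROGRAM mpiio_hang_check
  USE mpi
  USE, INTRINSIC :: iso_fortran_env, ONLY: output_unit
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000        ! elements written per rank (assumed)
  INTEGER :: ierr, rank, nprocs, fh
  INTEGER :: status(MPI_STATUS_SIZE)
  INTEGER(KIND=MPI_OFFSET_KIND) :: disp, offset
  DOUBLE PRECISION :: arr(n)

  CALL MPI_Init(ierr)
  CALL MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  CALL MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  arr = DBLE(rank)

  CALL MPI_File_open(MPI_COMM_WORLD, 'hang_check.dat', &
                     IOR(MPI_MODE_CREATE, MPI_MODE_WRONLY), &
                     MPI_INFO_NULL, fh, ierr)

  ! Plain contiguous view; the real code uses its own etype/filetype.
  disp = 0
  CALL MPI_File_set_view(fh, disp, MPI_DOUBLE_PRECISION, &
                         MPI_DOUBLE_PRECISION, 'native', &
                         MPI_INFO_NULL, ierr)

  ! The offset is counted in etype units relative to the current view.
  offset = INT(rank, MPI_OFFSET_KIND) * n

  ! Every rank reports before entering the collective ...
  WRITE(output_unit,'("DBG: rank ",I0," of ",I0," entering, offset=",I0)') &
       rank, nprocs, offset
  FLUSH(output_unit)

  CALL MPI_File_write_at_all(fh, offset, arr, n, &
                             MPI_DOUBLE_PRECISION, status, ierr)

  ! ... and after returning, together with the error code.
  WRITE(output_unit,'("DBG: rank ",I0," returned, ierr=",I0)') rank, ierr
  FLUSH(output_unit)

  CALL MPI_File_close(fh, ierr)
  CALL MPI_Finalize(ierr)
END PROGRAM mpiio_hang_check
```

If this runs cleanly on all 128 cores, the difference presumably lies in
the real etype/filetype, offset or dim values, which could then be
printed per rank in the same way.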
Any ideas of what to do or where to look are welcome.
best,
Ricardo Reis
'Non Serviam'
PhD/MSc Mechanical Engineering | Lic. Aerospace Engineering
Computational Fluid Dynamics, High Performance Computing, Turbulence
http://www.lasef.ist.utl.pt
Cultural Instigator @ Rádio Zero
http://www.radiozero.pt
http://www.flickr.com/photos/rreis/
contacts: gtalk: kyriu...@gmail.com skype: kyriusan
Institutional Address:
Ricardo J.N. dos Reis
IDMEC, Instituto Superior Técnico, Technical University of Lisbon
Av. Rovisco Pais
1049-001 Lisboa
Portugal