Does anyone have ideas as to why this program might not be working
correctly under TotalView? A user is running a very simple
hello-world-style MPI program that does this:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init( &argc, &argv );
    int rank, size;
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    printf("rank = %d, size = %d\n", rank, size );
    MPI_Finalize();
    exit( EXIT_SUCCESS );
}
When run under
mpirun -np 8 ./foo
it prints 8 lines with ranks 0 through 7 and a size of 8. But when
run under TotalView, he sees 8 lines, all with rank 0 and a size of 1. So
it looks like the processes never join into a single job on his system
under TV. To make this a bit more interesting, when I run the same
program here, I do NOT see this behavior. The processes all join up
and everything looks okay. My machine, like his, is a Mac running
Mac OS X 10.6.8 (Darwin). To keep this reproducible, I just used the
native Open MPI on Darwin. TotalView is started up with
totalview foo
and then, in the Startup Parameters window, Parallel is selected with
Open MPI and 8 processes. Any thoughts about why I would see one
8-process job while he sees 8 single-process jobs? Are there any hidden
debug flags I can use?
Thanks,
PeterT