Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-22 Thread Ralph Castain
On Nov 22, 2011, at 10:10 AM, Paul Kapinos wrote:
> Hello Ralph, hello all.
>
>> No real ideas, I'm afraid. We regularly launch much larger jobs than that
>> using ssh without problem,
> I was also able to run a 288-node-job yesterday - the size alone is not the
> problem...
>
>> so it

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-22 Thread Paul Kapinos
Hello Ralph, hello all.
> No real ideas, I'm afraid. We regularly launch much larger jobs than that using ssh without problem,
I was also able to run a 288-node-job yesterday - the size alone is not the problem...
> so it is likely something about the local setup of that node that is causing t

Re: [OMPI users] Shared memory optimizations in OMPI

2011-11-22 Thread Jeff Squyres
All the shared memory code is in the "sm" BTL (byte transfer layer) component: ompi/mca/btl/sm. All the TCP MPI code is in the "tcp" BTL component: ompi/mca/btl/tcp. Think of "ob1" as the MPI engine that is the bottom of MPI_SEND, MPI_RECV, and friends. It takes a message to be sent, determin
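
To make the split between the "sm" and "tcp" BTLs concrete, here is a toy sketch of the kind of per-peer choice described above. It is not Open MPI code: the `Peer` class, the `choose_btl` function, and the host names are hypothetical, and the "same host means shared memory" rule is a simplifying assumption rather than the actual selection logic in ob1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    """Hypothetical stand-in for an MPI process: its rank and the host it runs on."""
    rank: int
    host: str


def choose_btl(sender: Peer, receiver: Peer) -> str:
    """Toy rule (an assumption, not Open MPI's real logic):
    peers on the same host use "sm", all others use "tcp"."""
    return "sm" if sender.host == receiver.host else "tcp"


# Three illustrative processes: ranks 0 and 1 share a host, rank 2 does not.
peers = [Peer(0, "node01"), Peer(1, "node01"), Peer(2, "node02")]

for dst in peers[1:]:
    print(f"rank 0 -> rank {dst.rank}: {choose_btl(peers[0], dst)}")
```

For tracing purposes, the point of the sketch is only that the transport is decided per peer below MPI_SEND, so the functions to instrument would be the ones inside the BTL components rather than MPI_Send itself.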

Re: [OMPI users] Shared memory optimizations in OMPI

2011-11-22 Thread Shamik Ganguly
Thanks a lot Jeff. PIN is a dynamic binary instrumentation tool from Intel. It runs on top of the binary on the MPI node. When it's given function calls to instrument, it will insert traps before/after each such function call in the binary of the program you are instrumenting and you can insert your

Re: [OMPI users] Shared memory optimizations in OMPI

2011-11-22 Thread Jeff Squyres
On Nov 22, 2011, at 1:09 AM, Shamik Ganguly wrote:
> I want to trace when the MPI library prevents an MPI_Send from going to the
> socket and makes it access shared memory because the target node is on the
> same chip (CMP). I want to use PIN to trace this. Can you please give me some
> pointe

Re: [OMPI users] MPI_MAX_PORT_NAME different in C and Fortran headers

2011-11-22 Thread Jeff Squyres
Yoinks! Thanks for the heads up, and for subsequently filing bugs for us (sorry for the delay in replying; I was fully occupied at SC last week, and this week is pretty sparse because of the US Thanksgiving holiday). You're right that this is a simple "Fortran value doesn't agree with C value"

[OMPI users] Shared memory optimizations in OMPI

2011-11-22 Thread Shamik Ganguly
Hi, I want to trace when the MPI library prevents an MPI_Send from going to the socket and makes it access shared memory because the target node is on the same chip (CMP). I want to use PIN to trace this. Can you please give me some pointers about which functions are taking this decision so that