Re: [OMPI users] OpenMPI failed when running across two mac machines

2012-01-20 Thread Jeff Squyres
I'll take it as a friendly fix for 1.5.5 -- please file a CMR. On Jan 20, 2012, at 2:07 PM, Ralph Castain wrote: > Looks okay to me - thanks! I'll add it in. > > On Jan 20, 2012, at 11:02 AM, Teng Lin wrote: > >> Hi, >> >> We are distributing OpenMPI as part of a software suite. Therefore, the

Re: [OMPI users] rankfiles on really big nodes broken?

2012-01-20 Thread Ralph Castain
I don't see anything in the code that limits the number of procs in a rankfile. Are the attached rankfiles the ones you are trying to use? I'm wondering if there is a syntax error that is causing the problem. It would help if you could provide the complete error message output. At one time, the

[OMPI users] rankfiles on really big nodes broken?

2012-01-20 Thread Paul Kapinos
Hello, Open MPI developers! Now, we have a really nice toy: 2 TB RAM, 16 sockets, 128 cores. (4x smaller Bull S6010 coupled by BCS chips into a single-image machine) On such a big box, process pinning is vital. So we tried to use the Open MPI capabilities to pin the processes. But it seems that the

Re: [OMPI users] OpenMPI failed when running across two mac machines

2012-01-20 Thread Ralph Castain
Looks okay to me - thanks! I'll add it in. On Jan 20, 2012, at 11:02 AM, Teng Lin wrote: > Hi, > > We are distributing OpenMPI as part of a software suite. Therefore, the prefix > we used for building is not expected to be the same when running on > customer's machine. However, we did manage to

[OMPI users] OpenMPI failed when running across two mac machines

2012-01-20 Thread Teng Lin
Hi, We are distributing OpenMPI as part of a software suite. Therefore, the prefix we used for building is not expected to be the same when running on a customer's machine. However, we did manage to get it running on Linux by setting OPAL_PREFIX, PATH and LD_LIBRARY_PATH. We tried to do the same thin

Re: [OMPI users] MPI_Comm_create with unequal group arguments

2012-01-20 Thread Josh Hursey
For MPI_Comm_create -all- processes in the communicator must make the call, not just those that are in the subgroups. The 2.2 standard states that "The function is collective and must be called by all processes in the group of comm." However, this is a common misconception about the MPI_Comm_cre

Re: [OMPI users] MPI_Comm_create with unequal group arguments

2012-01-20 Thread Jens Jørgen Mortensen
On 20-01-2012 15:26, Josh Hursey wrote: That behavior is permitted by the MPI 2.2 standard. It seems that our documentation is incorrect in this regard. I'll file a bug to fix it. Just to clarify, in the MPI 2.2 standard in Section 6.4.2 (Communicator Constructors) under MPI_Comm_create it sta

Re: [OMPI users] MPI_Comm_create with unequal group arguments

2012-01-20 Thread Josh Hursey
That behavior is permitted by the MPI 2.2 standard. It seems that our documentation is incorrect in this regard. I'll file a bug to fix it. Just to clarify, in the MPI 2.2 standard in Section 6.4.2 (Communicator Constructors) under MPI_Comm_create it states: "Each process must call with a group ar

Re: [OMPI users] Checkpoint an MPI process

2012-01-20 Thread Josh Hursey
Rodrigo, Open MPI has the ability to migrate a subset of processes (in the trunk - though currently broken due to recent code movement, I'm slowly developing the fix in my spare time). The current implementation only checkpoints the migrating processes, but suspends all other processes during the

[OMPI users] MPI_Comm_create with unequal group arguments

2012-01-20 Thread Jens Jørgen Mortensen
Hi! For a long time, I have been calling MPI_Comm_create(comm, group, newcomm) with different values for group on the different processes of comm. In pseudo-code, I would create two sub-communicators from a world with 4 ranks like this: if world.rank < 2: comm = world.create([0, 1]) els
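
A minimal sketch, in plain Python with no MPI involved, of how the call pattern described in this message could be checked against the MPI 2.2 rule discussed in this thread. It assumes the rule reads as follows: every process of comm makes the call, each group is a subgroup of comm, and all members of a non-empty group pass that same group in the same order. The function name check_comm_create_groups and the list-of-lists input are hypothetical, and the second half of the four-rank pattern (ranks 2 and 3 passing [2, 3]) is an assumption, since the pseudo-code in the preview is cut off.

```python
def check_comm_create_groups(comm_size, group_by_rank):
    """Return a list of rule violations; an empty list means the pattern looks valid.

    group_by_rank[r] is the ordered list of ranks that process r passes as its
    group argument. An empty list stands in for MPI_GROUP_EMPTY.
    This is an illustrative checker, not part of any MPI library.
    """
    errors = []
    if len(group_by_rank) != comm_size:
        # The call is collective: every rank of comm has to take part.
        errors.append("every rank of comm must make the call")
        return errors
    for rank, group in enumerate(group_by_rank):
        if any(member < 0 or member >= comm_size for member in group):
            errors.append(f"rank {rank}: group is not a subgroup of comm")
            continue
        for member in group:
            # Every member of a non-empty group must pass the identical group.
            if list(group_by_rank[member]) != list(group):
                errors.append(f"rank {rank} and rank {member} pass different groups")
    return errors


# Assumed four-rank pattern: ranks 0-1 pass [0, 1], ranks 2-3 pass [2, 3].
disjoint = [[0, 1], [0, 1], [2, 3], [2, 3]]
print(check_comm_create_groups(4, disjoint))  # prints []
```

Under the stricter reading in which every process must pass the same group, this pattern would be rejected, so the sketch only reflects the MPI 2.2 interpretation given in the replies.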

Re: [OMPI users] Checkpoint an MPI process

2012-01-20 Thread Rodrigo Oliveira
I appreciate your help. Indeed, it's better to create my own mechanism, as Lloyd mentioned. Actually my application is a framework for stream processing (something like IBM System-S), in which I use Open MPI as the communication layer and part of process management. One of this framework's features is t