Re: [OMPI users] LAMA of openmpi-1.7.3 is unstable

2013-11-07 Thread Ralph Castain
I suspect something else is going on there - I can't imagine how the LAMA mapper could be interacting with the Torque launcher. The check for adequate resources (per the error message) is done long before we get to the launcher. I'll have to let the LAMA supporters chase it down. Thanks Ralph

Re: [OMPI users] LAMA of openmpi-1.7.3 is unstable

2013-11-07 Thread tmishima
Thanks, Ralph. Here is some additional information. Running directly on the node without Torque: mpirun -np 8 -report-bindings -mca rmaps lama -mca rmaps_lama_bind 1c Myprog Then it also works, which means the combination of LAMA and Torque would cause the problem. Tetsuya Mishima > Okay,

Re: [OMPI users] LAMA of openmpi-1.7.3 is unstable

2013-11-07 Thread Ralph Castain
Okay, so the problem is a bug in LAMA itself. I'll file a ticket and let the LAMA folks look into it. On Nov 7, 2013, at 4:18 PM, tmish...@jcity.maeda.co.jp wrote: > > > Hi Ralph, > > I quickly tried 2 runs: > > mpirun -report-bindings -bind-to core Myprog > mpirun -machinefile pbs_hosts -np

Re: [OMPI users] LAMA of openmpi-1.7.3 is unstable

2013-11-07 Thread tmishima
Hi Ralph, I quickly tried 2 runs: mpirun -report-bindings -bind-to core Myprog mpirun -machinefile pbs_hosts -np ${NPROCS} -report-bindings -bind-to core Myprog It works fine in both cases on node03 and node08. Regards, Tetsuya Mishima > What happens if you drop the LAMA request and instead

Re: [OMPI users] LAMA of openmpi-1.7.3 is unstable

2013-11-07 Thread Ralph Castain
What happens if you drop the LAMA request and instead run mpirun -report-bindings -bind-to core Myprog This would do the same thing - does it work? If so, then we know it is a problem in the LAMA mapper. If not, then it is likely a problem in a different section of the code. On Nov 7, 2013,

[OMPI users] LAMA of openmpi-1.7.3 is unstable

2013-11-07 Thread tmishima
Dear openmpi developers, I tried the new LAMA feature of openmpi-1.7.3 and unfortunately it is not stable in my environment, which is built with Torque. (1) I used 4 scripts as shown below to clarify the problem: (COMMON PART) #!/bin/sh #PBS -l nodes=node03:ppn=8 / nodes=node08:ppn=8 expor