Re: [OMPI users] OpenMPI job initializing problem

2014-03-06 Thread Beichuan Yan
No, I did all these and none worked. I just found that, with exactly the same code, data and job settings, a job can run one day and fail the next. It is NOT repeatable. I don't know what the problem is: hardware? OpenMPI? PBS Pro? Anyway, I may have to give up using OpenMPI on that sy

Re: [OMPI users] OpenMPI job initializing problem

2014-03-06 Thread Gus Correa
On 03/06/2014 03:35 PM, Beichuan Yan wrote: Gus, Yes, 10.148.0.0/16 is the IB subnet. I did try others but none worked: #export TCP="--mca btl sm,openib" No run, no output If I remember right, and unless this changed in recent OMPI versions, you also need "self": -mca btl sm,openib,self Al
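
The "self" suggestion in the reply above can be sketched in shell. This is only an illustration: the helper name `ensure_self_btl` and the executable `./my_mpi_app` are hypothetical, and the script prints the command instead of running it. Only the `--mca btl` option and the component names `sm`, `openib` and `self` come from the thread.

```shell
#!/bin/sh
# Hypothetical helper: append "self" to a comma-separated BTL list
# unless the list already contains it.
ensure_self_btl() {
    case ",$1," in
        *,self,*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "$1,self" ;;
    esac
}

# The list tried in the original message, without "self".
BTL_LIST=$(ensure_self_btl "sm,openib")

# Print the resulting command line; ./my_mpi_app is a placeholder.
echo "mpirun --mca btl $BTL_LIST ./my_mpi_app"
```

Whether adding `self` resolves the "no run, no output" symptom on this particular system is not established by the thread excerpt.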

Re: [OMPI users] OpenMPI job initializing problem

2014-03-06 Thread Beichuan Yan
Gus, Yes, 10.148.0.0/16 is the IB subnet. I did try others but none worked: #export TCP="--mca btl sm,openib" No run, no output #export TCP="--mca btl sm,openib --mca btl_tcp_if_include 10.148.0.0/16" No run, no output Beichuan -Original Message- From: users [mailto:users-boun...@open-

Re: [OMPI users] OpenMPI job initializing problem

2014-03-06 Thread Gus Correa
Hi Beichuan So, it looks like the program now runs, albeit with specific settings depending on whether you're using OMPI 1.6.5 or 1.7.4, right? It looks like the problem now is performance, right? System load affects performance, but unless the network is overwhelmed, or perhaps the L

[OMPI users] CFP: Workshop on Enhancing Parallel Scientific Applications with Accelerated HPC

2014-03-06 Thread Javier Garcia Blas

Re: [OMPI users] OpenMPI job initializing problem

2014-03-06 Thread Beichuan Yan
1. For $TMPDIR and $TCP, there are four combinations, obtained by commenting each line in or out (note the system's default TMPDIR=/work3/yanb): export TMPDIR=/work1/home/yanb/tmp TCP="--mca btl_tcp_if_include 10.148.0.0/16" 2. I tested the 4 combinations for OpenMPI 1.6.5 and OpenMPI 1.7.4 respectively for the pure-M
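
The four combinations mentioned in item 1 can be enumerated with a small shell loop. This is a minimal sketch, not the script used in the thread: the function name `list_combinations` and the `<default>` / `<none>` labels are made up for illustration, and an empty value stands for the commented-out case (where the system default TMPDIR=/work3/yanb would apply).

```shell
#!/bin/sh
# Enumerate the on/off combinations of the two settings quoted above.
# An empty string represents the "commented out" case.
list_combinations() {
    for tmpdir in "" "/work1/home/yanb/tmp"; do
        for tcp in "" "--mca btl_tcp_if_include 10.148.0.0/16"; do
            # Labels <default> and <none> are illustrative placeholders.
            echo "TMPDIR=${tmpdir:-<default>} TCP=${tcp:-<none>}"
        done
    done
}

list_combinations
```

Each printed line corresponds to one test configuration, which would then be repeated for each OpenMPI version under test.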