Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots

2014-07-19 Thread Lane, William
Yes, there is a second HPC Sun Grid Engine cluster on which I've run this Open MPI test code dozens of times on upwards of 400 slots through SGE using qsub and qrsh, but that was with a much older version of Open MPI (1.3.3, I believe). On that particular cluster the open files hard and soft limits we…

Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots

2014-07-19 Thread Ralph Castain
Not for this test case size. You should be just fine with the default values. If I understand you correctly, you've run this app at scale before on another cluster without problem? On Jul 19, 2014, at 1:34 PM, Lane, William wrote: > Ralph, > > It's hard to imagine it's the Open MPI code becaus…

Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots

2014-07-19 Thread Lane, William
Ralph, It's hard to imagine it's the Open MPI code, because I've tested this code extensively on another cluster with 400 nodes and never had any problems. But I'll try using the hello_c example in any case. Is it still recommended to raise the open files soft and hard limits to 4096? Or should even…

Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots

2014-07-19 Thread Ralph Castain
That's a pretty old OMPI version, and we don't really support it any longer. However, I can provide some advice:
* have you tried running the simple "hello_c" example we provide? This would at least tell you if the problem is in your app, which is what I'd expect given your description
* try u…

[OMPI users] Mpirun 1.5.4 problems when request > 28 slots

2014-07-19 Thread Lane, William
I'm getting consistent errors of the form: "mpirun noticed that process rank 3 with PID 802 on node csclprd3-0-8 exited on signal 11 (Segmentation fault)." whenever I request more than 28 slots. These errors occur even when I run mpirun locally on a compute node that has 32 slots (8 cores, 16 wi…