Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-30 Thread Jeff Squyres (jsquyres)
On Nov 24, 2015, at 9:31 AM, Dave Love wrote:
>
>> btw, we already use the force, thanks to the ob1 pml and the yoda spml
>
> I think that's assuming familiarity with something which leaves out some
> people...

FWIW, I agree: we use unhelpful names for components in Open MPI. What Gilles is s

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-24 Thread Dave Love
Gilles Gouaillardet writes:

> Currently, ompi creates a file in the temporary directory and then mmaps it.
> An obvious requirement is that the temporary directory must have enough free
> space for that file.
> (this might be an issue on some diskless nodes)
>
> A simple alternative could be to try /

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-20 Thread Gilles Gouaillardet
Currently, ompi creates a file in the temporary directory and then mmaps it. An obvious requirement is that the temporary directory must have enough free space for that file (this might be an issue on some diskless nodes). A simple alternative could be to try /tmp, and if there is not enough space, try
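
The check-then-map idea described above can be sketched roughly as follows. This is only an illustration in Python, not Open MPI's actual code: the function names `pick_backing_dir` and `map_backing_file` and the `shmem_sketch_` prefix are hypothetical, and the list of candidate directories is left to the caller because the fallback location is not spelled out here.

```python
import mmap
import os
import shutil
import tempfile


def pick_backing_dir(candidates, size):
    """Return the first existing directory with at least `size` bytes free, else None."""
    for d in candidates:
        if os.path.isdir(d) and shutil.disk_usage(d).free >= size:
            return d
    return None


def map_backing_file(candidates, size):
    """Create a `size`-byte file in the first suitable directory and mmap it.

    Hypothetical helper: the free-space check is done up front because
    ftruncate() alone can create a sparse file that only fails later,
    when the mapped pages are actually touched.
    """
    d = pick_backing_dir(candidates, size)
    if d is None:
        raise OSError(f"no candidate directory has {size} bytes free")
    fd, path = tempfile.mkstemp(prefix="shmem_sketch_", dir=d)
    try:
        os.ftruncate(fd, size)          # give the file its final length
        return mmap.mmap(fd, size), path
    finally:
        os.close(fd)                    # the mapping stays valid after close
```

A caller would pass the directories in order of preference, for example `map_backing_file(["/tmp", some_other_dir], nbytes)`, and remove the file at `path` once the mapping is no longer needed.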

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-20 Thread Dave Love
Jeff Hammond writes:

>> Doesn't mpich have the option to use sysv memory? You may want to try that
>
> MPICH? Look, I may have earned my way onto Santa's naughty list more than
> a few times, but at least I have the decency not to post MPICH questions to
> the Open-MPI list ;-)
>
> If there

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-20 Thread Dave Love
[There must be someone better to answer this, but since I've seen it:]

Jeff Hammond writes:

> I have no idea what this is trying to tell me. Help?
>
> jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64
> [nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file
> ../../.

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Jeff Hammond
On Thu, Nov 19, 2015 at 4:11 PM, Howard Pritchard wrote:

> Hi Jeff H.
>
> Why don't you just try configuring with
>
> ./configure --prefix=my_favorite_install_dir
> --with-libfabric=install_dir_for_libfabric
> make -j 8 install
>
> and see what happens?
>

That was the first thing I tried. Howe

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Howard Pritchard
Hi Jeff,

I finally got an allocation on cori - it's one busy machine. Anyway, using the ompi I'd built on edison with the above recommended configure options I was able to run using either srun or mpirun on cori provided that in the latter case I used mpirun -np X -N Y --mca plm slurm ./my_favorit

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Howard Pritchard
Hi Jeff H.

Why don't you just try configuring with

./configure --prefix=my_favorite_install_dir --with-libfabric=install_dir_for_libfabric
make -j 8 install

and see what happens?

Make sure before you configure that you have PrgEnv-gnu or PrgEnv-intel module loaded. Those were the configure/com

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Jeff Hammond
> How did you configure for Cori? You need to be using the slurm plm
> component for that system. I know this sounds like gibberish.

../configure --with-libfabric=$HOME/OFI/install-ofi-gcc-gni-cori \
  --enable-mca-static=mtl-ofi \
  --enable-mca-no-build=btl-openib,

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Howard
Hi Jeff

How did you configure for Cori? You need to be using the slurm plm component for that system. I know this sounds like gibberish. There should be a with-slurm configure option to pick up this component.

Doesn't mpich have the option to use sysv memory? You may want to try that

Oh

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Martin Siegert
Hi Jeff,

On Thu 19.11.2015 09:44:20 Jeff Hammond wrote:
> I have no idea what this is trying to tell me. Help?
>
> jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64
> [nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file
> ../../../../../orte/mca/plm/alps/plm_alps_mo

[OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Jeff Hammond
I have no idea what this is trying to tell me. Help?

jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64
[nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../orte/mca/plm/alps/plm_alps_module.c at line 418

I can run the same job with srun without incide