Re: [OMPI users] problems with rankfile in openmpi-1.9a1r29097

2013-09-03 Thread Ralph Castain
Heck if I know what might be wrong - it works fine for me, regardless of what machine I run it from.

If this is compiled with --enable-debug, try adding "--display-allocation -mca rmaps_base_verbose 5" to your cmd line to see what might be going on.

On Sep 3, 2013, at 1:20 AM, Siegmar Gross

Re: [OMPI users] problems with rankfile in openmpi-1.9a1r29097

2013-09-03 Thread Siegmar Gross
Hi,

> 3) I have a problem on "tyr" (Solaris 10 sparc).
>
> tyr rankfiles 106 mpiexec -report-bindings \
> -rf rf_tyr_semicolon -np 1 hostname
> [tyr.informatik.hs-fulda.de:29849] [[53951,0],0] ORTE_ERROR_LOG:
> Not found in file
>
> ../../../../../openmpi-1.9a1r29097/orte/mca/rmaps/rank_f

Re: [OMPI users] problems with rankfile in openmpi-1.9a1r29097

2013-09-03 Thread Siegmar Gross
Hi,

> Okay, I have a fix for not specifying the number of procs when
> using a rankfile.
>
> As for the binding pattern, the problem is a syntax error in
> your rankfile. You need a semi-colon instead of a comma to
> separate the sockets for rank0:
>
> > rank 0=bend001 slot=0:0-1,1:0-1 => rank

Re: [OMPI users] problems with rankfile in openmpi-1.9a1r29097

2013-09-02 Thread Ralph Castain
Okay, I have a fix for not specifying the number of procs when using a rankfile.

As for the binding pattern, the problem is a syntax error in your rankfile. You need a semi-colon instead of a comma to separate the sockets for rank0:

> rank 0=bend001 slot=0:0-1,1:0-1 => rank 0=bend001 slot=0:0-1
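
To illustrate the separator rule described above, here is a minimal Python sketch that rewrites a rankfile line so that socket groups are joined by semicolons rather than commas. The helper names `separate_sockets` and `fix_rankfile_line` are hypothetical and are not part of Open MPI. The sketch assumes that a comma-separated token containing ":" starts a new socket group, while a token without ":" is an additional core of the current socket. The rankfile syntax may allow forms this does not cover.

```python
import re


def separate_sockets(slot_spec: str) -> str:
    """Join 'socket:cores' groups with ';' instead of ','.

    Assumption: a token containing ':' begins a new socket group; a token
    without ':' is an extra core entry belonging to the current group.
    """
    groups = []
    for token in slot_spec.split(","):
        token = token.strip()
        if ":" in token or not groups:
            groups.append(token)          # start a new socket group
        else:
            groups[-1] += "," + token     # extra core for the same socket
    return ";".join(groups)


def fix_rankfile_line(line: str) -> str:
    """Rewrite the slot= part of a 'rank N=host slot=...' line.

    Lines that do not match this (assumed) shape are returned unchanged.
    """
    match = re.match(r"^(\s*rank\s+\d+\s*=\s*\S+\s+slot=)(\S+)\s*$", line)
    if match is None:
        return line
    return match.group(1) + separate_sockets(match.group(2))


if __name__ == "__main__":
    print(fix_rankfile_line("rank 0=bend001 slot=0:0-1,1:0-1"))
```

Run on the rank 0 line quoted above, the sketch separates the two socket groups with a semicolon. A spec such as `1:0,1`, with a single socket prefix, is left untouched because the second token carries no socket number.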

Re: [OMPI users] problems with rankfile in openmpi-1.9a1r29097

2013-09-02 Thread Ralph Castain
It seems to run for me on CentOS, though I note that rank 0 isn't bound to both sockets 0 and 1 as specified and I had to tell it how many procs to run:

[rhc@bend001 svn-trunk]$ mpirun --report-bindings -rf rf -n 4 hostname
[bend001:13297] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[