Okay, I have a fix so that the number of procs no longer has to be specified when using a rankfile.
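For illustration (assuming the fix behaves as described and mpirun takes the process count from the rankfile itself), the example from the quoted thread below should then launch all four ranks without the -n option:

mpirun --report-bindings -rf rf hostname

where rf is the same four-rank rankfile shown in the quoted output.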
As for the binding pattern, the problem is a syntax error in your rankfile. You need a semi-colon instead of a comma to separate the sockets for rank 0:

> rank 0=bend001 slot=0:0-1,1:0-1

=>

rank 0=bend001 slot=0:0-1;1:0-1

This is required because you use commas to list specific cores - e.g., slot=0:0,1,4,6. A corrected version of your rf_ex_linpc is sketched after the quoted messages below.

HTH
Ralph

On Sep 2, 2013, at 7:52 AM, Ralph Castain <r...@open-mpi.org> wrote:

> It seems to run for me on CentOS, though I note that rank 0 isn't bound to
> both sockets 0 and 1 as specified and I had to tell it how many procs to run:
>
> [rhc@bend001 svn-trunk]$ mpirun --report-bindings -rf rf -n 4 hostname
> [bend001:13297] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]]: [BB/BB/../../../..][../../../../../..]
> bend001
> [bend002:25899] MCW rank 3 bound to socket 1[core 7[hwt 0-1]]: [../../../../../..][../BB/../../../..]
> bend002
> [bend002:25899] MCW rank 1 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]]: [BB/BB/../../../..][../../../../../..]
> bend002
> [bend002:25899] MCW rank 2 bound to socket 1[core 6[hwt 0-1]]: [../../../../../..][BB/../../../../..]
> bend002
>
> [rhc@bend001 svn-trunk]$ cat rf
> rank 0=bend001 slot=0:0-1,1:0-1
> rank 1=bend002 slot=0:0-1
> rank 2=bend002 slot=1:0
> rank 3=bend002 slot=1:1
>
> I'll work on those issues, but don't know why you are getting this "not
> allocated" error.
>
>
> On Sep 2, 2013, at 7:10 AM, Siegmar Gross
> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>
>> Hi,
>>
>> I installed openmpi-1.9a1r29097 on "openSuSE Linux 12.1", "Solaris 10
>> x86_64", and "Solaris 10 sparc" with "Sun C 5.12" in 64-bit mode.
>> Unfortunately I still have a problem with rankfiles. I reported the
>> problems already in May. I show the problems with Linux, although I
>> have the same problems on all Solaris machines as well.
>>
>> linpc1 rankfiles 99 cat rf_linpc1
>> # mpiexec -report-bindings -rf rf_linpc1 hostname
>> rank 0=linpc1 slot=0:0-1,1:0-1
>>
>> linpc1 rankfiles 100 mpiexec -report-bindings -rf rf_linpc1 hostname
>> [linpc1:23413] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B][./.]
>> linpc1
>>
>>
>> linpc1 rankfiles 101 cat rf_ex_linpc
>> # mpiexec -report-bindings -rf rf_ex_linpc hostname
>> rank 0=linpc0 slot=0:0-1,1:0-1
>> rank 1=linpc1 slot=0:0-1
>> rank 2=linpc1 slot=1:0
>> rank 3=linpc1 slot=1:1
>>
>> linpc1 rankfiles 102 mpiexec -report-bindings -rf rf_ex_linpc hostname
>> --------------------------------------------------------------------------
>> The rankfile that was used claimed that a host was either not
>> allocated or oversubscribed its slots. Please review your rank-slot
>> assignments and your host allocation to ensure a proper match. Also,
>> some systems may require using full hostnames, such as
>> "host1.example.com" (instead of just plain "host1").
>>
>> Host: linpc0
>> --------------------------------------------------------------------------
>> linpc1 rankfiles 103
>>
>>
>> I don't have these problems with openmpi-1.6.5a1r28554.
>>
>> linpc1 rankfiles 95 ompi_info | grep "Open MPI:"
>> Open MPI: 1.6.5a1r28554
>>
>> linpc1 rankfiles 95 mpiexec -report-bindings -rf rf_linpc1 hostname
>> [linpc1:23583] MCW rank 0 bound to socket 0[core 0-1] socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
>> linpc1
>>
>>
>> linpc1 rankfiles 96 mpiexec -report-bindings -rf rf_ex_linpc hostname
>> [linpc1:23585] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .] (slot list 0:0-1)
>> [linpc1:23585] MCW rank 2 bound to socket 1[core 0]: [. .][B .] (slot list 1:0)
>> [linpc1:23585] MCW rank 3 bound to socket 1[core 1]: [. .][. B] (slot list 1:1)
>> linpc1
>> linpc1
>> linpc1
>> [linpc0:10422] MCW rank 0 bound to socket 0[core 0-1] socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
>> linpc0
>>
>>
>> I would be grateful if somebody could fix the problem. Thank you
>> very much for any help in advance.
>>
>>
>> Kind regards
>>
>> Siegmar
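For illustration, here is a sketch of Siegmar's rf_ex_linpc with the semi-colon fix from the reply above applied to rank 0 (the hostnames linpc0/linpc1 are simply the ones from his example; adjust them to the actual hosts):

# mpiexec -report-bindings -rf rf_ex_linpc hostname
rank 0=linpc0 slot=0:0-1;1:0-1
rank 1=linpc1 slot=0:0-1
rank 2=linpc1 slot=1:0
rank 3=linpc1 slot=1:1

Only the rank 0 entry changes: the semi-colon separates the socket 0 and socket 1 specifications, while a comma within a single socket's list enumerates individual cores (e.g., slot=0:0,1).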