It seems to run for me on CentOS, though I note that rank 0 isn't bound to both sockets 0 and 1 as specified, and that I had to tell it how many procs to run:
[rhc@bend001 svn-trunk]$ mpirun --report-bindings -rf rf -n 4 hostname
[bend001:13297] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]]: [BB/BB/../../../..][../../../../../..]
bend001
[bend002:25899] MCW rank 3 bound to socket 1[core 7[hwt 0-1]]: [../../../../../..][../BB/../../../..]
bend002
[bend002:25899] MCW rank 1 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]]: [BB/BB/../../../..][../../../../../..]
bend002
[bend002:25899] MCW rank 2 bound to socket 1[core 6[hwt 0-1]]: [../../../../../..][BB/../../../../..]
bend002

[rhc@bend001 svn-trunk]$ cat rf
rank 0=bend001 slot=0:0-1,1:0-1
rank 1=bend002 slot=0:0-1
rank 2=bend002 slot=1:0
rank 3=bend002 slot=1:1

I'll work on those issues, but I don't know why you are getting this "not allocated" error.

On Sep 2, 2013, at 7:10 AM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi,
>
> I installed openmpi-1.9a1r29097 on "openSuSE Linux 12.1", "Solaris 10
> x86_64", and "Solaris 10 sparc" with "Sun C 5.12" in 64-bit mode.
> Unfortunately I still have a problem with rankfiles. I reported the
> problems already in May. I show the problems with Linux, although I
> have the same problems on all Solaris machines as well.
>
> linpc1 rankfiles 99 cat rf_linpc1
> # mpiexec -report-bindings -rf rf_linpc1 hostname
> rank 0=linpc1 slot=0:0-1,1:0-1
>
> linpc1 rankfiles 100 mpiexec -report-bindings -rf rf_linpc1 hostname
> [linpc1:23413] MCW rank 0 bound to socket 0[core 0[hwt 0]],
> socket 0[core 1[hwt 0]]: [B/B][./.]
> linpc1
>
>
> linpc1 rankfiles 101 cat rf_ex_linpc
> # mpiexec -report-bindings -rf rf_ex_linpc hostname
> rank 0=linpc0 slot=0:0-1,1:0-1
> rank 1=linpc1 slot=0:0-1
> rank 2=linpc1 slot=1:0
> rank 3=linpc1 slot=1:1
>
> linpc1 rankfiles 102 mpiexec -report-bindings -rf rf_ex_linpc hostname
> --------------------------------------------------------------------------
> The rankfile that was used claimed that a host was either not
> allocated or oversubscribed its slots. Please review your rank-slot
> assignments and your host allocation to ensure a proper match. Also,
> some systems may require using full hostnames, such as
> "host1.example.com" (instead of just plain "host1").
>
> Host: linpc0
> --------------------------------------------------------------------------
> linpc1 rankfiles 103
>
>
> I don't have these problems with openmpi-1.6.5a1r28554.
>
> linpc1 rankfiles 95 ompi_info | grep "Open MPI:"
> Open MPI: 1.6.5a1r28554
>
> linpc1 rankfiles 95 mpiexec -report-bindings -rf rf_linpc1 hostname
> [linpc1:23583] MCW rank 0 bound to socket 0[core 0-1]
> socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
> linpc1
>
>
> linpc1 rankfiles 96 mpiexec -report-bindings -rf rf_ex_linpc hostname
> [linpc1:23585] MCW rank 1 bound to socket 0[core 0-1]:
> [B B][. .] (slot list 0:0-1)
> [linpc1:23585] MCW rank 2 bound to socket 1[core 0]:
> [. .][B .] (slot list 1:0)
> [linpc1:23585] MCW rank 3 bound to socket 1[core 1]:
> [. .][. B] (slot list 1:1)
> linpc1
> linpc1
> linpc1
> [linpc0:10422] MCW rank 0 bound to socket 0[core 0-1] socket 1[core 0-1]:
> [B B][B B] (slot list 0:0-1,1:0-1)
> linpc0
>
>
> I would be grateful if somebody could fix the problem. Thank you
> very much for any help in advance.
>
>
> Kind regards
>
> Siegmar
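
For anyone who wants to double-check what a rank actually got, here is a minimal sketch (not from the commands above; it assumes Linux's sched_getaffinity and an MPI C compiler) that can be launched in place of hostname so each rank reports the hardware threads it was really bound to, which can then be compared against the --report-bindings bracket diagrams:

/* show_binding.c -- hypothetical example, not part of the original thread.
 * Each MPI rank prints the Linux CPU-affinity mask it actually received,
 * so the result can be compared with mpirun's --report-bindings output.
 * Assumes Linux (sched_getaffinity); Solaris would need a different call. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, len, c;
    char host[MPI_MAX_PROCESSOR_NAME];
    char cpus[2048] = "";
    char buf[16];
    cpu_set_t mask;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &len);

    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
        /* collect every hardware thread the scheduler lets this rank use */
        for (c = 0; c < CPU_SETSIZE; c++) {
            if (CPU_ISSET(c, &mask)) {
                snprintf(buf, sizeof(buf), " %d", c);
                strncat(cpus, buf, sizeof(cpus) - strlen(cpus) - 1);
            }
        }
    }
    printf("rank %d on %s bound to hw threads:%s\n", rank, host, cpus);

    MPI_Finalize();
    return 0;
}

Built with something like "mpicc -o show_binding show_binding.c" and run as "mpirun --report-bindings -rf rf -n 4 ./show_binding" (the file and program names are just examples), each rank's printed thread list should line up with the BB markers in the binding report.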