It seems to run for me on CentOS, though I note that rank 0 isn't bound to both 
sockets 0 and 1 as specified, and that I had to tell it how many procs to run:

[rhc@bend001 svn-trunk]$  mpirun --report-bindings -rf rf -n 4 hostname
[bend001:13297] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[core 
1[hwt 0-1]]: [BB/BB/../../../..][../../../../../..]
bend001
[bend002:25899] MCW rank 3 bound to socket 1[core 7[hwt 0-1]]: 
[../../../../../..][../BB/../../../..]
bend002
[bend002:25899] MCW rank 1 bound to socket 0[core 0[hwt 0-1]], socket 0[core 
1[hwt 0-1]]: [BB/BB/../../../..][../../../../../..]
bend002
[bend002:25899] MCW rank 2 bound to socket 1[core 6[hwt 0-1]]: 
[../../../../../..][BB/../../../../..]
bend002

[rhc@bend001 svn-trunk]$ cat rf
rank 0=bend001 slot=0:0-1,1:0-1
rank 1=bend002 slot=0:0-1
rank 2=bend002 slot=1:0
rank 3=bend002 slot=1:1

I'll work on those issues, but I don't know why you are getting this "not 
allocated" error.


On Sep 2, 2013, at 7:10 AM, Siegmar Gross 
<siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi,
> 
> I installed openmpi-1.9a1r29097 on "openSuSE Linux 12.1", "Solaris 10
> x86_64", and "Solaris 10 sparc" with "Sun C 5.12" in 64-bit mode.
> Unfortunately I still have a problem with rankfiles. I already reported
> these problems in May. I show the problems on Linux, although I have the
> same problems on all Solaris machines as well.
> 
> linpc1 rankfiles 99 cat rf_linpc1
> # mpiexec -report-bindings -rf rf_linpc1 hostname
> rank 0=linpc1 slot=0:0-1,1:0-1
> 
> linpc1 rankfiles 100 mpiexec -report-bindings -rf rf_linpc1 hostname
> [linpc1:23413] MCW rank 0 bound to socket 0[core 0[hwt 0]],
>  socket 0[core 1[hwt 0]]: [B/B][./.]
> linpc1
> 
> 
> linpc1 rankfiles 101 cat rf_ex_linpc
> # mpiexec -report-bindings -rf rf_ex_linpc hostname
> rank 0=linpc0 slot=0:0-1,1:0-1
> rank 1=linpc1 slot=0:0-1
> rank 2=linpc1 slot=1:0
> rank 3=linpc1 slot=1:1
> 
> linpc1 rankfiles 102 mpiexec -report-bindings -rf rf_ex_linpc hostname
> --------------------------------------------------------------------------
> The rankfile that was used claimed that a host was either not
> allocated or oversubscribed its slots.  Please review your rank-slot
> assignments and your host allocation to ensure a proper match.  Also,
> some systems may require using full hostnames, such as
> "host1.example.com" (instead of just plain "host1").
> 
>  Host: linpc0
> --------------------------------------------------------------------------
> linpc1 rankfiles 103 
> 
> 
> 
> I don't have these problems with openmpi-1.6.5a1r28554.
> 
> linpc1 rankfiles 95 ompi_info | grep "Open MPI:"
>                Open MPI: 1.6.5a1r28554
> 
> linpc1 rankfiles 95 mpiexec -report-bindings -rf rf_linpc1 hostname
> [linpc1:23583] MCW rank 0 bound to socket 0[core 0-1]
>  socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
> linpc1
> 
> 
> linpc1 rankfiles 96 mpiexec -report-bindings -rf rf_ex_linpc hostname
> [linpc1:23585] MCW rank 1 bound to socket 0[core 0-1]:
>  [B B][. .] (slot list 0:0-1)
> [linpc1:23585] MCW rank 2 bound to socket 1[core 0]:
>  [. .][B .] (slot list 1:0)
> [linpc1:23585] MCW rank 3 bound to socket 1[core 1]:
>  [. .][. B] (slot list 1:1)
> linpc1
> linpc1
> linpc1
> [linpc0:10422] MCW rank 0 bound to socket 0[core 0-1] socket 1[core 0-1]:
>  [B B][B B] (slot list 0:0-1,1:0-1)
> linpc0
> 
> 
> I would be grateful if somebody could fix the problem. Thank you
> very much in advance for any help.
> 
> 
> Kind regards
> 
> Siegmar
> 
