Hi,

yesterday I installed openmpi-1.8.2rc4r32485 on my machines
(Solaris 10 Sparc (tyr), Solaris 10 x86_64 (sunpc0, sunpc1),
openSUSE Linux 12.1 x86_64 (linpc0, linpc1)) with Sun C 5.12.
Today I experimented a little more with rankfiles and found
the following things, which may be helpful for tracking down
the error. I use variations of the following rankfile (I remove
a line and renumber the ranks). Many rankfiles work fine and a
few break.

tyr openmpi_1.7.x_or_newer 180 cat x-linpc0_linpc1_sunpc1_tyr 
rank 0=linpc0 slot=0:0-1;1:0-1
rank 1=linpc1 slot=1:0
rank 2=sunpc1 slot=1:0
rank 3=tyr slot=1:0

The above rankfile still breaks:
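
The variations mentioned above (remove one line, renumber the ranks)
can also be generated with a small script. The following is only a
minimal sketch: the host names and slot specifications are copied from
the rankfile above, while the function names and the "x-<hosts>" naming
of the variants are illustrative assumptions, not part of Open MPI.

```python
# Sketch: derive rankfile variants by dropping one line and renumbering ranks.
# BASE is copied from the rankfile above; everything else is illustrative.
BASE = [
    ("linpc0", "0:0-1;1:0-1"),
    ("linpc1", "1:0"),
    ("sunpc1", "1:0"),
    ("tyr", "1:0"),
]


def make_rankfile(entries):
    """Return rankfile text with ranks numbered consecutively from 0."""
    lines = [
        f"rank {rank}={host} slot={slot}"
        for rank, (host, slot) in enumerate(entries)
    ]
    return "\n".join(lines) + "\n"


def variants(entries):
    """Yield (name, text) for each rankfile obtained by removing one line."""
    for skip in range(len(entries)):
        kept = entries[:skip] + entries[skip + 1:]
        name = "x-" + "_".join(host for host, _ in kept)
        yield name, make_rankfile(kept)


if __name__ == "__main__":
    # Print every three-host variant so it can be saved and passed to -rf.
    for name, text in variants(BASE):
        print(f"# {name}")
        print(text)
```

The script only produces the three-host variants; the two-host rankfiles
used further below would need a second removal step.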

tyr openmpi_1.7.x_or_newer 186 mpiexec --report-bindings -np 4 -rf x-linpc0_linpc1_sunpc1_tyr hostname
--------------------------------------------------------------------------
An invalid value was supplied for an enum variable.

  Variable     : hwloc_base_report_bindings
  Value        : 1,1
  Valid values : 0: f|false|disabled, 1: t|true|enabled
--------------------------------------------------------------------------
[tyr.informatik.hs-fulda.de:21651] MCW rank 3 bound to socket 1[core 1[hwt 0]]: [.][B]
tyr.informatik.hs-fulda.de
[linpc0:21338] MCW rank 0 is not bound (or bound to all available processors)
[linpc1:16906] MCW rank 1 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
linpc0
linpc1
sunpc1
tyr openmpi_1.7.x_or_newer 187 


tyr openmpi_1.7.x_or_newer 191 mpiexec --report-bindings -np 3 -rf x-linpc0_linpc1_tyr hostname
[tyr.informatik.hs-fulda.de:21685] MCW rank 2 bound to socket 1[core 1[hwt 0]]: [.][B]
tyr.informatik.hs-fulda.de
[linpc0:21607] MCW rank 0 is not bound (or bound to all available processors)
linpc0
[linpc1:17168] MCW rank 1 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
linpc1
tyr openmpi_1.7.x_or_newer 192 


tyr openmpi_1.7.x_or_newer 193 mpiexec --report-bindings -np 3 -rf x-linpc0_sunpc1_tyr hostname
[tyr.informatik.hs-fulda.de:21695] MCW rank 2 bound to socket 1[core 1[hwt 0]]: [.][B]
tyr.informatik.hs-fulda.de
[linpc0:21673] MCW rank 0 is not bound (or bound to all available processors)
linpc0
[sunpc1:25457] MCW rank 1 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
sunpc1
tyr openmpi_1.7.x_or_newer 194 


tyr openmpi_1.7.x_or_newer 195 mpiexec --report-bindings -np 3 -rf x-linpc0_linpc1_sunpc1 hostname
--------------------------------------------------------------------------
An invalid value was supplied for an enum variable.

  Variable     : hwloc_base_report_bindings
  Value        : 1,1
  Valid values : 0: f|false|disabled, 1: t|true|enabled
--------------------------------------------------------------------------
[linpc0:21743] MCW rank 0 is not bound (or bound to all available processors)
[linpc1:17240] MCW rank 1 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
linpc1
linpc0
sunpc1
tyr openmpi_1.7.x_or_newer 196 


tyr openmpi_1.7.x_or_newer 197 mpiexec --report-bindings -np 2 -rf x-linpc0_sunpc1 hostname
[linpc0:21836] MCW rank 0 is not bound (or bound to all available processors)
linpc0
[sunpc1:25521] MCW rank 1 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
sunpc1
tyr openmpi_1.7.x_or_newer 198 


tyr openmpi_1.7.x_or_newer 199 mpiexec --report-bindings -np 2 -rf x-linpc1_sunpc1 hostname
[linpc1:17335] MCW rank 0 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
linpc1
[sunpc1:25583] MCW rank 1 bound to socket 1[core 2[hwt 0]]: [./.][B/.]
sunpc1
tyr openmpi_1.7.x_or_newer 200 


I would be grateful if somebody could fix the problem. Can I provide
anything else? Thank you very much in advance for any help.


Kind regards

Siegmar
