Hi,
I installed openmpi-1.9a1r27362 and my tests behave even worse than
with openmpi-1.9a1r27342. When I try the commands that I reported
in my email from September 18th, I now get a segmentation fault.
The following commands worked in openmpi-1.9a1r27342, but they
produce segmentation faults with "Address not mapped" in
openmpi-1.9a1r27362.
mpiexec -report-bindings -np 4 -bynode -bind-to hwthread \
-display-devel-map date
[rs0...:23490] MCW rank 1 bound to : [../B./../..][../../../..]
[rs0...:23490] MCW rank 2 bound to : [../../B./..][../../../..]
[rs0...:23490] MCW rank 3 bound to : [../../../B.][../../../..]
[rs0...:23490] MCW rank 0 bound to : [B./../../..][../../../..]
mpiexec -report-bindings -np 5 -map-by core -bind-to hwthread \
-display-devel-map date
[rs0...:23619] MCW rank 3 bound to : [../../../B.][../../../..]
[rs0...:23619] MCW rank 4 bound to : [../../../..][B./../../..]
[rs0...:23619] MCW rank 0 bound to : [B./../../..][../../../..]
[rs0...:23619] MCW rank 1 bound to : [../B./../..][../../../..]
[rs0...:23619] MCW rank 2 bound to : [../../B./..][../../../..]
mpiexec -report-bindings -np 4 -map-by hwthread -bind-to hwthread \
-display-devel-map date
[rs0...:23676] MCW rank 1 bound to : [.B/../../..][../../../..]
[rs0...:23676] MCW rank 2 bound to : [../B./../..][../../../..]
[rs0...:23676] MCW rank 3 bound to : [../.B/../..][../../../..]
[rs0...:23676] MCW rank 0 bound to : [B./../../..][../../../..]
mpiexec -report-bindings -np 2 -bind-to hwthread date
[rs0...:19704] MCW rank 0 bound to : [B./../../..][../../../..]
[rs0...:19704] MCW rank 1 bound to : [../B./../..][../../../..]
mpiexec -report-bindings -np 2 -map-by core -bind-to hwthread date
[rs0...:19793] MCW rank 0 bound to : [B./../../..][../../../..]
[rs0...:19793] MCW rank 1 bound to : [../B./../..][../../../..]
mpiexec -report-bindings -np 2 -map-by hwthread -bind-to hwthread date
[rs0...:19788] MCW rank 0 bound to : [B./../../..][../../../..]
[rs0...:19788] MCW rank 1 bound to : [.B/../../..][../../../..]
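For reference, the binding strings above can be decoded mechanically: each bracket group is a socket, "/"-separated fields are cores, and each character is one hardware thread ("B" = bound, "." = unbound). A small Python sketch (the helper name is my own, not part of Open MPI):

```python
def decode_binding(s):
    """Decode an Open MPI -report-bindings string such as
    '[B./../../..][../../../..]' into the list of bound PU indices.
    Each [...] group is a socket, '/' separates cores, and each
    character is one hardware thread ('B' = bound)."""
    bound = []
    pu = 0
    for socket in s.strip().strip('[]').split(']['):
        for core in socket.split('/'):
            for hwthread in core:
                if hwthread == 'B':
                    bound.append(pu)
                pu += 1
    return bound

# decode_binding('[../B./../..][../../../..]') -> [2]
```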
I still get segmentation faults with "Address not mapped" for the
following commands.
mpiexec -report-bindings -np 2 -map-by slot -bind-to hwthread date
mpiexec -report-bindings -np 2 -map-by numa -bind-to hwthread date
mpiexec -report-bindings -np 2 -map-by node -bind-to hwthread date
mpiexec -report-bindings -np 5 -bynode -bind-to hwthread \
-display-devel-map date
mpiexec -report-bindings -np 1 -map-by core -bind-to hwthread date
mpiexec -report-bindings -np 6 -map-by core -bind-to hwthread \
-display-devel-map date
mpiexec -report-bindings -np 1 -map-by socket -bind-to hwthread date
I no longer get bus errors for the following commands, but now I
get segmentation faults with "Address not mapped".
mpiexec -report-bindings -np 2 -bynode -bind-to hwthread date
mpiexec -report-bindings -np 2 -map-by socket -bind-to hwthread date
Now I get a bus error with the following commands.
mpiexec -report-bindings -np 3 -bind-to hwthread date
mpiexec -report-bindings -np 1 -map-by hwthread -bind-to hwthread date
mpiexec -report-bindings -np 5 -map-by hwthread -bind-to hwthread \
-display-devel-map date
The following command works.
mpiexec -report-bindings -np 1 -bynode -bind-to hwthread date
[rs0...:06795] MCW rank 0 bound to : [B./../../..][../../../..]
Why do I get output for "-bynode" but a bus error for
"-map-by node"? I thought the two were equivalent.
rs0 topo 114 mpiexec -report-bindings -np 1 -bynode \
-bind-to hwthread date
[rs0...:07108] MCW rank 0 bound to : [B./../../..][../../../..]
Wed Sep 26 12:23:11 CEST 2012
rs0 topo 115 mpiexec -report-bindings -np 1 -map-by node \
-bind-to hwthread date
[rs0:07113] *** Process received signal ***
[rs0:07113] Signal: Bus Error (10)
The output sometimes differs yet again when I add the option
"-mca ess_base_verbose 5": e.g., in error_5a.txt everything is fine,
while in error_5b.txt I get the bus error mentioned above. I have
attached all files to keep the email readable. Hopefully somebody
can find out what is wrong and fix the problem.
mpiexec -report-bindings -np 4 -bynode -bind-to hwthread \
-display-devel-map -mca ess_base_verbose 5 date >& error_1.txt
mpiexec -report-bindings -np 5 -map-by core -bind-to hwthread \
-display-devel-map -mca ess_base_verbose 5 date >& error_2.txt
mpiexec -report-bindings -np 2 -map-by hwthread -bind-to hwthread \
-mca ess_base_verbose 5 date >& error_3.txt
mpiexec -report-bindings -np 2 -map-by hwthread -bind-to hwthread \
-mca ess_base_verbose 5 date >& error_4.txt
mpiexec -report-bindings -np 1 -map-by node \
-bind-to hwthread -mca ess_base_verbose 5 date >& error_5a.txt
mpiexec -report-bindings -np 1 -map-by node \
-bind-to hwthread date >& error_5b.txt
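For reference, the Cpuset/Online/Allowed values in the attached topology dumps are hex bitmasks over the 16 hardware threads; such a mask can be expanded into PU indices with a few lines of Python (a sketch; the helper name is my own):

```python
def mask_to_pus(mask_str):
    """Expand a hwloc-style hex cpuset (e.g. '0x0000ffff') into the
    list of PU indices whose bit is set in the mask."""
    mask = int(mask_str, 16)
    return [i for i in range(mask.bit_length()) if mask >> i & 1]

# mask_to_pus('0x00000003') -> [0, 1]   (both hwthreads of core 0)
```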
Thank you very much in advance for any help.
Kind regards
Siegmar
> Please try and keep the User list on the messages - allows others
> to chime in.
>
> You can see the topology by adding "-mca ess_base_verbose 5" to
> your command line. You'll get other stuff as well, and you'll
> need to --enable-debug in your configure.
>
>
> On Sep 24, 2012, at 4:47 AM, Siegmar Gross <[email protected]> wrote:
>
> > Hi,
> >
> >> The 1.7 series has a completely different way of handling node
> >> topology than was used in the 1.6 series. It provides some
> >> enhanced features, but it does have some drawbacks in the case
> >> where the topology info isn't correct. I fear you are running
> >> into this problem (again).
> >>
> >> All the commands you show here work fine for me on a Linux
> >> x86_64 box using 1.7r27361 on a Westmere 6-core single-socket
> >> machine with hyperthreads enabled. I cannot replicate any of
> >> the reported problems, so there isn't much I can do at this point.
> >>
> >> As I've said before, the root problem here appears to be some
> >> hwloc-related issue with your setup. Until that gets resolved
> >> so we get correct topology info, I'm not sure what can be done
> >> to resolve what you are seeing. I'll raise the question of
> >> possibly providing some alternative support for setups like
> >> yours that just can't get topology info, but that would
> >> definitely be a long-term question.
> >
> > Can we check whether you get wrong topology info, or which info you
> > get at all? Can you tell me a file and location where I can print the
> > values of the relevant variables on my architecture? Perhaps that can
> > help to determine what goes wrong. I would use the latest trunk
> > tarball and could run the test a day later, because all changes on
> > our "installation server" are mirrored overnight to our file
> > server for all machines.
> >
> >
> > Kind regards
> >
> > Siegmar
> >
> >
> >
> >
> >> On Sep 23, 2012, at 3:20 AM, Siegmar Gross <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> Yesterday I installed openmpi-1.7a1r27358. It has an improved
> >>> error message compared to openmpi-1.6.2, but it doesn't show
> >>> process bindings and has some other problems as well.
> >>>
> >>>
> >>> "sunpc0" and "linpc0" are equipped with two dual-core processors and
> >>> run Solaris 10 x86_64 and Linux x86_64, respectively. "tyr" is a
> >>> dual-processor machine running Solaris 10 Sparc.
> >>>
> >>> tyr fd1026 105 mpiexec -np 2 -host sunpc0 -report-bindings \
> >>> -map-by core -bind-to-core date
> >>> Sun Sep 23 11:46:36 CEST 2012
> >>> Sun Sep 23 11:46:36 CEST 2012
> >>>
> >>> tyr fd1026 106 mpicc -showme
> >>> cc -I/usr/local/openmpi-1.7_64_cc/include -mt -m64
> >>> -L/usr/local/openmpi-1.7_64_cc/lib64 -lmpi -lpicl -lm -lkstat -llgrp
> >>> -lsocket -lnsl -lrt -lm
> >>>
> >>>
> >>> openmpi-1.6.2 shows process bindings.
> >>>
> >>> tyr fd1026 103 mpiexec -np 2 -host sunpc0 -report-bindings \
> >>> -bycore -bind-to-core date
> >>> Sun Sep 23 12:09:06 CEST 2012
> >>> [sunpc0:13197] MCW rank 0 bound to socket 0[core 0]: [B .][. .]
> >>> [sunpc0:13197] MCW rank 1 bound to socket 0[core 1]: [. B][. .]
> >>> Sun Sep 23 12:09:06 CEST 2012
> >>>
> >>>
> >>> tyr fd1026 104 mpicc -showme
> >>> cc -I/usr/local/openmpi-1.6.2_64_cc/include -mt -m64
> >>> -L/usr/local/openmpi-1.6.2_64_cc/lib64 -lmpi -lm -lkstat -llgrp
> >>> -lsocket -lnsl -lrt -lm
> >>>
> >>>
> >>> On my Linux machine I get a warning.
> >>>
> >>> tyr fd1026 113 mpiexec -np 2 -host linpc0 -report-bindings \
> >>> -map-by core -bind-to-core date
> >>> --------------------------------------------------------------------------
> >>> WARNING: a request was made to bind a process. While the system
> >>> supports binding the process itself, at least one node does NOT
> >>> support binding memory to the process location.
> >>>
> >>> Node: linpc0
> >>>
> >>> This is a warning only; your job will continue, though performance may
> >>> be degraded.
> >>> --------------------------------------------------------------------------
> >>> Sun Sep 23 11:56:04 CEST 2012
> >>> Sun Sep 23 11:56:04 CEST 2012
> >>>
> >>>
> >>>
> >>> Everything works fine with openmpi-1.6.2.
> >>>
> >>> tyr fd1026 106 mpiexec -np 2 -host linpc0 -report-bindings \
> >>> -bycore -bind-to-core date
> >>> [linpc0:15808] MCW rank 0 bound to socket 0[core 0]: [B .][. .]
> >>> [linpc0:15808] MCW rank 1 bound to socket 0[core 1]: [. B][. .]
> >>> Sun Sep 23 12:11:47 CEST 2012
> >>> Sun Sep 23 12:11:47 CEST 2012
> >>>
> >>>
> >>>
> >>>
> >>> On my Solaris Sparc machine I get the following errors.
> >>>
> >>>
> >>> tyr fd1026 121 mpiexec -np 2 -report-bindings -map-by core -bind-to-core date
> >>> [tyr.informatik.hs-fulda.de:23773] [[32457,0],0] ORTE_ERROR_LOG: Value out of bounds in file
> >>> ../../../../openmpi-1.7a1r27358/orte/mca/odls/base/odls_base_default_fns.c at line 847
> >>> [tyr.informatik.hs-fulda.de:23773] [[32457,0],0] ORTE_ERROR_LOG: Value out of bounds in file
> >>> ../../../../openmpi-1.7a1r27358/orte/mca/odls/base/odls_base_default_fns.c at line 1414
> >>> [tyr.informatik.hs-fulda.de:23773] [[32457,0],0] ORTE_ERROR_LOG: Value out of bounds in file
> >>> ../../../../openmpi-1.7a1r27358/orte/mca/odls/base/odls_base_default_fns.c at line 847
> >>> [tyr.informatik.hs-fulda.de:23773] [[32457,0],0] ORTE_ERROR_LOG: Value out of bounds in file
> >>> ../../../../openmpi-1.7a1r27358/orte/mca/odls/base/odls_base_default_fns.c at line 1414
> >>>
> >>>
> >>>
> >>> tyr fd1026 122 mpiexec -np 2 -host tyr -report-bindings -map-by core -bind-to core date
> >>> --------------------------------------------------------------------------
> >>> All nodes which are allocated for this job are already filled.
> >>> --------------------------------------------------------------------------
> >>>
> >>>
> >>> Once more everything works fine with openmpi-1.6.2.
> >>>
> >>> tyr fd1026 109 mpiexec -np 2 -report-bindings -bycore -bind-to-core date
> >>> [tyr.informatik.hs-fulda.de:23869] MCW rank 0 bound to socket 0[core 0]: [B][.]
> >>> [tyr.informatik.hs-fulda.de:23869] MCW rank 1 bound to socket 1[core 0]: [.][B]
> >>> Sun Sep 23 12:14:09 CEST 2012
> >>> Sun Sep 23 12:14:09 CEST 2012
> >>>
> >>> tyr fd1026 110 mpiexec -np 2 -host tyr -report-bindings -bycore -bind-to-core date
> >>> [tyr.informatik.hs-fulda.de:23877] MCW rank 0 bound to socket 0[core 0]: [B][.]
> >>> [tyr.informatik.hs-fulda.de:23877] MCW rank 1 bound to socket 1[core 0]: [.][B]
> >>> Sun Sep 23 12:16:05 CEST 2012
> >>> Sun Sep 23 12:16:05 CEST 2012
> >>>
> >>>
> >>> Kind regards
> >>>
> >>> Siegmar
> >>>
> >>> _______________________________________________
> >>> users mailing list
> >>> [email protected]
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>
> >>
> >
>
>
[rs0.informatik.hs-fulda.de:07146] mca:base:select:( ess) Querying component [env]
[rs0.informatik.hs-fulda.de:07146] mca:base:select:( ess) Skipping component [env]. Query failed to return a module
[rs0.informatik.hs-fulda.de:07146] mca:base:select:( ess) Querying component [hnp]
[rs0.informatik.hs-fulda.de:07146] mca:base:select:( ess) Query of component [hnp] set priority to 100
[rs0.informatik.hs-fulda.de:07146] mca:base:select:( ess) Querying component [singleton]
[rs0.informatik.hs-fulda.de:07146] mca:base:select:( ess) Skipping component [singleton]. Query failed to return a module
[rs0.informatik.hs-fulda.de:07146] mca:base:select:( ess) Querying component [tool]
[rs0.informatik.hs-fulda.de:07146] mca:base:select:( ess) Skipping component [tool]. Query failed to return a module
[rs0.informatik.hs-fulda.de:07146] mca:base:select:( ess) Selected component [hnp]
[rs0.informatik.hs-fulda.de:07146] [[INVALID],INVALID] Topology Info:
[rs0.informatik.hs-fulda.de:07146] Type: Machine Number of child objects: 1
Name=NULL
total=33554432KB
OSName=SunOS
OSRelease=5.10
OSVersion=Generic_147440-21
Architecture=sun4u
Cpuset: 0x0000ffff
Online: 0x0000ffff
Allowed: 0x0000ffff
Bind CPU proc: TRUE
Bind CPU thread: TRUE
Bind MEM proc: TRUE
Bind MEM thread: TRUE
Type: NUMANode Number of child objects: 2
Name=NULL
local=33554432KB
total=33554432KB
Cpuset: 0x0000ffff
Online: 0x0000ffff
Allowed: 0x0000ffff
Type: Socket Number of child objects: 4
Name=NULL
CPUType=sparcv9
CPUModel=SPARC64_VII
Cpuset: 0x000000ff
Online: 0x000000ff
Allowed: 0x000000ff
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x00000003
Online: 0x00000003
Allowed: 0x00000003
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000001
Online: 0x00000001
Allowed: 0x00000001
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000002
Online: 0x00000002
Allowed: 0x00000002
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x0000000c
Online: 0x0000000c
Allowed: 0x0000000c
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000004
Online: 0x00000004
Allowed: 0x00000004
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000008
Online: 0x00000008
Allowed: 0x00000008
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x00000030
Online: 0x00000030
Allowed: 0x00000030
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000010
Online: 0x00000010
Allowed: 0x00000010
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000020
Online: 0x00000020
Allowed: 0x00000020
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x000000c0
Online: 0x000000c0
Allowed: 0x000000c0
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000040
Online: 0x00000040
Allowed: 0x00000040
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000080
Online: 0x00000080
Allowed: 0x00000080
Type: Socket Number of child objects: 4
Name=NULL
CPUType=sparcv9
CPUModel=SPARC64_VII
Cpuset: 0x0000ff00
Online: 0x0000ff00
Allowed: 0x0000ff00
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x00000300
Online: 0x00000300
Allowed: 0x00000300
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000100
Online: 0x00000100
Allowed: 0x00000100
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000200
Online: 0x00000200
Allowed: 0x00000200
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x00000c00
Online: 0x00000c00
Allowed: 0x00000c00
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000400
Online: 0x00000400
Allowed: 0x00000400
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000800
Online: 0x00000800
Allowed: 0x00000800
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x00003000
Online: 0x00003000
Allowed: 0x00003000
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00001000
Online: 0x00001000
Allowed: 0x00001000
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00002000
Online: 0x00002000
Allowed: 0x00002000
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x0000c000
Online: 0x0000c000
Allowed: 0x0000c000
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00004000
Online: 0x00004000
Allowed: 0x00004000
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00008000
Online: 0x00008000
Allowed: 0x00008000
Mapper requested: NULL Last mapper: round_robin Mapping policy: BYNODE
Ranking policy: NODE Binding policy: HWTHREAD[HWTHREAD] Cpu set: NULL PPR: NULL
Num new daemons: 0 New daemon starting vpid INVALID
Num nodes: 1
Data for node: rs0.informatik.hs-fulda.de Launch id: -1 State: 2
Daemon: [[25421,0],0] Daemon launched: True
Num slots: 1 Slots in use: 1 Oversubscribed: TRUE
Num slots allocated: 1 Max slots: 0
Username on node: NULL
Num procs: 4 Next node_rank: 4
Data for proc: [[25421,1],0]
Pid: 0 Local rank: 0 Node rank: 0 App rank: 0
State: INITIALIZED Restarts: 0 App_context: 0 Locale: 0-15 Binding: 0[0]
Data for proc: [[25421,1],1]
Pid: 0 Local rank: 1 Node rank: 1 App rank: 1
State: INITIALIZED Restarts: 0 App_context: 0 Locale: 0-15 Binding: 2[2]
Data for proc: [[25421,1],2]
Pid: 0 Local rank: 2 Node rank: 2 App rank: 2
State: INITIALIZED Restarts: 0 App_context: 0 Locale: 0-15 Binding: 4[4]
Data for proc: [[25421,1],3]
Pid: 0 Local rank: 3 Node rank: 3 App rank: 3
State: INITIALIZED Restarts: 0 App_context: 0 Locale: 0-15 Binding: 6[6]
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 7150 on node
rs0.informatik.hs-fulda.de exited on signal 11 (Segmentation Fault).
--------------------------------------------------------------------------
Wed Sep 26 12:38:53 CEST 2012
Wed Sep 26 12:38:53 CEST 2012
[rs0:07154] *** Process received signal ***
[rs0:07154] Signal: Bus Error (10)
[rs0:07154] Signal code: Invalid address alignment (1)
[rs0:07154] Failing at address: 620900000019
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572eb0 [ Signal 2128894800 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07154] *** End of error message ***
[rs0.informatik.hs-fulda.de:07146] MCW rank 0 bound to : [B./../../..][../../../..]
[rs0:07150] *** Process received signal ***
[rs0:07150] Signal: Segmentation Fault (11)
[rs0:07150] Signal code: Address not mapped (1)
[rs0:07150] Failing at address: 8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572eb0 [ Signal 2128894752 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07150] *** End of error message ***
[rs0.informatik.hs-fulda.de:07146] MCW rank 2 bound to : [../../B./..][../../../..]
[rs0.informatik.hs-fulda.de:07155] mca:base:select:( ess) Querying component [env]
[rs0.informatik.hs-fulda.de:07155] mca:base:select:( ess) Skipping component [env]. Query failed to return a module
[rs0.informatik.hs-fulda.de:07155] mca:base:select:( ess) Querying component [hnp]
[rs0.informatik.hs-fulda.de:07155] mca:base:select:( ess) Query of component [hnp] set priority to 100
[rs0.informatik.hs-fulda.de:07155] mca:base:select:( ess) Querying component [singleton]
[rs0.informatik.hs-fulda.de:07155] mca:base:select:( ess) Skipping component [singleton]. Query failed to return a module
[rs0.informatik.hs-fulda.de:07155] mca:base:select:( ess) Querying component [tool]
[rs0.informatik.hs-fulda.de:07155] mca:base:select:( ess) Skipping component [tool]. Query failed to return a module
[rs0.informatik.hs-fulda.de:07155] mca:base:select:( ess) Selected component [hnp]
[rs0.informatik.hs-fulda.de:07155] [[INVALID],INVALID] Topology Info:
[rs0.informatik.hs-fulda.de:07155] Type: Machine Number of child objects: 1
Name=NULL
total=33554432KB
OSName=SunOS
OSRelease=5.10
OSVersion=Generic_147440-21
Architecture=sun4u
Cpuset: 0x0000ffff
Online: 0x0000ffff
Allowed: 0x0000ffff
Bind CPU proc: TRUE
Bind CPU thread: TRUE
Bind MEM proc: TRUE
Bind MEM thread: TRUE
Type: NUMANode Number of child objects: 2
Name=NULL
local=33554432KB
total=33554432KB
Cpuset: 0x0000ffff
Online: 0x0000ffff
Allowed: 0x0000ffff
Type: Socket Number of child objects: 4
Name=NULL
CPUType=sparcv9
CPUModel=SPARC64_VII
Cpuset: 0x000000ff
Online: 0x000000ff
Allowed: 0x000000ff
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x00000003
Online: 0x00000003
Allowed: 0x00000003
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000001
Online: 0x00000001
Allowed: 0x00000001
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000002
Online: 0x00000002
Allowed: 0x00000002
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x0000000c
Online: 0x0000000c
Allowed: 0x0000000c
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000004
Online: 0x00000004
Allowed: 0x00000004
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000008
Online: 0x00000008
Allowed: 0x00000008
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x00000030
Online: 0x00000030
Allowed: 0x00000030
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000010
Online: 0x00000010
Allowed: 0x00000010
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000020
Online: 0x00000020
Allowed: 0x00000020
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x000000c0
Online: 0x000000c0
Allowed: 0x000000c0
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000040
Online: 0x00000040
Allowed: 0x00000040
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000080
Online: 0x00000080
Allowed: 0x00000080
Type: Socket Number of child objects: 4
Name=NULL
CPUType=sparcv9
CPUModel=SPARC64_VII
Cpuset: 0x0000ff00
Online: 0x0000ff00
Allowed: 0x0000ff00
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x00000300
Online: 0x00000300
Allowed: 0x00000300
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000100
Online: 0x00000100
Allowed: 0x00000100
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000200
Online: 0x00000200
Allowed: 0x00000200
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x00000c00
Online: 0x00000c00
Allowed: 0x00000c00
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000400
Online: 0x00000400
Allowed: 0x00000400
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00000800
Online: 0x00000800
Allowed: 0x00000800
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x00003000
Online: 0x00003000
Allowed: 0x00003000
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00001000
Online: 0x00001000
Allowed: 0x00001000
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00002000
Online: 0x00002000
Allowed: 0x00002000
Type: Core Number of child objects: 2
Name=NULL
Cpuset: 0x0000c000
Online: 0x0000c000
Allowed: 0x0000c000
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00004000
Online: 0x00004000
Allowed: 0x00004000
Type: PU Number of child objects: 0
Name=NULL
Cpuset: 0x00008000
Online: 0x00008000
Allowed: 0x00008000
Mapper requested: NULL Last mapper: round_robin Mapping policy: BYCORE
Ranking policy: SLOT Binding policy: HWTHREAD[HWTHREAD] Cpu set: NULL PPR: NULL
Num new daemons: 0 New daemon starting vpid INVALID
Num nodes: 1
Data for node: rs0.informatik.hs-fulda.de Launch id: -1 State: 2
Daemon: [[25428,0],0] Daemon launched: True
Num slots: 1 Slots in use: 1 Oversubscribed: TRUE
Num slots allocated: 1 Max slots: 0
Username on node: NULL
Num procs: 5 Next node_rank: 5
Data for proc: [[25428,1],0]
Pid: 0 Local rank: 0 Node rank: 0 App rank: 0
State: INITIALIZED Restarts: 0 App_context: 0 Locale: 0-1 Binding: 0[0]
Data for proc: [[25428,1],1]
Pid: 0 Local rank: 1 Node rank: 1 App rank: 1
State: INITIALIZED Restarts: 0 App_context: 0 Locale: 2-3 Binding: 2[2]
Data for proc: [[25428,1],2]
Pid: 0 Local rank: 2 Node rank: 2 App rank: 2
State: INITIALIZED Restarts: 0 App_context: 0 Locale: 4-5 Binding: 4[4]
Data for proc: [[25428,1],3]
Pid: 0 Local rank: 3 Node rank: 3 App rank: 3
State: INITIALIZED Restarts: 0 App_context: 0 Locale: 6-7 Binding: 6[6]
Data for proc: [[25428,1],4]
Pid: 0 Local rank: 4 Node rank: 4 App rank: 4
State: INITIALIZED Restarts: 0 App_context: 0 Locale: 8-9 Binding: 8[8]
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 7157 on node
rs0.informatik.hs-fulda.de exited on signal 10 (Bus Error).
--------------------------------------------------------------------------
[rs0:07157] *** Process received signal ***
[rs0:07157] Signal: Bus Error (10)
[rs0:07157] Signal code: Invalid address alignment (1)
[rs0:07157] Failing at address: 284f4d50495f4d
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572eb0 [ Signal 2128894800 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07157] *** End of error message ***
[rs0:07159] *** Process received signal ***
[rs0:07159] Signal: Segmentation Fault (11)
[rs0:07159] Signal code: Address not mapped (1)
[rs0:07159] Failing at address: 2300009000008
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572eb0 [ Signal 2128894752 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07159] *** End of error message ***
[rs0:07161] *** Process received signal ***
[rs0:07161] Signal: Segmentation Fault (11)
[rs0:07161] Signal code: Address not mapped (1)
[rs0:07161] Failing at address: 900000001210e10
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572eb0 [ Signal 2128894752 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07161] *** End of error message ***
[rs0:07163] *** Process received signal ***
[rs0:07163] Signal: Bus Error (10)
[rs0:07163] Signal code: Invalid address alignment (1)
[rs0:07163] Failing at address: 9000000011f
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572eb0 [ Signal 2128894800 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07163] *** End of error message ***
[rs0:07165] *** Process received signal ***
[rs0:07165] Signal: Bus Error (10)
[rs0:07165] Signal code: Invalid address alignment (1)
[rs0:07165] Failing at address: 766572626f73655d
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572eb0 [ Signal 2128894800 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07165] *** End of error message ***
[rs0.informatik.hs-fulda.de:07166] mca:base:select:( ess) Querying component [env]
[rs0.informatik.hs-fulda.de:07166] mca:base:select:( ess) Skipping component [env]. Query failed to return a module
[rs0.informatik.hs-fulda.de:07166] mca:base:select:( ess) Querying component [hnp]
[rs0.informatik.hs-fulda.de:07166] mca:base:select:( ess) Query of component [hnp] set priority to 100
[rs0.informatik.hs-fulda.de:07166] mca:base:select:( ess) Querying component [singleton]
[rs0.informatik.hs-fulda.de:07166] mca:base:select:( ess) Skipping component [singleton]. Query failed to return a module
[rs0.informatik.hs-fulda.de:07166] mca:base:select:( ess) Querying component [tool]
[rs0.informatik.hs-fulda.de:07166] mca:base:select:( ess) Skipping component [tool]. Query failed to return a module
[rs0.informatik.hs-fulda.de:07166] mca:base:select:( ess) Selected component [hnp]
[rs0.informatik.hs-fulda.de:07166] [[INVALID],INVALID] Topology Info:
[rs0.informatik.hs-fulda.de:07166] Type: Machine Number of child objects: 1
    Name=NULL
    total=33554432KB
    OSName=SunOS
    OSRelease=5.10
    OSVersion=Generic_147440-21
    Architecture=sun4u
    Cpuset: 0x0000ffff
    Online: 0x0000ffff
    Allowed: 0x0000ffff
    Bind CPU proc: TRUE
    Bind CPU thread: TRUE
    Bind MEM proc: TRUE
    Bind MEM thread: TRUE
    Type: NUMANode Number of child objects: 2
        Name=NULL
        local=33554432KB
        total=33554432KB
        Cpuset: 0x0000ffff
        Online: 0x0000ffff
        Allowed: 0x0000ffff
        Type: Socket Number of child objects: 4
            Name=NULL
            CPUType=sparcv9
            CPUModel=SPARC64_VII
            Cpuset: 0x000000ff
            Online: 0x000000ff
            Allowed: 0x000000ff
            Type: Core Number of child objects: 2
                Name=NULL
                Cpuset: 0x00000003
                Online: 0x00000003
                Allowed: 0x00000003
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000001
                    Online: 0x00000001
                    Allowed: 0x00000001
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000002
                    Online: 0x00000002
                    Allowed: 0x00000002
            Type: Core Number of child objects: 2
                Name=NULL
                Cpuset: 0x0000000c
                Online: 0x0000000c
                Allowed: 0x0000000c
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000004
                    Online: 0x00000004
                    Allowed: 0x00000004
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000008
                    Online: 0x00000008
                    Allowed: 0x00000008
            Type: Core Number of child objects: 2
                Name=NULL
                Cpuset: 0x00000030
                Online: 0x00000030
                Allowed: 0x00000030
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000010
                    Online: 0x00000010
                    Allowed: 0x00000010
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000020
                    Online: 0x00000020
                    Allowed: 0x00000020
            Type: Core Number of child objects: 2
                Name=NULL
                Cpuset: 0x000000c0
                Online: 0x000000c0
                Allowed: 0x000000c0
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000040
                    Online: 0x00000040
                    Allowed: 0x00000040
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000080
                    Online: 0x00000080
                    Allowed: 0x00000080
        Type: Socket Number of child objects: 4
            Name=NULL
            CPUType=sparcv9
            CPUModel=SPARC64_VII
            Cpuset: 0x0000ff00
            Online: 0x0000ff00
            Allowed: 0x0000ff00
            Type: Core Number of child objects: 2
                Name=NULL
                Cpuset: 0x00000300
                Online: 0x00000300
                Allowed: 0x00000300
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000100
                    Online: 0x00000100
                    Allowed: 0x00000100
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000200
                    Online: 0x00000200
                    Allowed: 0x00000200
            Type: Core Number of child objects: 2
                Name=NULL
                Cpuset: 0x00000c00
                Online: 0x00000c00
                Allowed: 0x00000c00
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000400
                    Online: 0x00000400
                    Allowed: 0x00000400
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00000800
                    Online: 0x00000800
                    Allowed: 0x00000800
            Type: Core Number of child objects: 2
                Name=NULL
                Cpuset: 0x00003000
                Online: 0x00003000
                Allowed: 0x00003000
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00001000
                    Online: 0x00001000
                    Allowed: 0x00001000
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00002000
                    Online: 0x00002000
                    Allowed: 0x00002000
            Type: Core Number of child objects: 2
                Name=NULL
                Cpuset: 0x0000c000
                Online: 0x0000c000
                Allowed: 0x0000c000
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00004000
                    Online: 0x00004000
                    Allowed: 0x00004000
                Type: PU Number of child objects: 0
                    Name=NULL
                    Cpuset: 0x00008000
                    Online: 0x00008000
                    Allowed: 0x00008000
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 7168 on node
rs0.informatik.hs-fulda.de exited on signal 11 (Segmentation Fault).
--------------------------------------------------------------------------
[rs0:07168] *** Process received signal ***
[rs0:07168] Signal: Segmentation Fault (11)
[rs0:07168] Signal code: Invalid permissions (2)
[rs0:07168] Failing at address: ffffffff7ee32090
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572ecc [ Signal 2128894776 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07168] *** End of error message ***
[rs0:07170] *** Process received signal ***
[rs0:07170] Signal: Segmentation Fault (11)
[rs0:07170] Signal code: Address not mapped (1)
[rs0:07170] Failing at address: 0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572eb0 [ Signal 2128894752 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07170] *** End of error message ***
[... pid 07171: mca:base:select and Topology Info output identical to the pid 07166 output above, elided ...]
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 7173 on node
rs0.informatik.hs-fulda.de exited on signal 11 (Segmentation Fault).
--------------------------------------------------------------------------
[rs0:07173] *** Process received signal ***
[rs0:07173] Signal: Segmentation Fault (11)
[rs0:07173] Signal code: Invalid permissions (2)
[rs0:07173] Failing at address: ffffffff7ee32090
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572ecc [ Signal 2128894776 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07173] *** End of error message ***
[rs0:07175] *** Process received signal ***
[rs0:07175] Signal: Segmentation Fault (11)
[rs0:07175] Signal code: Address not mapped (1)
[rs0:07175] Failing at address: 0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572eb0 [ Signal 2128894752 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07175] *** End of error message ***
[... pid 07190: mca:base:select and Topology Info output identical to the pid 07166 output above, elided ...]
[rs0.informatik.hs-fulda.de:07190] MCW rank 0 bound to : [B./../../..][../../../..]
Wed Sep 26 12:43:09 CEST 2012
[rs0:07198] *** Process received signal ***
[rs0:07198] Signal: Bus Error (10)
[rs0:07198] Signal code: Invalid address alignment (1)
[rs0:07198] Failing at address: 3a
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_backtrace_print+0x14
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x503e30
/lib/sparcv9/libc.so.1:0xd8684
/lib/sparcv9/libc.so.1:0xcc1f8
/lib/sparcv9/libc.so.1:0xcc404
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x572eb0 [ Signal 2128894800 (?)]
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_hwloc_base_cset2str+0x64
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x126f8
/usr/local/openmpi-1.9_64_cc/lib64/openmpi/mca_odls_default.so:0x135f0
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:orte_odls_base_default_launch_local+0x1e6c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x53468c
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:0x5348b8
/usr/local/openmpi-1.9_64_cc/lib64/libopen-rte.so.0.0.0:opal_libevent2019_event_base_loop+0x1e8
/usr/local/openmpi-1.9_64_cc/bin/orterun:orterun+0x1ce4
/usr/local/openmpi-1.9_64_cc/bin/orterun:main+0x24
/usr/local/openmpi-1.9_64_cc/bin/orterun:_start+0x12c
[rs0:07198] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 7198 on node
rs0.informatik.hs-fulda.de exited on signal 10 (Bus Error).
--------------------------------------------------------------------------