Aha!! I found this in our users mailing list archives:

http://www.open-mpi.org/community/lists/users/2012/01/18091.php

Looks like this is a known compiler vectorization issue.
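
If it is indeed the same issue, two workarounds might be worth trying (just a sketch on my part, not verified on your setup - adjust paths, compiler names, and options to your install):

1. At run time, bypass the ptmalloc2 malloc interceptor entirely (same idea as the "-mca memory ^linux" suggestion below), e.g.:

   mpirun -np 2 -mca memory ^linux --mca btl openib,self ring_c

2. At build time, rebuild Open MPI with the Intel compilers but without the memory manager, so the interceptor code is never compiled in (if I'm remembering the configure option correctly):

   ./configure CC=icc CXX=icpc FC=ifort --without-memory-manager [your usual options]
   make all install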


On Jun 4, 2014, at 1:52 PM, Fischer, Greg A. <fisch...@westinghouse.com> wrote:

> Ralph,
> 
> Thanks for looking. Let me know if there's any other testing that I can do.
> 
> I recompiled with GCC and it works fine, so that lends credence to your 
> theory that it has something to do with the Intel compilers, and possibly 
> their interplay with SUSE.
> 
> Greg
> 
> -----Original Message-----
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Wednesday, June 04, 2014 4:48 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] intermittent segfaults with openib on ring_c.c
> 
> Urrrrrggg...unfortunately, the people who know the most about that code are 
> all at the MPI Forum this week, so we may not be able to fully address it 
> until their return. It looks like you are still going down into that malloc 
> interceptor, so I'm not correctly blocking it for you.
> 
> This run segfaulted in a completely different call in a different part of the 
> startup procedure - but in the same part of the interceptor, which makes me 
> suspicious. Don't know how much testing we've seen on SLES...
> 
> 
> On Jun 4, 2014, at 1:18 PM, Fischer, Greg A. <fisch...@westinghouse.com> 
> wrote:
> 
>> Ralph,
>> 
>> It segfaults. Here's the backtrace:
>> 
>> Core was generated by `ring_c'.
>> Program terminated with signal 11, Segmentation fault.
>> #0  opal_memory_ptmalloc2_int_malloc (av=0x2b82b5300020, bytes=47840385564856) at ../../../../../openmpi-1.8.1/opal/mca/memory/linux/malloc.c:4098
>> 4098          bck->fd = unsorted_chunks(av);
>> (gdb) bt
>> #0  opal_memory_ptmalloc2_int_malloc (av=0x2b82b5300020, bytes=47840385564856) at ../../../../../openmpi-1.8.1/opal/mca/memory/linux/malloc.c:4098
>> #1  0x00002b82b1a47e38 in opal_memory_ptmalloc2_malloc (bytes=47840385564704) at ../../../../../openmpi-1.8.1/opal/mca/memory/linux/malloc.c:3433
>> #2  0x00002b82b1a47b36 in opal_memory_linux_malloc_hook (sz=47840385564704, caller=0x2b82b53000b8) at ../../../../../openmpi-1.8.1/opal/mca/memory/linux/hooks.c:691
>> #3  0x00002b82b19e7b18 in opal_malloc (size=47840385564704, file=0x2b82b53000b8 "", line=12) at ../../../openmpi-1.8.1/opal/util/malloc.c:101
>> #4  0x00002b82b199c017 in opal_hash_table_set_value_uint64 (ht=0x2b82b5300020, key=47840385564856, value=0xc) at ../../openmpi-1.8.1/opal/class/opal_hash_table.c:283
>> #5  0x00002b82b170e4ca in process_uri (uri=0x2b82b5300020 "\001") at ../../../../openmpi-1.8.1/orte/mca/oob/base/oob_base_stubs.c:348
>> #6  0x00002b82b170e941 in orte_oob_base_set_addr (fd=-1255145440, args=184, cbdata=0xc) at ../../../../openmpi-1.8.1/orte/mca/oob/base/oob_base_stubs.c:296
>> #7  0x00002b82b19fba1c in event_process_active_single_queue (base=0x655480, activeq=0x654920) at ../../../../../../openmpi-1.8.1/opal/mca/event/libevent2021/libevent/event.c:1367
>> #8  0x00002b82b19fbcd9 in event_process_active (base=0x655480) at ../../../../../../openmpi-1.8.1/opal/mca/event/libevent2021/libevent/event.c:1437
>> #9  0x00002b82b19fc4c3 in opal_libevent2021_event_base_loop (base=0x655480, flags=1) at ../../../../../../openmpi-1.8.1/opal/mca/event/libevent2021/libevent/event.c:1645
>> #10 0x00002b82b16f8763 in orte_progress_thread_engine (obj=0x2b82b5300020) at ../../../../openmpi-1.8.1/orte/mca/ess/base/ess_base_std_app.c:456
>> #11 0x00002b82b0f1c7b6 in start_thread () from /lib64/libpthread.so.0
>> #12 0x00002b82b1410d6d in clone () from /lib64/libc.so.6
>> #13 0x0000000000000000 in ?? ()
>> 
>> Greg
>> 
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph
>> Castain
>> Sent: Wednesday, June 04, 2014 3:49 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] intermittent segfaults with openib on
>> ring_c.c
>> 
>> Sorry for delay - digging my way out of the backlog. This is very strange as 
>> you are failing in a simple asprintf call. We check that all the players are 
>> non-NULL, and it appears that you are failing to allocate the memory for the 
>> resulting (rather short) string.
>> 
>> I'm wondering if this is some strange interaction between SLES, the Intel 
>> compiler, and our malloc interceptor - or if there is some difference 
>> between the malloc libraries on the two machines. Let's try running it 
>> without the malloc interceptor and see if that helps.
>> 
>> Try running with "-mca memory ^linux" on your cmd line
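>> 
>> For example, matching your earlier invocation, something along the lines of:
>> 
>>   mpirun -np 2 -mca memory ^linux --mca btl openib,self ring_c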
>> 
>> 
>> On Jun 4, 2014, at 9:58 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> 
>>> He isn't getting that far - he's failing in MPI_Init when the RTE
>>> attempts to connect to the local daemon
>>> 
>>> 
>>> On Jun 4, 2014, at 9:53 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>> 
>>>> Hi Greg
>>>> 
>>>> From your original email:
>>>> 
>>>>>> [binf102:fischega] $ mpirun -np 2 --mca btl openib,self ring_c
>>>> 
>>>> This may not fix the problem,
>>>> but have you tried to add the shared memory btl to your mca parameter?
>>>> 
>>>> mpirun -np 2 --mca btl openib,sm,self ring_c
>>>> 
>>>> As far as I know, sm is the preferred transport layer for intra-node
>>>> communication.
>>>> 
>>>> Gus Correa
>>>> 
>>>> 
>>>> On 06/04/2014 11:13 AM, Ralph Castain wrote:
>>>>> Thanks!! Really appreciate your help - I'll try to figure out what
>>>>> went wrong and get back to you
>>>>> 
>>>>> On Jun 4, 2014, at 8:07 AM, Fischer, Greg A.
>>>>> <fisch...@westinghouse.com <mailto:fisch...@westinghouse.com>> wrote:
>>>>> 
>>>>>> I re-ran with 1 processor and got more information. How about this?
>>>>>> Core was generated by `ring_c'.
>>>>>> Program terminated with signal 11, Segmentation fault.
>>>>>> #0  opal_memory_ptmalloc2_int_malloc (av=0x2b48f6300020, bytes=47592367980728) at ../../../../../openmpi-1.8.1/opal/mca/memory/linux/malloc.c:4098
>>>>>> 4098          bck->fd = unsorted_chunks(av);
>>>>>> (gdb) bt
>>>>>> #0  opal_memory_ptmalloc2_int_malloc (av=0x2b48f6300020, bytes=47592367980728) at ../../../../../openmpi-1.8.1/opal/mca/memory/linux/malloc.c:4098
>>>>>> #1  0x00002b48f2a15e38 in opal_memory_ptmalloc2_malloc (bytes=47592367980576) at ../../../../../openmpi-1.8.1/opal/mca/memory/linux/malloc.c:3433
>>>>>> #2  0x00002b48f2a15b36 in opal_memory_linux_malloc_hook (sz=47592367980576, caller=0x2b48f63000b8) at ../../../../../openmpi-1.8.1/opal/mca/memory/linux/hooks.c:691
>>>>>> #3  0x00002b48f2374b90 in vasprintf () from /lib64/libc.so.6
>>>>>> #4  0x00002b48f2354148 in asprintf () from /lib64/libc.so.6
>>>>>> #5  0x00002b48f26dc7d1 in orte_oob_base_get_addr (uri=0x2b48f6300020) at ../../../../openmpi-1.8.1/orte/mca/oob/base/oob_base_stubs.c:234
>>>>>> #6  0x00002b48f53e7d4a in orte_rml_oob_get_uri () at ../../../../../openmpi-1.8.1/orte/mca/rml/oob/rml_oob_contact.c:36
>>>>>> #7  0x00002b48f26fa181 in orte_routed_base_register_sync (setup=32 ' ') at ../../../../openmpi-1.8.1/orte/mca/routed/base/routed_base_fns.c:301
>>>>>> #8  0x00002b48f4bbcccf in init_routes (job=4130340896, ndat=0x2b48f63000b8) at ../../../../../openmpi-1.8.1/orte/mca/routed/binomial/routed_binomial.c:705
>>>>>> #9  0x00002b48f26c615d in orte_ess_base_app_setup (db_restrict_local=32 ' ') at ../../../../openmpi-1.8.1/orte/mca/ess/base/ess_base_std_app.c:245
>>>>>> #10 0x00002b48f45b069f in rte_init () at ../../../../../openmpi-1.8.1/orte/mca/ess/env/ess_env_module.c:146
>>>>>> #11 0x00002b48f26935ab in orte_init (pargc=0x2b48f6300020, pargv=0x2b48f63000b8, flags=8) at ../../openmpi-1.8.1/orte/runtime/orte_init.c:148
>>>>>> #12 0x00002b48f1739d38 in ompi_mpi_init (argc=1, argv=0x7fffebf0d1f8, requested=8, provided=0x0) at ../../openmpi-1.8.1/ompi/runtime/ompi_mpi_init.c:464
>>>>>> #13 0x00002b48f1760a37 in PMPI_Init (argc=0x2b48f6300020, argv=0x2b48f63000b8) at pinit.c:84
>>>>>> #14 0x00000000004024ef in main (argc=1, argv=0x7fffebf0d1f8) at ring_c.c:19
>>>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
>>>>>> Sent: Wednesday, June 04, 2014 11:00 AM
>>>>>> To: Open MPI Users
>>>>>> Subject: Re: [OMPI users] intermittent segfaults with openib on ring_c.c
>>>>>> 
>>>>>> Does the trace go any further back? Your prior trace seemed to indicate an error in our OOB framework, but in a very basic place.
>>>>>> Looks like it could be an uninitialized variable, and having the line number down as deep as possible might help identify the source.
>>>>>> 
>>>>>> On Jun 4, 2014, at 7:55 AM, Fischer, Greg A. <fisch...@westinghouse.com <mailto:fisch...@westinghouse.com>> wrote:
>>>>>> 
>>>>>> 
>>>>>> Oops, ulimit was set improperly. I generated a core file, loaded
>>>>>> it in GDB, and ran a backtrace:
>>>>>> Core was generated by `ring_c'.
>>>>>> Program terminated with signal 11, Segmentation fault.
>>>>>> #0  opal_memory_ptmalloc2_int_malloc (av=0x2b8e4fd00020, bytes=47890224382136) at ../../../../../openmpi-1.8.1/opal/mca/memory/linux/malloc.c:4098
>>>>>> 4098          bck->fd = unsorted_chunks(av);
>>>>>> (gdb) bt
>>>>>> #0  opal_memory_ptmalloc2_int_malloc (av=0x2b8e4fd00020, bytes=47890224382136) at ../../../../../openmpi-1.8.1/opal/mca/memory/linux/malloc.c:4098
>>>>>> #1  0x0000000000000000 in ?? ()
>>>>>> Is that helpful?
>>>>>> Greg
>>>>>> From: Fischer, Greg A.
>>>>>> Sent: Wednesday, June 04, 2014 10:17 AM
>>>>>> To: 'Open MPI Users'
>>>>>> Cc: Fischer, Greg A.
>>>>>> Subject: RE: [OMPI users] intermittent segfaults with openib on ring_c.c
>>>>>> 
>>>>>> I recompiled with "-enable-debug" but it doesn't seem to be providing any more information or a core dump. I'm compiling ring_c.c with:
>>>>>> 
>>>>>> mpicc ring_c.c -g -traceback -o ring_c
>>>>>> 
>>>>>> and running with:
>>>>>> 
>>>>>> mpirun -np 4 --mca btl openib,self ring_c
>>>>>> 
>>>>>> and I'm getting:
>>>>>> [binf112:05845] *** Process received signal ***
>>>>>> [binf112:05845] Signal: Segmentation fault (11)
>>>>>> [binf112:05845] Signal code: Address not mapped (1)
>>>>>> [binf112:05845] Failing at address: 0x10
>>>>>> [binf112:05845] [ 0] /lib64/libpthread.so.0(+0xf7c0)[0x2b2fa44d57c0]
>>>>>> [binf112:05845] [ 1] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/libopen-pal.so.6(opal_memory_ptmalloc2_int_malloc+0x4b3)[0x2b2fa4ff2b03]
>>>>>> [binf112:05845] [ 2] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/libopen-pal.so.6(opal_memory_ptmalloc2_malloc+0x58)[0x2b2fa4ff5288]
>>>>>> [binf112:05845] [ 3] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/libopen-pal.so.6(+0xd1f86)[0x2b2fa4ff4f86]
>>>>>> [binf112:05845] [ 4] /lib64/libc.so.6(vasprintf+0x3e)[0x2b2fa4957a7e]
>>>>>> [binf112:05845] [ 5] /lib64/libc.so.6(asprintf+0x88)[0x2b2fa4937148]
>>>>>> [binf112:05845] [ 6] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/libopen-rte.so.7(orte_util_convert_process_name_to_string+0xe2)[0x2b2fa4c873e2]
>>>>>> [binf112:05845] [ 7] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/libopen-rte.so.7(orte_oob_base_get_addr+0x25)[0x2b2fa4cbdb15]
>>>>>> [binf112:05845] [ 8] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/openmpi/mca_rml_oob.so(orte_rml_oob_get_uri+0xa)[0x2b2fa79c5d2a]
>>>>>> [binf112:05845] [ 9] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/libopen-rte.so.7(orte_routed_base_register_sync+0x1fd)[0x2b2fa4cdae7d]
>>>>>> [binf112:05845] [10] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/openmpi/mca_routed_binomial.so(+0x3c7b)[0x2b2fa719bc7b]
>>>>>> [binf112:05845] [11] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/libopen-rte.so.7(orte_ess_base_app_setup+0x3ad)[0x2b2fa4ca7c8d]
>>>>>> [binf112:05845] [12] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/openmpi/mca_ess_env.so(+0x169f)[0x2b2fa6b8f69f]
>>>>>> [binf112:05845] [13] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/libopen-rte.so.7(orte_init+0x17b)[0x2b2fa4c764bb]
>>>>>> [binf112:05845] [14] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/libmpi.so.1(ompi_mpi_init+0x438)[0x2b2fa3d1e198]
>>>>>> [binf112:05845] [15] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.8.1/lib/libmpi.so.1(MPI_Init+0xf7)[0x2b2fa3d44947]
>>>>>> [binf112:05845] [16] ring_c[0x4024ef]
>>>>>> [binf112:05845] [17] /lib64/libc.so.6(__libc_start_main+0xe6)[0x2b2fa4906c36]
>>>>>> [binf112:05845] [18] ring_c[0x4023f9]
>>>>>> [binf112:05845] *** End of error message ***
>>>>>> --------------------------------------------------------------------------
>>>>>> mpirun noticed that process rank 3 with PID 5845 on node xxxx112 exited on signal 11 (Segmentation fault).
>>>>>> --------------------------------------------------------------------------
>>>>>> Does any of that help?
>>>>>> Greg
>>>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
>>>>>> Sent: Tuesday, June 03, 2014 11:54 PM
>>>>>> To: Open MPI Users
>>>>>> Subject: Re: [OMPI users] intermittent segfaults with openib on ring_c.c
>>>>>> 
>>>>>> Sounds odd - can you configure OMPI --enable-debug and run it again?
>>>>>> If it fails and you can get a core dump, could you tell us the
>>>>>> line number where it is failing?
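>>>>>> 
>>>>>> Roughly, the rebuild would look something like this (a sketch only - your exact configure options will differ):
>>>>>> 
>>>>>>   ./configure --prefix=<install dir> --enable-debug CC=icc CXX=icpc FC=ifort
>>>>>>   make all install
>>>>>> 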
>>>>>> On Jun 3, 2014, at 9:58 AM, Fischer, Greg A.
>>>>>> <fisch...@westinghouse.com <mailto:fisch...@westinghouse.com>> wrote:
>>>>>> 
>>>>>> Apologies - I forgot to add some of the information requested by the FAQ:
>>>>>> 
>>>>>> 1. OpenFabrics is provided by the Linux distribution:
>>>>>> 
>>>>>> [binf102:fischega] $ rpm -qa | grep ofed
>>>>>> ofed-kmp-default-1.5.4.1_3.0.76_0.11-0.11.5
>>>>>> ofed-1.5.4.1-0.11.5
>>>>>> ofed-doc-1.5.4.1-0.11.5
>>>>>> 
>>>>>> 
>>>>>> 2. Linux Distro / Kernel:
>>>>>> 
>>>>>> [binf102:fischega] $ cat /etc/SuSE-release
>>>>>> SUSE Linux Enterprise Server 11 (x86_64)
>>>>>> VERSION = 11
>>>>>> PATCHLEVEL = 3
>>>>>> 
>>>>>> [binf102:fischega] $ uname -a
>>>>>> Linux xxxx102 3.0.76-0.11-default #1 SMP Fri Jun 14 08:21:43 UTC 2013 (ccab990) x86_64 x86_64 x86_64 GNU/Linux
>>>>>> 
>>>>>> 
>>>>>> 3. Not sure which subnet manager is being used - I think OpenSM, but I'll need to check with my administrators.
>>>>>> 
>>>>>> 
>>>>>> 4. Output of ibv_devinfo is attached.
>>>>>> 
>>>>>> 
>>>>>> 5. ifconfig output is attached.
>>>>>> 
>>>>>> 
>>>>>> 6. ulimit -l output:
>>>>>> 
>>>>>> [binf102:fischega] $ ulimit -l
>>>>>> unlimited
>>>>>> 
>>>>>> Greg
>>>>>> 
>>>>>> 
>>>>>> From: Fischer, Greg A.
>>>>>> Sent: Tuesday, June 03, 2014 12:38 PM
>>>>>> To: Open MPI Users
>>>>>> Cc: Fischer, Greg A.
>>>>>> Subject: intermittent segfaults with openib on ring_c.c
>>>>>> 
>>>>>> Hello openmpi-users,
>>>>>> 
>>>>>> I'm running into a perplexing problem on a new system, whereby I'm experiencing intermittent segmentation faults when I run the ring_c.c example and use the openib BTL. See an example below. Approximately 50% of the time it provides the expected output, but the other 50% of the time, it segfaults. LD_LIBRARY_PATH is set correctly, and the version of "mpirun" being invoked is correct. The output of ompi_info -all is attached.
>>>>>> 
>>>>>> One potential problem may be that the system that OpenMPI was compiled on is mostly the same as the system where it is being executed, but there are some differences in the installed packages. I've checked the critical ones (libibverbs, librdmacm, libmlx4-rdmav2, etc.), and they appear to be the same.
>>>>>> 
>>>>>> Can anyone suggest how I might start tracking this problem down?
>>>>>> 
>>>>>> Thanks,
>>>>>> Greg
>>>>>> [binf102:fischega] $ mpirun -np 2 --mca btl openib,self ring_c
>>>>>> [binf102:31268] *** Process received signal ***
>>>>>> [binf102:31268] Signal: Segmentation fault (11)
>>>>>> [binf102:31268] Signal code: Address not mapped (1)
>>>>>> [binf102:31268] Failing at address: 0x10
>>>>>> [binf102:31268] [ 0] /lib64/libpthread.so.0(+0xf7c0) [0x2b42213f57c0]
>>>>>> [binf102:31268] [ 1] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.6.5/lib/libmpi.so.1(opal_memory_ptmalloc2_int_malloc+0x4b3) [0x2b42203fd7e3]
>>>>>> [binf102:31268] [ 2] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.6.5/lib/libmpi.so.1(opal_memory_ptmalloc2_int_memalign+0x8b) [0x2b4220400d3b]
>>>>>> [binf102:31268] [ 3] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.6.5/lib/libmpi.so.1(opal_memory_ptmalloc2_memalign+0x6f) [0x2b42204008ef]
>>>>>> [binf102:31268] [ 4] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.6.5/lib/libmpi.so.1(+0x117876) [0x2b4220400876]
>>>>>> [binf102:31268] [ 5] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.6.5/lib/openmpi/mca_btl_openib.so(+0xc34c) [0x2b422572334c]
>>>>>> [binf102:31268] [ 6] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.6.5/lib/libmpi.so.1(opal_class_initialize+0xaa) [0x2b422041d64a]
>>>>>> [binf102:31268] [ 7] /xxxx/yyyy_ib/intel-12.1.0.233/toolset/openmpi-1.6.5/lib/openmpi/mca_btl_openib.so(+0x1f12f) [0x2b422573612f]
>>>>>> [binf102:31268] [ 8] /lib64/libpthread.so.0(+0x77b6) [0x2b42213ed7b6]
>>>>>> [binf102:31268] [ 9] /lib64/libc.so.6(clone+0x6d) [0x2b42216dcd6d]
>>>>>> [binf102:31268] *** End of error message ***
>>>>>> --------------------------------------------------------------------------
>>>>>> mpirun noticed that process rank 0 with PID 31268 on node xxxx102 exited on signal 11 (Segmentation fault).
>>>>>> --------------------------------------------------------------------------
>>>>>> <ibv_devinfo.txt><ifconfig.txt>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> us...@open-mpi.org <mailto:us...@open-mpi.org>
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>> 
>>>> 
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> 
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
