Sorry for the delayed reply.

I'm afraid I haven't done much with SE Linux -- I don't know if there are any 
"gotchas" that would show up there.  SE Linux support is not something we've 
gotten a lot of requests for.  I doubt that anyone in the community has done 
much testing in this area.  :-\

I suspect that Open MPI is trying to access something that your user (under SE 
Linux) doesn't have permission to.  
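
The backtrace quoted below ends in libselinux.so.1 frames, so a first step 
might be to confirm which mode SE Linux is in and whether it logged any 
denials around the time of the crash.  The following is only a rough sketch, 
not something tested with Open MPI: the selinux_mode helper is a hypothetical 
name, and it assumes the standard getenforce and ausearch tools are installed 
(ausearch normally also needs root and a running auditd).

```shell
#!/bin/sh
# Hypothetical helper: report the current SELinux mode, or "unavailable"
# when the SELinux userland tools are not installed on this machine.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce            # prints Enforcing, Permissive, or Disabled
    else
        echo "unavailable"
    fi
}

mode=$(selinux_mode)
echo "SELinux mode: $mode"

# When enforcing, recent AVC denials may show what was blocked.
# Assumption: the audit log is readable by the invoking user (usually root).
if [ "$mode" = "Enforcing" ] && command -v ausearch >/dev/null 2>&1; then
    ausearch -m avc -ts recent 2>/dev/null \
        || echo "no recent AVC denials found (or audit log not readable)"
fi
```

If the mode is Enforcing, re-running the same mpirun command as the regular 
user after temporarily switching to permissive mode (setenforce 0 as root, 
then setenforce 1 to restore it) would be one way to see whether the segfault 
depends on SE Linux policy at all.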

So I'm afraid I don't have much of an answer for you -- sorry!  If you do 
figure it out and the fix is not too intrusive, we can probably 
incorporate it upstream.


On Mar 4, 2011, at 7:31 AM, Youri LACAN-BARTLEY wrote:

> Hi,
>  
> This is my first post to this mailing list, so I apologize for maybe being a 
> little rough around the edges.
> I’ve been digging into OpenMPI for a little while now and have come across 
> one issue that I just can’t explain and I’m sincerely hoping someone can put 
> me on the right track here.
>  
> I’m using a fresh install of openmpi-1.2.7 and I systematically get a 
> segmentation fault at the end of my mpirun calls if I’m logged in as a 
> regular user.
> However, as soon as I switch to the root account, the segfault does not 
> appear.
> The jobs actually run to completion but I just can’t find a good reason for 
> this to be happening and I haven’t been able to reproduce the problem on 
> another machine.
>  
> Any help or tips would be greatly appreciated.
>  
> Thanks,
>  
> Youri LACAN-BARTLEY
>  
> Here’s an example running osu_latency locally (I’ve “blacklisted” openib to 
> make sure it’s not to blame):
>  
> [user@server ~]$ mpirun --mca btl ^openib  -np 2 /opt/scripts/osu_latency-openmpi-1.2.7
> # OSU MPI Latency Test v3.3
> # Size            Latency (us)
> 0                         0.76
> 1                         0.89
> 2                         0.89
> 4                         0.89
> 8                         0.89
> 16                        0.91
> 32                        0.91
> 64                        0.92
> 128                       0.96
> 256                       1.13
> 512                       1.31
> 1024                      1.69
> 2048                      2.51
> 4096                      5.34
> 8192                      9.16
> 16384                    17.47
> 32768                    31.79
> 65536                    51.10
> 131072                   92.41
> 262144                  181.74
> 524288                  512.26
> 1048576                1238.21
> 2097152                2280.28
> 4194304                4616.67
> [server:15586] *** Process received signal ***
> [server:15586] Signal: Segmentation fault (11)
> [server:15586] Signal code: Address not mapped (1)
> [server:15586] Failing at address: (nil)
> [server:15586] [ 0] /lib64/libpthread.so.0 [0x3cd1e0eb10]
> [server:15586] [ 1] /lib64/libc.so.6 [0x3cd166fdc9]
> [server:15586] [ 2] /lib64/libc.so.6(__libc_malloc+0x167) [0x3cd1674dd7]
> [server:15586] [ 3] /lib64/ld-linux-x86-64.so.2(__tls_get_addr+0xb1) [0x3cd120fe61]
> [server:15586] [ 4] /lib64/libselinux.so.1 [0x3cd320f5cc]
> [server:15586] [ 5] /lib64/libselinux.so.1 [0x3cd32045df]
> [server:15586] *** End of error message ***
> [server:15587] *** Process received signal ***
> [server:15587] Signal: Segmentation fault (11)
> [server:15587] Signal code: Address not mapped (1)
> [server:15587] Failing at address: (nil)
> [server:15587] [ 0] /lib64/libpthread.so.0 [0x3cd1e0eb10]
> [server:15587] [ 1] /lib64/libc.so.6 [0x3cd166fdc9]
> [server:15587] [ 2] /lib64/libc.so.6(__libc_malloc+0x167) [0x3cd1674dd7]
> [server:15587] [ 3] /lib64/ld-linux-x86-64.so.2(__tls_get_addr+0xb1) [0x3cd120fe61]
> [server:15587] [ 4] /lib64/libselinux.so.1 [0x3cd320f5cc]
> [server:15587] [ 5] /lib64/libselinux.so.1 [0x3cd32045df]
> [server:15587] *** End of error message ***
> mpirun noticed that job rank 0 with PID 15586 on node server exited on signal 11 (Segmentation fault).
> 1 additional process aborted (not shown)
> [server:15583] *** Process received signal ***
> [server:15583] Signal: Segmentation fault (11)
> [server:15583] Signal code: Address not mapped (1)
> [server:15583] Failing at address: (nil)
> [server:15583] [ 0] /lib64/libpthread.so.0 [0x3cd1e0eb10]
> [server:15583] [ 1] /lib64/libc.so.6 [0x3cd166fdc9]
> [server:15583] [ 2] /lib64/libc.so.6(__libc_malloc+0x167) [0x3cd1674dd7]
> [server:15583] [ 3] /lib64/ld-linux-x86-64.so.2(__tls_get_addr+0xb1) [0x3cd120fe61]
> [server:15583] [ 4] /lib64/libselinux.so.1 [0x3cd320f5cc]
> [server:15583] [ 5] /lib64/libselinux.so.1 [0x3cd32045df]
> [server:15583] *** End of error message ***
> Segmentation fault


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

