It's not hard to test whether or not SELinux is the problem. You can turn SELinux enforcement off on the command line with this command:

setenforce 0

Of course, you need to be root in order to do this. After turning enforcement off, try to reproduce the error. If it still occurs, SELinux is not the cause and the problem is elsewhere; if it goes away, SELinux is the problem. When you're done, you can re-enable enforcement with

setenforce 1

If you're running your job across multiple nodes, you should disable SELinux enforcement on all of them for testing.

Did you compile/install Open MPI yourself? If so, I suspect that the SELinux context labels on your MPI binaries are incorrect.

If you use the method above to determine that SELinux is the problem, please post your results here and I may be able to help you set things right. I have some experience with SELinux problems like this, but I'm not exactly an expert.

--
Prentice

On 03/17/2011 11:01 AM, Jeff Squyres wrote:
> Sorry for the delayed reply.
>
> I'm afraid I haven't done much with SE Linux -- I don't know if there are any
> "gotchas" that would show up there. SE Linux support is not something we've
> gotten a lot of request for. I doubt that anyone in the community has done
> much testing in this area. :-\
>
> I suspect that Open MPI is trying to access something that your user (under
> SE Linux) doesn't have permission to.
>
> So I'm afraid I don't have much of an answer for you -- sorry! If you do
> figure it out, though, if a fix is not too intrusive, we can probably
> incorporate it upstream.
>
>
> On Mar 4, 2011, at 7:31 AM, Youri LACAN-BARTLEY wrote:
>
>> Hi,
>>
>> This is my first post to this mailing-list so I apologize for maybe being a
>> little rough on the edges.
>> I’ve been digging into OpenMPI for a little while now and have come across
>> one issue that I just can’t explain and I’m sincerely hoping someone can put
>> me on the right track here.
>>
>> I’m using a fresh install of openmpi-1.2.7 and I systematically get a
>> segmentation fault at the end of my mpirun calls if I’m logged in as a
>> regular user.
>> However, as soon as I switch to the root account, the segfault does not
>> appear.
>> The jobs actually run to their term but I just can’t find a good reason for
>> this to be happening and I haven’t been able to reproduce the problem on
>> another machine.
>>
>> Any help or tips would be greatly appreciated.
>>
>> Thanks,
>>
>> Youri LACAN-BARTLEY
>>
>> Here’s an example running osu_latency locally (I’ve “blacklisted” openib to
>> make sure it’s not to blame):
>>
>> [user@server ~]$ mpirun --mca btl ^openib -np 2
>> /opt/scripts/osu_latency-openmpi-1.2.7
>> # OSU MPI Latency Test v3.3
>> # Size        Latency (us)
>> 0                     0.76
>> 1                     0.89
>> 2                     0.89
>> 4                     0.89
>> 8                     0.89
>> 16                    0.91
>> 32                    0.91
>> 64                    0.92
>> 128                   0.96
>> 256                   1.13
>> 512                   1.31
>> 1024                  1.69
>> 2048                  2.51
>> 4096                  5.34
>> 8192                  9.16
>> 16384                17.47
>> 32768                31.79
>> 65536                51.10
>> 131072               92.41
>> 262144              181.74
>> 524288              512.26
>> 1048576            1238.21
>> 2097152            2280.28
>> 4194304            4616.67
>> [server:15586] *** Process received signal ***
>> [server:15586] Signal: Segmentation fault (11)
>> [server:15586] Signal code: Address not mapped (1)
>> [server:15586] Failing at address: (nil)
>> [server:15586] [ 0] /lib64/libpthread.so.0 [0x3cd1e0eb10]
>> [server:15586] [ 1] /lib64/libc.so.6 [0x3cd166fdc9]
>> [server:15586] [ 2] /lib64/libc.so.6(__libc_malloc+0x167) [0x3cd1674dd7]
>> [server:15586] [ 3] /lib64/ld-linux-x86-64.so.2(__tls_get_addr+0xb1)
>> [0x3cd120fe61]
>> [server:15586] [ 4] /lib64/libselinux.so.1 [0x3cd320f5cc]
>> [server:15586] [ 5] /lib64/libselinux.so.1 [0x3cd32045df]
>> [server:15586] *** End of error message ***
>> [server:15587] *** Process received signal ***
>> [server:15587] Signal: Segmentation fault (11)
>> [server:15587] Signal code: Address not mapped (1)
>> [server:15587] Failing at address: (nil)
>> [server:15587] [ 0] /lib64/libpthread.so.0 [0x3cd1e0eb10]
>> [server:15587] [ 1] /lib64/libc.so.6 [0x3cd166fdc9]
>> [server:15587] [ 2] /lib64/libc.so.6(__libc_malloc+0x167) [0x3cd1674dd7]
>> [server:15587] [ 3] /lib64/ld-linux-x86-64.so.2(__tls_get_addr+0xb1)
>> [0x3cd120fe61]
>> [server:15587] [ 4] /lib64/libselinux.so.1 [0x3cd320f5cc]
>> [server:15587] [ 5] /lib64/libselinux.so.1 [0x3cd32045df]
>> [server:15587] *** End of error message ***
>> mpirun noticed that job rank 0 with PID 15586 on node server exited on
>> signal 11 (Segmentation fault).
>> 1 additional process aborted (not shown)
>> [server:15583] *** Process received signal ***
>> [server:15583] Signal: Segmentation fault (11)
>> [server:15583] Signal code: Address not mapped (1)
>> [server:15583] Failing at address: (nil)
>> [server:15583] [ 0] /lib64/libpthread.so.0 [0x3cd1e0eb10]
>> [server:15583] [ 1] /lib64/libc.so.6 [0x3cd166fdc9]
>> [server:15583] [ 2] /lib64/libc.so.6(__libc_malloc+0x167) [0x3cd1674dd7]
>> [server:15583] [ 3] /lib64/ld-linux-x86-64.so.2(__tls_get_addr+0xb1)
>> [0x3cd120fe61]
>> [server:15583] [ 4] /lib64/libselinux.so.1 [0x3cd320f5cc]
>> [server:15583] [ 5] /lib64/libselinux.so.1 [0x3cd32045df]
>> [server:15583] *** End of error message ***
>> Segmentation fault
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
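
P.S. The setenforce test described at the top of this message could be scripted roughly as follows. This is only a sketch under a few assumptions: it is run as root, SELinux starts out in enforcing mode, and the failing job is launched as an unprivileged user (the segfault reported in the quoted message does not appear for root). The names verdict, selinux_test and TEST_USER are made up for illustration; getenforce, setenforce and su are the only real commands it relies on.

```shell
#!/bin/sh
# Sketch only: run as root on a node where the failure appears.
# verdict, selinux_test and TEST_USER are illustrative names, not part of
# Open MPI or of the SELinux tools.

# Turn two exit statuses (enforcing run, permissive run) into a conclusion.
verdict() {
    if [ "$1" -eq 0 ]; then
        echo "could not reproduce the failure"
    elif [ "$2" -eq 0 ]; then
        echo "SELinux is the likely cause"
    else
        echo "failure persists with SELinux permissive; look elsewhere"
    fi
}

# Run the same command twice as $TEST_USER: once enforcing, once permissive.
selinux_test() {
    if [ "$(getenforce)" != "Enforcing" ]; then
        echo "SELinux is not enforcing; nothing to compare" >&2
        return 1
    fi
    su "$TEST_USER" -c "$1"; enforcing_rc=$?
    setenforce 0 || return 1      # permissive: policy is no longer enforced
    su "$TEST_USER" -c "$1"; permissive_rc=$?
    setenforce 1                  # switch enforcement back on afterwards
    verdict "$enforcing_rc" "$permissive_rc"
}
```

After sourcing the file, something like TEST_USER=youri selinux_test 'mpirun -np 2 ./osu_latency' would run the comparison on one node (the user name and path here are placeholders). For a multi-node job the same toggle would have to be applied on every node before the permissive run.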