I've been having similar issues with brand new FC5/6 and RHEL5 machines, but our FC4/RHEL4 machines are just fine. On the FC5/6 and RHEL5 machines, things run as root but not as a normal user. There must be some ACL or security setting that's enabled by default on the newer distros. If I figure it out this weekend, I'll let you know. If anyone else knows the solution, please post to the list.

-Mike

David Bronke wrote:
I've been trying to get OpenMPI working on two of the computers at a
lab I help administer, and I'm running into a rather large issue. When
running anything using mpirun as a normal user, I get the following
output:


$ mpirun --no-daemonize --host \
    localhost,localhost,localhost,localhost,localhost,localhost,localhost,localhost \
    /workspace/bronke/mpi/hello
mpirun noticed that job rank 0 with PID 0 on node "localhost" exited on signal 13.
[trixie:18104] ERROR: A daemon on node localhost failed to start as expected.
[trixie:18104] ERROR: There may be more information available from
[trixie:18104] ERROR: the remote shell (see above).
[trixie:18104] The daemon received a signal 13.
8 additional processes aborted (not shown)


However, running the same exact command line as root works fine:


$ sudo mpirun --no-daemonize --host \
    localhost,localhost,localhost,localhost,localhost,localhost,localhost,localhost \
    /workspace/bronke/mpi/hello
Password:
p is 8, my_rank is 0
p is 8, my_rank is 1
p is 8, my_rank is 2
p is 8, my_rank is 3
p is 8, my_rank is 6
p is 8, my_rank is 7
Greetings from process 1!

Greetings from process 2!

Greetings from process 3!

p is 8, my_rank is 5
p is 8, my_rank is 4
Greetings from process 4!

Greetings from process 5!

Greetings from process 6!

Greetings from process 7!


I've looked up signal 13 and found that it is apparently SIGPIPE. I
also found a thread on the LAM-MPI site:
http://www.lam-mpi.org/MailArchives/lam/2004/08/8486.php
However, that thread seems to indicate that the problem would be in
the application (/workspace/bronke/mpi/hello in this case), but there
are no pipes in use in this app, and the fact that it works as
expected as root doesn't fit that explanation either. I have tried
running mpirun with --verbose, and it doesn't show any more output
than without it, so I've hit a dead end on this issue. Does anyone
know how I can figure out what's going wrong, or how I can fix it?

Thanks!
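
On the signal 13 question above, here is a small Python sketch (not from the original post; the function names are made up for illustration). It looks up the symbolic name of a signal number and shows the usual way a process receives SIGPIPE: writing to a pipe whose reading end has already been closed. It assumes a Linux or other POSIX system, and it says nothing about why the daemon in the log above received the signal. It only illustrates the mechanism, which may help when deciding where to look.

```python
import signal
import subprocess
import sys


def signal_name(signum):
    """Return the symbolic name for a signal number on this platform."""
    return signal.Signals(signum).name


def child_killed_by_sigpipe():
    """Run a child that writes to a pipe with no reader.

    Returns the number of the signal that terminated the child.
    """
    child_code = (
        "import os, signal\n"
        # Python ignores SIGPIPE at startup; restore the default action
        # so the write below terminates the process, as it would in C.
        "signal.signal(signal.SIGPIPE, signal.SIG_DFL)\n"
        "r, w = os.pipe()\n"
        "os.close(r)\n"          # no reader is left on the pipe
        "os.write(w, b'x')\n"    # this write raises SIGPIPE
    )
    proc = subprocess.run([sys.executable, "-c", child_code])
    # subprocess reports "killed by signal N" as a return code of -N.
    return -proc.returncode


if __name__ == "__main__":
    print(signal_name(13))
    print(child_killed_by_sigpipe())
```

The same thing happens with sockets, so a program can be killed by SIGPIPE even if its own source never creates a pipe explicitly.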
