Hi Andria

The problem is one of permissions: the bind() calls on cisitu02 fail with "Permission denied (13)", which means your system has been set up so that only root has permission to open a TCP socket. I don't know what system you are running, so you might want to talk to your system admin, or to someone knowledgeable about that operating system, and ask how to change the required permissions.

Ralph


On Mar 17, 2009, at 3:12 AM, -andria- wrote:

Dear all,

I am still learning how to create parallel programs with Open MPI.

I tried to run an mpihello program on my cluster, but it gives an error when it is executed as an ordinary (public) user. However, it gives the correct result when it is run by the root user.

Why does this happen? How can it be solved?

Attached you can find the output of ompi_info --all.

The code:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
   int numprocs, rank, namelen;
   char processor_name[MPI_MAX_PROCESSOR_NAME];

   MPI_Init(&argc, &argv);
   MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   MPI_Get_processor_name(processor_name, &namelen);
   printf("Process %d on %s out of %d\n", rank, processor_name, numprocs);
   MPI_Finalize();

   return 0;
}

output:
[public@cisitu01 ~]$ mpicc mpihello.c -o mpihello

### as public ###
[public@cisitu01 ~]$ mpirun -np 4 -hostfile nodes.lst mpihello
[cisitu02:02897] mca_oob_tcp_create_listen: bind() failed: Permission denied (13)
[cisitu02:02897] mca_oob_tcp_init: unable to create listen socket
[cisitu02:02898] mca_oob_tcp_create_listen: bind() failed: Permission denied (13)
[cisitu02:02898] mca_oob_tcp_init: unable to create listen socket
[cisitu02][0,1,1][btl_tcp_component.c:412:mca_btl_tcp_component_create_listen] bind() failed with errno=13
[cisitu02][0,1,3][btl_tcp_component.c:412:mca_btl_tcp_component_create_listen] bind() failed with errno=13
[cisitu02:02897] [0,1,1] ORTE_ERROR_LOG: Not found in file gpr_proxy_deliver_notify_msg.c at line 139
[cisitu02:02898] [0,1,3] ORTE_ERROR_LOG: Not found in file gpr_proxy_deliver_notify_msg.c at line 139
^Cmpirun: killing job...

mpirun noticed that job rank 0 with PID 2976 on node cisitu01 exited on signal 15 (Terminated).
3 additional processes aborted (not shown)

### as root ###
-bash-3.2# mpirun -np 4 -hostfile nodes.lst mpihello
Process 0 on cisitu01 out of 4
Process 1 on cisitu02 out of 4
Process 3 on cisitu02 out of 4
Process 2 on cisitu01 out of 4
-bash-3.2#

Thank you in advance,

Regards,
-andria
<ompi_info.all>
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
