Building as root is a bad idea. Try building it as a regular user, using sudo make install if necessary.
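A minimal sketch of such a rebuild (the /usr/local/openmpi prefix is taken from the ompi_info output below, and --enable-debug is included only because it comes up later in this thread):

    # configure and build as an ordinary user, not as root
    tar xjf openmpi-1.6.5.tar.bz2
    cd openmpi-1.6.5
    ./configure --prefix=/usr/local/openmpi --enable-debug
    make -j 4
    # only the install step needs elevated privileges
    sudo make install

Repeat this on every VM (or install onto a shared filesystem) so that all nodes run the same build.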
Doug Reeder

On Mar 28, 2015, at 4:53 PM, LOTFIFAR F. <foad.lotfi...@durham.ac.uk> wrote:

> When you said --enable-debug is not activated, I installed it again to make sure. I have only one MPI installed on all the VMs.
>
> FYI: I have just tried MPICH to see how it works. It freezes for a few minutes, then comes back with an error complaining about the firewall! By the way, I already have the firewall disabled and iptables is set to allow all connections. I checked with the system admin and there is no other firewall between the nodes.
>
> Here is the output you asked for:
>
> ubuntu@fehg-node-0:~$ which mpirun
> /usr/local/openmpi/bin/mpirun
>
> ubuntu@fehg-node-0:~$ ompi_info
> Package: Open MPI ubuntu@fehg-node-0 Distribution
> Open MPI: 1.6.5
> Open MPI SVN revision: r28673
> Open MPI release date: Jun 26, 2013
> Open RTE: 1.6.5
> Open RTE SVN revision: r28673
> Open RTE release date: Jun 26, 2013
> OPAL: 1.6.5
> OPAL SVN revision: r28673
> OPAL release date: Jun 26, 2013
> MPI API: 2.1
> Ident string: 1.6.5
> Prefix: /usr/local/openmpi
> Configured architecture: i686-pc-linux-gnu
> Configure host: fehg-node-0
> Configured by: ubuntu
> Configured on: Sat Mar 28 20:19:28 UTC 2015
> Configure host: fehg-node-0
> Built by: root
> Built on: Sat Mar 28 20:30:18 UTC 2015
> Built host: fehg-node-0
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: no
> Fortran90 bindings: no
> Fortran90 bindings size: na
> C compiler: gcc
> C compiler absolute: /usr/bin/gcc
> C compiler family name: GNU
> C compiler version: 4.6.3
> C++ compiler: g++
> C++ compiler absolute: /usr/bin/g++
> Fortran77 compiler: none
> Fortran77 compiler abs: none
> Fortran90 compiler: none
> Fortran90 compiler abs: none
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: no
> Fortran90 profiling: no
> C++ exceptions: no
> Thread support: posix (MPI_THREAD_MULTIPLE: no, progress: no)
> Sparse Groups: no
> Internal debug support: yes
> MPI interface warnings: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> Heterogeneous support: no
> mpirun default --prefix: no
> MPI I/O support: yes
> MPI_WTIME support: gettimeofday
> Symbol vis. support: yes
> Host topology support: yes
> MPI extensions: affinity example
> FT Checkpoint support: no (checkpoint thread: no)
> VampirTrace support: yes
> MPI_MAX_PROCESSOR_NAME: 256
> MPI_MAX_ERROR_STRING: 256
> MPI_MAX_OBJECT_NAME: 64
> MPI_MAX_INFO_KEY: 36
> MPI_MAX_INFO_VAL: 256
> MPI_MAX_PORT_NAME: 1024
> MPI_MAX_DATAREP_STRING: 128
> MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.6.5)
> MCA memory: linux (MCA v2.0, API v2.0, Component v1.6.5)
> MCA paffinity: hwloc (MCA v2.0, API v2.0, Component v1.6.5)
> MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.6.5)
> MCA carto: file (MCA v2.0, API v2.0, Component v1.6.5)
> MCA shmem: mmap (MCA v2.0, API v2.0, Component v1.6.5)
> MCA shmem: posix (MCA v2.0, API v2.0, Component v1.6.5)
> MCA shmem: sysv (MCA v2.0, API v2.0, Component v1.6.5)
> MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.6.5)
> MCA maffinity: hwloc (MCA v2.0, API v2.0, Component v1.6.5)
> MCA timer: linux (MCA v2.0, API v2.0, Component v1.6.5)
> MCA installdirs: env (MCA v2.0, API v2.0, Component v1.6.5)
> MCA installdirs: config (MCA v2.0, API v2.0, Component v1.6.5)
> MCA sysinfo: linux (MCA v2.0, API v2.0, Component v1.6.5)
> MCA hwloc: hwloc132 (MCA v2.0, API v2.0, Component v1.6.5)
> MCA dpm: orte (MCA v2.0, API v2.0, Component v1.6.5)
> MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.6.5)
> MCA allocator: basic (MCA v2.0, API v2.0, Component v1.6.5)
> MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.6.5)
> MCA coll: basic (MCA v2.0, API v2.0, Component v1.6.5)
> MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.6.5)
> MCA coll: inter (MCA v2.0, API v2.0, Component v1.6.5)
> MCA coll: self (MCA v2.0, API v2.0, Component v1.6.5)
> MCA coll: sm (MCA v2.0, API v2.0, Component v1.6.5)
> MCA coll: sync (MCA v2.0, API v2.0, Component v1.6.5)
> MCA coll: tuned (MCA v2.0, API v2.0, Component v1.6.5)
> MCA io: romio (MCA v2.0, API v2.0, Component v1.6.5)
> MCA mpool: fake (MCA v2.0, API v2.0, Component v1.6.5)
> MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.6.5)
> MCA mpool: sm (MCA v2.0, API v2.0, Component v1.6.5)
> MCA pml: bfo (MCA v2.0, API v2.0, Component v1.6.5)
> MCA pml: csum (MCA v2.0, API v2.0, Component v1.6.5)
> MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.6.5)
> MCA pml: v (MCA v2.0, API v2.0, Component v1.6.5)
> MCA bml: r2 (MCA v2.0, API v2.0, Component v1.6.5)
> MCA rcache: vma (MCA v2.0, API v2.0, Component v1.6.5)
> MCA btl: self (MCA v2.0, API v2.0, Component v1.6.5)
> MCA btl: sm (MCA v2.0, API v2.0, Component v1.6.5)
> MCA btl: tcp (MCA v2.0, API v2.0, Component v1.6.5)
> MCA topo: unity (MCA v2.0, API v2.0, Component v1.6.5)
> MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.6.5)
> MCA osc: rdma (MCA v2.0, API v2.0, Component v1.6.5)
> MCA iof: hnp (MCA v2.0, API v2.0, Component v1.6.5)
> MCA iof: orted (MCA v2.0, API v2.0, Component v1.6.5)
> MCA iof: tool (MCA v2.0, API v2.0, Component v1.6.5)
> MCA oob: tcp (MCA v2.0, API v2.0, Component v1.6.5)
> MCA odls: default (MCA v2.0, API v2.0, Component v1.6.5)
> MCA ras: cm (MCA v2.0, API v2.0, Component v1.6.5)
> MCA ras: loadleveler (MCA v2.0, API v2.0, Component v1.6.5)
> MCA ras: slurm (MCA v2.0, API v2.0, Component v1.6.5)
> MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.6.5)
> MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.6.5)
> MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.6.5)
> MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.6.5)
> MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.6.5)
> MCA rmaps: topo (MCA v2.0, API v2.0, Component v1.6.5)
> MCA rml: oob (MCA v2.0, API v2.0, Component v1.6.5)
> MCA routed: binomial (MCA v2.0, API v2.0, Component v1.6.5)
> MCA routed: cm (MCA v2.0, API v2.0, Component v1.6.5)
> MCA routed: direct (MCA v2.0, API v2.0, Component v1.6.5)
> MCA routed: linear (MCA v2.0, API v2.0, Component v1.6.5)
> MCA routed: radix (MCA v2.0, API v2.0, Component v1.6.5)
> MCA routed: slave (MCA v2.0, API v2.0, Component v1.6.5)
> MCA plm: rsh (MCA v2.0, API v2.0, Component v1.6.5)
> MCA plm: slurm (MCA v2.0, API v2.0, Component v1.6.5)
> MCA filem: rsh (MCA v2.0, API v2.0, Component v1.6.5)
> MCA errmgr: default (MCA v2.0, API v2.0, Component v1.6.5)
> MCA ess: env (MCA v2.0, API v2.0, Component v1.6.5)
> MCA ess: hnp (MCA v2.0, API v2.0, Component v1.6.5)
> MCA ess: singleton (MCA v2.0, API v2.0, Component v1.6.5)
> MCA ess: slave (MCA v2.0, API v2.0, Component v1.6.5)
> MCA ess: slurm (MCA v2.0, API v2.0, Component v1.6.5)
> MCA ess: slurmd (MCA v2.0, API v2.0, Component v1.6.5)
> MCA ess: tool (MCA v2.0, API v2.0, Component v1.6.5)
> MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.6.5)
> MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.6.5)
> MCA grpcomm: hier (MCA v2.0, API v2.0, Component v1.6.5)
> MCA notifier: command (MCA v2.0, API v1.0, Component v1.6.5)
> MCA notifier: syslog (MCA v2.0, API v1.0, Component v1.6.5)
>
> Regards,
> Karos
>
> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
> Sent: 28 March 2015 22:04
> To: Open MPI Users
> Subject: Re: [OMPI users] Connection problem on Linux cluster
>
> Something is clearly wrong. Most likely, you are not pointing at the OMPI install that you think you are, or you didn't really configure it properly. Check the path by running "which mpirun" and ensure you are executing the one you expected. If so, then run "ompi_info" to see how it was configured and send it to us.
>
>> On Mar 28, 2015, at 1:36 PM, LOTFIFAR F. <foad.lotfi...@durham.ac.uk> wrote:
>>
>> Surprisingly, that is all I get! Nothing else comes after. It is the same for openmpi-1.6.5.
>>
>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
>> Sent: 28 March 2015 20:12
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Connection problem on Linux cluster
>>
>> Did you configure --enable-debug? We aren't seeing any of the debug output, so I suspect not.
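(For reference, a quick way to confirm whether an existing install was built with debug support is to grep the ompi_info output, as sketched below; this assumes the install you want to check is the one found first on PATH.)

    ompi_info | grep -i "debug support"

This prints the "Internal debug support" and "Memory debugging support" lines that appear in the full output quoted above.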
>>> On Mar 28, 2015, at 12:56 PM, LOTFIFAR F. <foad.lotfi...@durham.ac.uk> wrote:
>>>
>>> I have done it, and these are the results:
>>>
>>> ubuntu@fehg-node-0:~$ mpirun -host fehg-node-7 -mca oob_base_verbose 100 -mca state_base_verbose 10 hostname
>>> [fehg-node-0:30034] mca: base: components_open: Looking for oob components
>>> [fehg-node-0:30034] mca: base: components_open: opening oob components
>>> [fehg-node-0:30034] mca: base: components_open: found loaded component tcp
>>> [fehg-node-0:30034] mca: base: components_open: component tcp register function successful
>>> [fehg-node-0:30034] mca: base: components_open: component tcp open function successful
>>> [fehg-node-7:31138] mca: base: components_open: Looking for oob components
>>> [fehg-node-7:31138] mca: base: components_open: opening oob components
>>> [fehg-node-7:31138] mca: base: components_open: found loaded component tcp
>>> [fehg-node-7:31138] mca: base: components_open: component tcp register function successful
>>> [fehg-node-7:31138] mca: base: components_open: component tcp open function successful
>>>
>>> ... and then it freezes.
>>>
>>> Regards
>>>
>>> From: users [users-boun...@open-mpi.org] on behalf of LOTFIFAR F. [foad.lotfi...@durham.ac.uk]
>>> Sent: 28 March 2015 18:49
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] Connection problem on Linux cluster
>>>
>>> fehg_node_1 and fehg-node-7 are the same; it was just a typo.
>>>
>>> Correction: the VM names are fehg-node-0 and fehg-node-7.
>>>
>>> Regards,
>>>
>>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
>>> Sent: 28 March 2015 18:23
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] Connection problem on Linux cluster
>>>
>>> Just to be clear: do you have two physical nodes? Or just one physical node, with two VMs running on it?
>>>
>>>> On Mar 28, 2015, at 10:51 AM, LOTFIFAR F. <foad.lotfi...@durham.ac.uk> wrote:
>>>>
>>>> I have a floating IP for accessing the nodes from outside the cluster, plus internal IP addresses. I tried to run the jobs with both of them (both IP addresses), but it makes no difference.
>>>> I have just installed openmpi 1.6.5 to see how this version behaves. In this case I get nothing and have to press Ctrl+C; no output or error is shown.
>>>>
>>>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
>>>> Sent: 28 March 2015 17:03
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] Connection problem on Linux cluster
>>>>
>>>> You mentioned running this in a VM - is that IP address correct for getting across the VMs?
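(When each VM has both a floating IP and an internal address, it can help to tell Open MPI explicitly which interface to use. A minimal sketch, assuming the internal addresses sit on an interface named eth0; substitute the actual interface name on these VMs:)

    mpirun -host fehg-node-0,fehg-node-7 \
           -mca oob_tcp_if_include eth0 \
           -mca btl_tcp_if_include eth0 \
           hostname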
>>>>> On Mar 28, 2015, at 8:38 AM, LOTFIFAR F. <foad.lotfi...@durham.ac.uk> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am wondering how I can solve this problem.
>>>>>
>>>>> System spec:
>>>>> 1- Linux cluster with two nodes (master and slave) running Ubuntu 12.04 LTS 32-bit.
>>>>> 2- openmpi 1.8.4
>>>>>
>>>>> I do a simple test running on fehg_node_0:
>>>>>
>>>>> $ mpirun -host fehg_node_0,fehg_node_1 hello_world -mca oob_base_verbose 20
>>>>>
>>>>> and I get the following error:
>>>>>
>>>>> A process or daemon was unable to complete a TCP connection
>>>>> to another process:
>>>>>   Local host:  fehg-node-0
>>>>>   Remote host: 10.104.5.40
>>>>> This is usually caused by a firewall on the remote host. Please
>>>>> check that any firewall (e.g., iptables) has been disabled and
>>>>> try again.
>>>>> --------------------------------------------------------------------------
>>>>> --------------------------------------------------------------------------
>>>>> ORTE was unable to reliably start one or more daemons.
>>>>> This usually is caused by:
>>>>>
>>>>> * not finding the required libraries and/or binaries on
>>>>>   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
>>>>>   settings, or configure OMPI with --enable-orterun-prefix-by-default
>>>>>
>>>>> * lack of authority to execute on one or more specified nodes.
>>>>>   Please verify your allocation and authorities.
>>>>>
>>>>> * the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
>>>>>   Please check with your sys admin to determine the correct location to use.
>>>>>
>>>>> * compilation of the orted with dynamic libraries when static are required
>>>>>   (e.g., on Cray). Please check your configure cmd line and consider using
>>>>>   one of the contrib/platform definitions for your system type.
>>>>>
>>>>> * an inability to create a connection back to mpirun due to a
>>>>>   lack of common network interfaces and/or no route found between
>>>>>   them. Please check network connectivity (including firewalls
>>>>>   and network routing requirements).
>>>>>
>>>>> Further details:
>>>>> 1- I have full access to the VMs on the cluster and set up everything myself.
>>>>> 2- The firewall and iptables are disabled on both nodes (see the quick check sketched below).
>>>>> 3- The nodes can ssh to each other with no problem.
>>>>> 4- Non-interactive bash calls work fine, i.e. when I run "ssh othernode env | grep PATH" from both nodes, both PATH and LD_LIBRARY_PATH are set correctly.
>>>>> 5- I have checked the list archives; a similar problem was reported for Solaris, but I could not find a clue about mine.
>>>>> 6- Configuring with --enable-orterun-prefix-by-default does not make any difference.
>>>>> 7- I can see the ORTE daemon running on the other node when I check the processes, but nothing happens after that and the error appears.
>>>>>
>>>>> Regards,
>>>>> Karos
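(A minimal sketch of how items 2 and 4 in the list above can be double-checked from either node; "othernode" is a placeholder for the peer's hostname, e.g. fehg-node-7:)

    # confirm that no filtering rules are active on this node
    sudo iptables -L -n
    sudo ufw status
    # confirm PATH and LD_LIBRARY_PATH are set for non-interactive shells on the peer
    ssh othernode env | grep PATH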