FWIW: I'm working on a rewrite of our out-of-band comm system (it does the wireup that is hanging on your system) that will include a shared memory module. Once that is in place, this problem will go away when running on a single node (still need sockets for multi-node, of course).
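The wireup mentioned above can be illustrated with a small sketch (this is not Open MPI code, just a hypothetical stand-in for the handshake): each launched process opens a TCP connection back to a listener, the way MPI processes call back to mpirun at startup. If a firewall drops those loopback packets, the connect blocks and the whole job appears to hang with idle processes, which matches the symptom reported below.

```python
import socket
import threading

# Stand-in for mpirun: listen on an ephemeral loopback TCP port for
# the launched processes to report back ("wireup").
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # ephemeral port, loopback only
listener.listen(4)
port = listener.getsockname()[1]

def worker(rank):
    # Stand-in for one MPI process phoning home to mpirun.
    # If a firewall drops loopback packets, this connect() never
    # completes -- the observed hang with 0% CPU use.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as s:
        s.sendall(b"rank %d ready" % rank)

threads = [threading.Thread(target=worker, args=(r,)) for r in range(4)]
for t in threads:
    t.start()

reports = []
for _ in range(4):
    conn, _addr = listener.accept()
    reports.append(conn.recv(64).decode())
    conn.close()

for t in threads:
    t.join()
listener.close()
print(sorted(reports))
```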
On Apr 11, 2013, at 6:32 AM, Rodrigo Gómez Vázquez <rodrigo...@gmail.com> wrote:

> You were right, Ralph. I ran a short test with the firewall turned off, and MPI ran as expected. I am taking a look at the firewall rules to figure out how to set them up properly so that they do not interfere with Open MPI's functionality. I will post the required changes to those settings as soon as I find them, in case anyone needs them in the future.
> Thanks again!
> Rodrigo
>
> On 04/10/2013 10:26 PM, Rodrigo Gómez Vázquez wrote:
>> In fact, we should have restrictive firewall settings, as far as I remember. I will check the rules again tomorrow morning. That's very interesting; I would have expected this kind of problem if I were working on a cluster, but I had not thought it might also affect communication internal to the machine.
>>
>> Thanks, Ralph. I'll let you know whether this was the actual cause of the problem.
>> Rodrigo
>>
>> On 04/10/2013 09:46 PM, Ralph Castain wrote:
>>> Best guess is that there is some issue with getting TCP sockets on the system - once the procs are launched, they need to open a TCP socket and communicate back to mpirun. If the socket is "stuck" waiting to complete the open, things will hang.
>>>
>>> You might check to ensure there isn't some security setting in place that protects sockets - something like iptables, for example.
>>>
>>> On Apr 10, 2013, at 11:57 AM, Rodrigo Gómez Vázquez <rodrigo...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am having trouble with a simulation server. The system consists of several processors, all in the same node (more information on the specs is in the attachments). The machine is quite new (a few months old), and a user reported to me that it was not possible to run simulations on multiple processors in parallel.
>>>> We use it for CFD simulations with OpenFOAM, which ships with its own build of Open MPI 1.5.3 (for more details you can look inside the "ThirdParty" software folder via this link: http://www.openfoam.org/archive/2.1.1/download/source.php). The OS is an Ubuntu 12.04 Server distro (see uname.out in the attachments).
>>>>
>>>> He tried to start a simulation in parallel using the following command:
>>>>
>>>> ~: mpirun -np 4 <solver-with-its-corresponding-parameters>
>>>>
>>>> The simulation does not start, and there is no error message. It looks like the program is just waiting for something. We briefly see the 4 processes with their PIDs in the "top" process list, but only for a few tenths of a second, with 0% CPU use and 0.0% memory use. To recover the command terminal, we have to kill the process.
>>>>
>>>> The same happens with the "hello" example that comes with the Open MPI sources:
>>>>
>>>> :~$ mpicc hello_c.c -o hello
>>>> :~$ mpirun -np 4 hello
>>>> ... and here it hangs again.
>>>>
>>>> I tried executing other, simpler programs, as recommended for checking the installation:
>>>>
>>>> :~$ mpirun -np 4 hostname
>>>> simserver
>>>> simserver
>>>> simserver
>>>> simserver
>>>> :~$
>>>>
>>>> That works, as does "ompi_info".
>>>>
>>>> Since we use the same OpenFOAM version without problems on several computers running Ubuntu-based distros, I supposed there must be some kind of incompatibility due to the hardware, but...
>>>>
>>>> In any case, I repeated the tests with the Open MPI version from the Ubuntu repositories (1.4.3) and got the same result.
>>>>
>>>> It would be wonderful if anyone could give me a hint. I am afraid it may turn out to be a complicated issue, so please let me know what relevant information is missing.
>>>>
>>>> Thanks in advance, guys
>>>>
>>>> Rodrigo (Europe, GMT+2:00)
>>>> <openmpi1.4.3_ompi_info.out.bz2><uname.out><cat_-proc-cpuinfo.out.bz2>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
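For readers hitting the same symptom: Rodrigo's final firewall rules were not posted in this thread, but on an iptables-based firewall a typical minimal fix is to accept all traffic on the loopback interface, since mpirun and the local MPI processes talk over TCP on loopback. This is only a sketch - rule placement and default policies depend on your existing configuration, so verify it against your own rules:

```shell
# Allow all traffic on the loopback interface (lo); insert at position 1
# of the chains so these rules precede any DROP/REJECT rules.
sudo iptables -I INPUT 1 -i lo -j ACCEPT
sudo iptables -I OUTPUT 1 -o lo -j ACCEPT

# Inspect the result; the loopback ACCEPT rules should be listed first.
sudo iptables -L -n --line-numbers
```

For multi-node runs you would additionally need to allow TCP between the cluster hosts (Open MPI uses dynamic ports by default), but for the single-node hang described here the loopback rules are the relevant part.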