FWIW: I'm working on a rewrite of our out-of-band comm system (it does the 
wireup that is hanging on your system) that will include a shared memory 
module. Once that is in place, this problem will go away when running on a 
single node (still need sockets for multi-node, of course).
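
For anyone who hits a similar hang, here is a minimal sketch of a quick check, assuming Python 3 is available on the machine. It is not part of Open MPI, and the function name tcp_roundtrip_ok is made up for illustration. It opens a listening TCP socket on a given local address and tries to connect to it, which is roughly what the launched procs need to do to reach mpirun. If the connect fails or times out on the host's own address, a firewall rule is a likely suspect.

```python
import socket


def tcp_roundtrip_ok(host="127.0.0.1", timeout=2.0):
    """Return True if a TCP connect to a listener on `host` completes in time.

    Hypothetical helper for illustration only; not an Open MPI tool.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 lets the OS pick any free port.
        listener.bind((host, 0))
        listener.listen(1)
        port = listener.getsockname()[1]

        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client.settimeout(timeout)
        try:
            # A dropped packet typically shows up here as a timeout.
            client.connect((host, port))
            return True
        except OSError:
            return False
        finally:
            client.close()
    except OSError:
        # Could not even bind or listen on this address.
        return False
    finally:
        listener.close()


if __name__ == "__main__":
    print("loopback:", tcp_roundtrip_ok("127.0.0.1"))
    # Also try the address the hostname resolves to, since traffic there
    # may be filtered differently from loopback traffic.
    try:
        own_addr = socket.gethostbyname(socket.gethostname())
        print(own_addr + ":", tcp_roundtrip_ok(own_addr))
    except OSError:
        print("could not resolve the local hostname")
```

A False result only shows that a plain TCP connection on that address does not complete; it does not say which rule is responsible.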


On Apr 11, 2013, at 6:32 AM, Rodrigo Gómez Vázquez <rodrigo...@gmail.com> wrote:

> You were right, Ralph. I ran a short test with the firewall turned off and MPI 
> ran as predicted. I am taking a look at the firewall rules to figure out how 
> to set them up properly so that they do not interfere with OpenMPI's 
> functionality. I will post the required changes to those settings as soon 
> as I work them out, just in case anyone needs them in the future.
> Thanks again!
> Rodrigo
> 
> On 04/10/2013 10:26 PM, Rodrigo Gómez Vázquez wrote:
>> In fact we should have restrictive firewall settings, as far as I remember. 
>> I will check the rules again tomorrow morning. That's very interesting: I 
>> would expect that kind of problem if I were working with a cluster, but I 
>> hadn't thought that it might also cause problems for communication 
>> within a single machine.
>> 
>> Thanks, Ralph. I'll let you know if this was the actual cause of the 
>> problem.
>> Rodrigo
>> 
>> On 04/10/2013 09:46 PM, Ralph Castain wrote:
>>> Best guess is that there is some issue with getting TCP sockets on the 
>>> system - once the procs are launched, they need to open a TCP socket and 
>>> communicate back to mpirun. If the socket is "stuck" waiting to complete 
>>> the open, things will hang.
>>> 
>>> You might check to ensure there isn't some security setting in place that 
>>> protects sockets - something like iptables, for example.
>>> 
>>> 
>>> On Apr 10, 2013, at 11:57 AM, Rodrigo Gómez Vázquez <rodrigo...@gmail.com> 
>>> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I am having trouble running OpenMPI on a simulation server.
>>>> The system has several processors, all in the same node (more 
>>>> information on the specs is in the attachments).
>>>> The system is quite new (a few months old), and a user reported to me that 
>>>> it was not possible to run simulations on multiple processors in parallel.
>>>> We are using it for CFD simulations with OpenFOAM, which ships with 
>>>> its own copy of OpenMPI, version 1.5.3 (for more details, look inside the 
>>>> "ThirdParty software folder" at this link: 
>>>> http://www.openfoam.org/archive/2.1.1/download/source.php). The OS is 
>>>> Ubuntu 12.04 Server (see uname.out in the attachments).
>>>> He tried to start a simulation in parallel using the following command:
>>>> 
>>>> ~: mpirun -np 4 <solver-with-its-corresponding-parameters>
>>>> 
>>>> As a result, the simulation does not start and there is no error message. 
>>>> It looks like the program is just waiting for something. We can 
>>>> briefly see the 4 processes with their PIDs in the "top" process list, 
>>>> but only for a few tenths of a second, and with 0% CPU use and 0.0% 
>>>> memory use. To get the terminal back we have to kill 
>>>> the process.
>>>> 
>>>> The same happens with the "hello" examples that come with the 
>>>> OpenMPI sources:
>>>> 
>>>> :~$mpicc hello_c.c -o hello
>>>> :~$mpirun -np 4 hello
>>>> ... and here it hangs again.
>>>> 
>>>> I also tried running a simpler command, as recommended for checking the 
>>>> installation:
>>>> 
>>>> :~$mpirun -np 4 hostname
>>>> simserver
>>>> simserver
>>>> simserver
>>>> simserver
>>>> :~$
>>>> 
>>>> This works, and so does "ompi_info".
>>>> 
>>>> Since we use the same OpenFOAM version without problems on several 
>>>> computers running Ubuntu-based distros, I supposed that there must be some 
>>>> kind of hardware incompatibility, but...
>>>> 
>>>> Anyway, I repeated the tests with the OpenMPI version from the Ubuntu 
>>>> repositories (1.4.3) and got the same result.
>>>> 
>>>> It would be wonderful if anyone could give me a hint.
>>>> 
>>>> I am afraid it may turn out to be a complicated issue, so please let me 
>>>> know if any relevant information is missing.
>>>> 
>>>> Thanks in advance, guys
>>>> 
>>>> Rodrigo (Europe, GMT+2:00)
>>>> <openmpi1.4.3_ompi_info.out.bz2><uname.out><cat_-proc-cpuinfo.out.bz2>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
