I've been investigating, and there is no firewall in the cluster that could block TCP traffic. With the option --mca plm_base_verbose 30 I get the following output:
[itanium1] /home/otro > mpirun --mca plm_base_verbose 30 --host itanium2 helloworld.out
[itanium1:08311] mca: base: components_open: Looking for plm components
[itanium1:08311] mca: base: components_open: opening plm components
[itanium1:08311] mca: base: components_open: found loaded component rsh
[itanium1:08311] mca: base: components_open: component rsh has no register function
[itanium1:08311] mca: base: components_open: component rsh open function successful
[itanium1:08311] mca: base: components_open: found loaded component slurm
[itanium1:08311] mca: base: components_open: component slurm has no register function
[itanium1:08311] mca: base: components_open: component slurm open function successful
[itanium1:08311] mca:base:select: Auto-selecting plm components
[itanium1:08311] mca:base:select:( plm) Querying component [rsh]
[itanium1:08311] mca:base:select:( plm) Query of component [rsh] set priority to 10
[itanium1:08311] mca:base:select:( plm) Querying component [slurm]
[itanium1:08311] mca:base:select:( plm) Skipping component [slurm]. Query failed to return a module
[itanium1:08311] mca:base:select:( plm) Selected component [rsh]
[itanium1:08311] mca: base: close: component slurm closed
[itanium1:08311] mca: base: close: unloading component slurm
-- It hangs here.

Could this be a Slurm problem? Thanks for any ideas. Below the quoted thread I have sketched the commands I plan to try next, based on the suggestions so far.

On Fri, 19 March 2010, 17:57, Ralph Castain wrote:
> Did you configure OMPI with --enable-debug? You should do this so that
> more diagnostic output is available.
>
> You can also add the following to your cmd line to get more info:
>
> --debug --debug-daemons --leave-session-attached
>
> Something is likely blocking proper launch of the daemons and processes, so
> you aren't getting to the btl's at all.
>
> On Mar 19, 2010, at 9:42 AM, uriz.49...@e.unavarra.es wrote:
>
>> The processes are running on the remote nodes, but they never send a
>> response back to the origin node, and I don't know why.
>> With the option --mca btl_base_verbose 30 I have the same problem; it
>> doesn't show any message.
>>
>> Thanks
>>
>>> On Wed, Mar 17, 2010 at 1:41 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>>> On Mar 17, 2010, at 4:39 AM, <uriz.49...@e.unavarra.es> wrote:
>>>>
>>>>> Hi everyone, I'm a new Open MPI user and I have just installed Open MPI
>>>>> on a 6-node cluster running Scientific Linux. When I run a job locally it
>>>>> works perfectly, but when I try to run it on the remote nodes with the
>>>>> --host option it hangs and prints no message. I think the problem could
>>>>> be with the shared libraries, but I'm not sure. In my opinion the problem
>>>>> is not ssh, because I can access the nodes without a password.
>>>>
>>>> You might want to check that Open MPI processes are actually running on
>>>> the remote nodes -- check with ps whether you see any "orted" or other
>>>> MPI-related processes (e.g., your processes).
>>>>
>>>> Do you have any TCP firewall software running between the nodes? If so,
>>>> you'll need to disable it (at least for Open MPI jobs).
>>>
>>> I also recommend running mpirun with the option --mca btl_base_verbose
>>> 30 to troubleshoot TCP issues.
>>>
>>> In some environments, you need to explicitly tell mpirun which network
>>> interfaces it can use to reach the hosts. Read the following FAQ
>>> section for more information:
>>>
>>> http://www.open-mpi.org/faq/?category=tcp
>>>
>>> Item 7 of the FAQ might be of special interest.
>>>
>>> Regards,
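For anyone following along, this is roughly the command line I intend to try next, following Ralph's suggestion (after reconfiguring with --enable-debug). The host name and program are just the ones from my test above:

    mpirun --debug-daemons --leave-session-attached \
           --mca plm_base_verbose 30 --mca btl_base_verbose 30 \
           --host itanium2 helloworld.out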
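And if the daemons do start (Jeff's ps check) but the job still hangs, I will try pointing Open MPI at a specific interface, as the TCP FAQ suggests. This is only a sketch of what I have in mind; eth0 is a placeholder for whatever interface the nodes actually share, and the if_include parameters are the ones I understand the FAQ to be describing:

    # Check on the remote node whether the Open MPI daemon was launched at all
    ssh itanium2 ps -ef | grep orted

    # Restrict out-of-band and TCP BTL traffic to one known-good interface
    mpirun --mca oob_tcp_if_include eth0 --mca btl_tcp_if_include eth0 \
           --host itanium2 helloworld.out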