Our application does not appear to use mpirun at all. But we do have
"orterun", so I just tested it by running

orterun --hostfile <hostfile> hostname

and it prints out this ...

[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file dss/dss_unpack.c at line 90
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file gpr_replica_cmd_processor.c at line 361
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file dss/dss_unpack.c at line 90
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file gpr_replica_cmd_processor.c at line 361
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file dss/dss_unpack.c at line 90
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file gpr_replica_cmd_processor.c at line 361
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file dss/dss_unpack.c at line 90
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file gpr_replica_cmd_processor.c at line 361
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file dss/dss_unpack.c at line 90
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file gpr_replica_cmd_processor.c at line 361
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file dss/dss_unpack.c at line 90
[lynx:21319] [0,0,0] ORTE_ERROR_LOG: Data unpack had inadequate space in file gpr_replica_cmd_processor.c at line 361

and then it just hangs there :(

On Nov 29, 2007 6:07 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> On Nov 29, 2007, at 2:09 AM, Madireddy Samuel Vijaykumar wrote:
>
> > A non-MPI application does run without any issues. Could you elaborate
> > on what you mean by doing mpirun "hostname"? You mean I just do an
> > 'mpirun lynx' in my case?
>
> No, I mean
>
> mpirun --hostfile <your_hostfile> hostname
>
> This should run the "hostname" command on each of your nodes. If
> running "hostname" doesn't work after changing the order, then
> something is very wrong.
> If it *does* work, it implies that something is faulty in the MPI
> startup (which is more complicated than starting up non-MPI
> applications).
>
> > On Nov 28, 2007 9:57 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> >> Well, that's odd.
> >>
> >> What happens if you try to mpirun "hostname" (i.e., a non-MPI
> >> application)? Does it run, or does it hang?
> >>
> >> On Nov 23, 2007, at 6:00 AM, Madireddy Samuel Vijaykumar wrote:
> >>
> >>> I have been using clusters for some tests. My localhost is "lynx",
> >>> and I have "puma" and "tiger", which make up the cluster. All have
> >>> passwordless ssh enabled. Now if I have the following in my
> >>> hostfile (one per line, in this order)
> >>>
> >>> lynx
> >>> puma
> >>> tiger
> >>>
> >>> my tests (run from lynx) run over the cluster without any issues.
> >>>
> >>> But if I move or remove lynx, giving either (one per line, in this
> >>> order)
> >>>
> >>> puma
> >>> lynx
> >>> tiger
> >>>
> >>> or
> >>>
> >>> puma
> >>> tiger
> >>>
> >>> my test (run from lynx) does not get anywhere. It just hangs and
> >>> does not proceed at all. Is this an issue with the way my script
> >>> handles the cluster nodes? Or is there a particular method for the
> >>> hostfile? Thanks.
> >>>
> >>> --
> >>> Sam aka Vijju
> >>> :)~
> >>> Linux: Open, True and Cool
> >>> _______________________________________________
> >>> users mailing list
> >>> us...@open-mpi.org
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>
> >> --
> >> Jeff Squyres
> >> Cisco Systems

--
Sam aka Vijju
:)~
Linux: Open, True and Cool
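
For anyone repeating the "hostname" test from this thread, here is a minimal sketch of one way to tell a hang apart from an ordinary failure and to condense the repeated ORTE_ERROR_LOG lines. It is only an illustration: the helper names probe_launch and orte_errors and the 30-second default timeout are assumptions made up for this example, not part of Open MPI. The only command assumed is the one already shown above (orterun --hostfile <hostfile> hostname).

```python
import subprocess
import sys


def probe_launch(cmd, timeout_s=30.0):
    """Run cmd; return ("ok" | "failed" | "hang", combined stdout+stderr).

    Hypothetical helper: "hang" simply means the command did not exit
    within timeout_s seconds (an arbitrary limit chosen for this sketch).
    """
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # ORTE errors may go to stderr; merge them
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        # Output captured before the kill may be bytes, str, or None.
        out = exc.output or ""
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return "hang", out
    status = "ok" if done.returncode == 0 else "failed"
    return status, done.stdout


def orte_errors(output):
    """Return the distinct ORTE_ERROR_LOG messages, in first-seen order."""
    seen = []
    for line in output.splitlines():
        if "ORTE_ERROR_LOG:" in line:
            msg = line.split("ORTE_ERROR_LOG:", 1)[1].strip()
            if msg not in seen:
                seen.append(msg)
    return seen


# Example use (hostfile path is a placeholder, not taken from the thread):
#   status, output = probe_launch(["orterun", "--hostfile", "myhosts", "hostname"])
#   print(status)
#   for msg in orte_errors(output):
#       print("  ", msg)
```

Running it once per hostfile ordering (lynx first, lynx second, lynx absent) should show whether only some orderings end in "hang". Note that the timeout kills only the launcher process itself; any daemons it already started on remote nodes may need to be cleaned up by hand.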