[OMPI users] OPENMPI_ORTE_LOG_ERROR
Hi,

I'm trying to run my MPI program using Open MPI 1.7 rc5 on 4 machines with the command:

mpirun -np4 -hostfile file a.out

but I get the following error messages:

ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file ../../../../../ompi/orte/mca/rml/oob/rml_oob_send.c
attempted to send to [[21341,0],2]: tag 15
ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file ../../../../ompi/orte/mca/grpcomm/base/grpcomm_base_xcast.c

The /etc/hosts file consists of "ipaddress hostname" entries. I have exchanged ssh keys among the machines, and ssh login works without asking for a password.

Surprisingly, if I run my program with at most 2 hosts (so the hostfile contains only two hosts), it works, but with more than two hosts I get this error. MPI works well on each machine individually, and I also tried different pairs of machines to make sure that no single machine is the problem.

Can you help me please?

Ada
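The setup described in this message (a hostfile plus passwordless ssh between the machines) can be sanity-checked with a small script. The following is only a sketch under stated assumptions: the helper names `hostfile_hosts` and `print_ssh_checks` are made up for illustration and are not part of Open MPI, and the hostfile is assumed to list one host per line, optionally followed by fields such as `slots=2`. The script only prints the ssh commands for every ordered pair of hosts; it does not run them and does not by itself diagnose the ORTE error.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Open MPI.

# Print the host names listed in a hostfile, one per line.
# Blank lines and '#' comments are skipped; only the first field is kept,
# so an entry such as "node1 slots=2" yields "node1".
hostfile_hosts() {
    sed -e 's/#.*//' "$1" | awk 'NF { print $1 }'
}

# Print (do not run) one ssh command per ordered pair of distinct hosts.
# BatchMode=yes makes ssh fail instead of prompting for a password.
print_ssh_checks() {
    hosts=$(hostfile_hosts "$1")
    for src in $hosts; do
        for dst in $hosts; do
            [ "$src" = "$dst" ] && continue
            echo "ssh -o BatchMode=yes $src ssh -o BatchMode=yes $dst hostname"
        done
    done
}

# Example (prints the commands for the hostfile named "file"):
#   print_ssh_checks file
```

Each printed command logs in to one host and, from there, logs in to another. If any of them asks for a password or fails to resolve a host name, that pair of machines may be worth a closer look.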
Re: [OMPI users] OPENMPI_ORTE_LOG_ERROR
My problem is that I have to use Open MPI 1.7 rc5 because I'm using the Java binding mpijava... Is it present in the latest snapshot you mentioned? If so, where can I find it?

Thanks a lot
Ada

On 22 Jan 2013 at 21:03, "Ralph Castain" wrote:
> It seems to be working fine for me with the latest 1.7 tarball (not rc5 -
> I didn't test that one). Could be there was a problem that has since been
> fixed. We are getting ready to release an updated rc, so you might want to
> try it (or use the latest nightly 1.7 snapshot).
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] OPENMPI_ORTE_LOG_ERROR
Thanks a lot, I will try it.

On 22 Jan 2013 at 21:49, "Ralph Castain" wrote:
> Ouch - no, you'd have to take it from the developer's trunk, either via
> svn checkout or the nightly developer's snapshot
Re: [OMPI users] OPENMPI_ORTE_LOG_ERROR
Hi,

I've installed the latest snapshot taken from the svn developer's trunk, but I had the same problems. This is my configuration:

- Ubuntu, 2.6.38-8 kernel
- OpenSSH_5.8p1, OpenSSL 0.9.8o
- Libtool version 2.4
- Open MPI 1.7 rc5 and the latest snapshots

Do you think my problem could be related to the operating system or to some parameter or configuration setting? I've also checked the ssh log file but didn't find any problem.

Thanks in advance
Ada
Re: [OMPI users] OPENMPI_ORTE_LOG_ERROR
I'm sure that Open MPI works; moreover, my problem happens only with more than 2 slaves on different machines (locally it works fine with any number of slaves).

Thanks
Ada

On 23 Jan 2013 at 14:04, "Jeff Squyres (jsquyres)" wrote:
> Are you able to run the C examples in the examples/ directory from the
> tarball?
>
> Our README suggests the following:
>
> -
> When verifying a new Open MPI installation, we recommend running three
> tests:
>
> 1. Use "mpirun" to launch a non-MPI program (e.g., hostname or uptime)
>    across multiple nodes.
>
> 2. Use "mpirun" to launch a trivial MPI program that does no MPI
>    communication (e.g., the hello_c program in the examples/ directory
>    in the Open MPI distribution).
>
> 3. Use "mpirun" to launch a trivial MPI program that sends and
>    receives a few MPI messages (e.g., the ring_c program in the
>    examples/ directory in the Open MPI distribution).
>
> If you can run all three of these tests successfully, that is a good
> indication that Open MPI built and installed properly.
> -
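The three README checks quoted in this message could be written out as concrete commands along the following lines. This is a sketch, not an official procedure: `readme_checks` is a hypothetical helper name, the hostfile name `file` and the count of 4 processes are taken from the original post, and `hello_c` and `ring_c` are assumed to have been built already in the `examples/` directory of the Open MPI source tree. The helper only prints the commands.

```shell
#!/bin/sh
# Sketch only: readme_checks is a hypothetical helper, not an Open MPI tool.
# Assumes hello_c and ring_c were already built in the examples/ directory.

# Print the three verification commands for a process count and a hostfile.
readme_checks() {
    np=$1
    hostfile=$2
    # 1. Non-MPI program: checks that processes can be launched on the nodes.
    echo "mpirun -np $np -hostfile $hostfile hostname"
    # 2. MPI program that does no MPI communication.
    echo "mpirun -np $np -hostfile $hostfile ./examples/hello_c"
    # 3. MPI program that sends and receives a few MPI messages.
    echo "mpirun -np $np -hostfile $hostfile ./examples/ring_c"
}

# Print the commands for 4 processes and the hostfile named "file".
readme_checks 4 file
```

Running the printed commands one at a time, first with two hosts in the hostfile and then with three or more, would show which of the three steps is the first to fail.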
Re: [OMPI users] OPENMPI_ORTE_LOG_ERROR
Yes, I can, but only with at most two machines as slaves and one machine as master. If I try to add another slave, I get those errors.

On 23 Jan 2013 at 14:38, "Jeff Squyres (jsquyres)" wrote:
> I'm not sure I understand you. Does Open MPI work across multiple
> machines? I.e., can you do all three of those steps across multiple
> machines?