[OMPI users] OPENMPI_ORTE_LOG_ERROR

2013-01-22 Thread Ada Mancuso
Hi,
I'm trying to run my MPI program with Open MPI 1.7 rc5 on 4 machines using
the command:
mpirun -np4 -hostfile file a.out
but I get the following error messages:
ORTE_ERROR_LOG: A message is attempting to be sent to a process whose
contact information is unknown in file
../../../../../ompi/orte/mca/rml/oob/rml_oob_send.c
attempted to send to [[21341,0],2]: tag 15
ORTE_ERROR_LOG: A message is attempting to be sent to a process whose
contact information is unknown in file
../../../../ompi/orte/mca/grpcomm/base/grpcomm_base_xcast.c
The /etc/hosts file consists of "ipaddress hostname" entries. I have
exchanged ssh keys among the machines, and ssh login works without asking
for a password. Surprisingly, if I run my program with at most 2 hosts (so
the hosts file contains only two hosts), it works, but if I run it with
more than two hosts I get this error. MPI works well on each machine, and I
also tried running my program with different pairs of machines to make sure
that no single machine is the problem.
Can you help me, please?
Ada
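
For reference, a minimal sketch of the setup described in this message. Several details are assumptions rather than facts from the thread: the host names node1 to node4 are hypothetical, one slot per host is assumed, and the process count is written as "-np 4" with a space, which is how the mpirun documentation spells it (the command above has "-np4"). The script only writes a hostfile and prints the launch command; it does not start anything.

```shell
#!/bin/sh
# Write an Open MPI hostfile with one "host slots=1" entry per line.
# $1 = output file, remaining arguments = host names (placeholders here).
write_hostfile() {
    out=$1
    shift
    : > "$out"
    for host in "$@"; do
        printf '%s slots=1\n' "$host" >> "$out"
    done
}

# Print (do not run) the launch command.
# $1 = process count, $2 = hostfile, $3 = program.
launch_command() {
    printf 'mpirun -np %s -hostfile %s %s\n' "$1" "$2" "$3"
}

hostfile=$(mktemp)
write_hostfile "$hostfile" node1 node2 node3 node4
cat "$hostfile"
launch_command 4 "$hostfile" ./a.out
```

Printing the command first makes it easy to compare what is about to be launched against the hostfile contents before involving the remote machines.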


Re: [OMPI users] OPENMPI_ORTE_LOG_ERROR

2013-01-22 Thread Ada Mancuso
My problem is that I have to use Open MPI 1.7 rc5 because I'm using the
mpijava Java bindings. Are they present in the latest snapshot you
mentioned? If so, where can I find it?
Thanks a lot
Ada
On 22 Jan 2013 at 21:03, "Ralph Castain" wrote:

> It seems to be working fine for me with the latest 1.7 tarball (not rc5 -
> I didn't test that one). Could be there was a problem that has since been
> fixed. We are getting ready to release an updated rc, so you might want to
> try it (or use the latest nightly 1.7 snapshot).
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] OPENMPI_ORTE_LOG_ERROR

2013-01-22 Thread Ada Mancuso
Thanks a lot, I will try it.
On 22 Jan 2013 at 21:49, "Ralph Castain" wrote:

> Ouch - no, you'd have to take it from the developer's trunk, either via
> svn checkout or the nightly developer's snapshot


Re: [OMPI users] OPENMPI_ORTE_LOG_ERROR

2013-01-23 Thread Ada Mancuso
Hi,
I've installed the latest snapshot from the svn developer's trunk, but I
have the same problem. This is my configuration:

   - Ubuntu, kernel 2.6.38-8
   - OpenSSH 5.8p1, OpenSSL 0.9.8o
   - Libtool 2.4
   - Open MPI 1.7 rc5 and the latest snapshots

Do you think my problem could be related to the operating system or to some
parameter or configuration setting? I've also checked the ssh log file but
didn't find any problem.
Thanks in advance
Ada





Re: [OMPI users] OPENMPI_ORTE_LOG_ERROR

2013-01-23 Thread Ada Mancuso
I'm sure that Open MPI works; moreover, my problem happens only with more
than 2 slaves on different machines, while locally it works fine with any
number of slaves.
Thanks
Ada
On 23 Jan 2013 at 14:04, "Jeff Squyres (jsquyres)" wrote:

> Are you able to run the C examples in the examples/ directory from the
> tarball?
>
> Our README suggests the following:
>
> -
> When verifying a new Open MPI installation, we recommend running three
> tests:
>
> 1. Use "mpirun" to launch a non-MPI program (e.g., hostname or uptime)
>    across multiple nodes.
>
> 2. Use "mpirun" to launch a trivial MPI program that does no MPI
>    communication (e.g., the hello_c program in the examples/ directory
>    in the Open MPI distribution).
>
> 3. Use "mpirun" to launch a trivial MPI program that sends and
>    receives a few MPI messages (e.g., the ring_c program in the
>    examples/ directory in the Open MPI distribution).
>
> If you can run all three of these tests successfully, that is a good
> indication that Open MPI built and installed properly.
> -
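
The three checks quoted above could be scripted roughly as follows. This is a sketch and not part of the README: the hostfile name "file" and the process count 4 are taken from the original command in this thread, hello_c and ring_c are assumed to be already built in an examples/ directory, and the function prints the commands instead of executing them.

```shell
#!/bin/sh
# Print the three verification commands described in the README excerpt.
# $1 = process count, $2 = hostfile, $3 = directory with the built examples.
verification_commands() {
    np=$1
    hostfile=$2
    examples=$3
    # 1. A non-MPI program across the nodes.
    printf 'mpirun -np %s -hostfile %s hostname\n' "$np" "$hostfile"
    # 2. A trivial MPI program that does no MPI communication.
    printf 'mpirun -np %s -hostfile %s %s/hello_c\n' "$np" "$hostfile" "$examples"
    # 3. A trivial MPI program that sends and receives a few messages.
    printf 'mpirun -np %s -hostfile %s %s/ring_c\n' "$np" "$hostfile" "$examples"
}

verification_commands 4 file ./examples
```

Running the printed commands in order shows which layer fails first: the launch itself, MPI start-up, or actual message passing.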


Re: [OMPI users] OPENMPI_ORTE_LOG_ERROR

2013-01-23 Thread Ada Mancuso
Yes, I can, but only with at most two machines as slaves and one machine as
master. If I try to add another slave, I get those errors.
On 23 Jan 2013 at 14:38, "Jeff Squyres (jsquyres)" wrote:

> I'm not sure I understand you.  Does Open MPI work across multiple
> machines?  I.e., can you do all three of those steps across multiple
> machines?
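
Since the report is that runs succeed with a small number of remote hosts and fail once one more is added, one way to narrow this down might be to test growing prefixes of the hostfile with a non-MPI program. The sketch below is only an illustration: the host names are hypothetical, the hostfile is assumed to contain no blank or comment lines, and the commands are printed rather than run.

```shell
#!/bin/sh
# For n = 1..N, write the first n hostfile lines to a prefix file and
# print the mpirun command that would test just those hosts.
# Assumes the hostfile has no blank or comment lines.
# $1 = full hostfile, $2 = directory for the generated prefix files.
prefix_runs() {
    full=$1
    dir=$2
    total=$(grep -c . "$full")
    n=1
    while [ "$n" -le "$total" ]; do
        head -n "$n" "$full" > "$dir/hosts.$n"
        printf 'mpirun -np %s -hostfile %s hostname\n' "$n" "$dir/hosts.$n"
        n=$((n + 1))
    done
}

workdir=$(mktemp -d)
# Placeholder host names, one slot each.
printf '%s slots=1\n' node1 node2 node3 node4 > "$workdir/hosts.all"
prefix_runs "$workdir/hosts.all" "$workdir"
```

The first printed command that fails when executed gives the smallest host count at which the error appears, which can then be repeated with the hosts listed in a different order.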