Hi Brandon

You must have passwordless ssh set up across the machines.
Check that you can ssh without a password, back and forth, between all
node pairs, using the host names or IPs you have in your hosts.txt file.
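
A rough sketch of that check, assuming a host file with one host per line
and the host name or IP in the first column (as in your hosts.txt).  The
function names list_hosts and check_ssh_pairs are made up for illustration.
BatchMode=yes makes ssh fail instead of prompting for a password, so any
pair that still needs a password (or an unconfirmed host key) shows up as
FAIL:

```shell
#!/bin/sh
# Print the first column (host name or IP) of every line that is
# neither blank nor a comment.
list_hosts() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# Try every ordered pair of hosts: log in to "from", then hop to "to".
# BatchMode=yes makes ssh fail rather than prompt for a password.
check_ssh_pairs() {
    hosts=$(list_hosts "$1")
    status=0
    for from in $hosts; do
        for to in $hosts; do
            if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$from" \
                   ssh -o BatchMode=yes -o ConnectTimeout=5 "$to" true
            then
                echo "ok:   $from -> $to"
            else
                echo "FAIL: $from -> $to"
                status=1
            fi
        done
    done
    return $status
}
```

After loading these functions into your shell, "check_ssh_pairs hosts.txt"
should print one line per pair.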

Your /etc/hosts file (which Ubuntu uses to match host names and IPs) must be
consistent (perhaps identical) across the machines.

The same (Open)MPI must be installed on all machines,
or installed on an NFS directory mounted on all machines.

Make sure you use the same MPI to compile (mpicc) and to
run (mpiexec/mpirun).  It is quite common to inadvertently mix up
different flavors/versions, which may come with Linux distributions,
commercial compilers, etc., and which sometimes take precedence on your
$PATH.  When in doubt, use full path names for both mpicc and mpirun.
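
A quick way to see what your shell actually picks up is sketched below.
The function names same_bin_dir and check_mpi_paths are hypothetical, and
comparing directories is only a heuristic: it will not catch two different
MPI packages installed into the same directory.

```shell
#!/bin/sh
# Succeed only if both paths live in the same directory.
same_bin_dir() {
    [ "$(dirname "$1")" = "$(dirname "$2")" ]
}

# Report which mpicc and mpirun are found first on $PATH.
check_mpi_paths() {
    cc=$(command -v mpicc) || { echo "mpicc not found on PATH"; return 1; }
    run=$(command -v mpirun) || { echo "mpirun not found on PATH"; return 1; }
    echo "mpicc:  $cc"
    echo "mpirun: $run"
    if same_bin_dir "$cc" "$run"; then
        echo "both come from the same directory"
    else
        echo "WARNING: mpicc and mpirun come from different directories"
        return 1
    fi
}
```

Run check_mpi_paths on every machine.  It is also worth comparing the
result with what a non-interactive login sees (for example by running
"command -v mpirun" through ssh), since that $PATH may differ from the one
in your terminal.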

It may be easier to run just "hostname" to check functionality:

mpirun -np ${whatever} hostname

If the Ubuntu package doesn't work ...
It is easy to build OpenMPI from source, and choose an installation
directory that doesn't interfere with the system (e.g. under your home
directory).
The README file and the FAQ have clear instructions for that.
It builds fine with gcc/g++/gfortran, if free compilers are your concern.
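
As a rough sketch of what that looks like, assuming you have already
unpacked the source tarball and want to install under a private directory
such as $HOME/openmpi (both function names below are made up for
illustration; the README remains the authority on configure options):

```shell
#!/bin/sh
# Build from an unpacked Open MPI source tree ($1) into a private prefix ($2).
# Runs in a subshell so the caller's working directory is left alone.
build_openmpi() {
    ( cd "$1" && ./configure --prefix="$2" && make all install )
}

# Put that installation first on PATH and LD_LIBRARY_PATH.
use_openmpi() {
    PATH="$1/bin${PATH:+:$PATH}"
    LD_LIBRARY_PATH="$1/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH
}
```

You would call build_openmpi with the source directory and "$HOME/openmpi",
then use_openmpi "$HOME/openmpi".  The same environment has to be in effect
on every machine, including for non-interactive ssh logins, so these
settings usually end up in a shell startup file.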

The OpenMPI FAQ has good suggestions for initial troubleshooting:
http://www.open-mpi.org/faq/

My $0.02
Gus Correa

On Oct 23, 2010, at 10:07 AM, Brandon Fulcher wrote:

> Thank you for the response!
> 
> The code runs on my own machine as well.  Both machines, in fact.  And I did 
> not build MPI but installed the package from the ubuntu repositories.
> 
> The problem occurs when I try to run a job using two machines or simply try 
> to run it on a slave from the master.  
> 
> the actual command I have run along with the output is below:
> 
> mpirun -hostfile hosts.txt ilk
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> 
> where hosts.txt contains:
> 192.168.0.2 cpu=2
> 192.168.0.6 cpu=1
> 
> 
> If it matters the same output is given if I define a remote host in the 
> command such as (if I am on 192.168.0.2)
> mpirun  -host 192.168.0.6 ilk
> 
> Now if I run it locally, the job succeeds.  This works from either cpu.
> mpirun  ilk
> 
> 
> Thanks in advance.
> 
> On Fri, Oct 22, 2010 at 11:59 PM, David Zhang <solarbik...@gmail.com> wrote:
> Since you said you're new to MPI, what command did you use to run the 2 
> processes?
> 
> 
> On Fri, Oct 22, 2010 at 9:58 PM, David Zhang <solarbik...@gmail.com> wrote:
> Your code works on my machine. It could be the way you built MPI.
> 
> On Fri, Oct 22, 2010 at 7:26 PM, Brandon Fulcher <min...@gmail.com> wrote:
> Hi, I am completely new to MPI and am having trouble running a job between 
> two  cpus.
> 
> The same thing happens no matter what MPI job I try to run, but here is a 
> simple 'hello world' style program I am trying to run.
> 
> #include <mpi.h>
> #include <stdio.h>
> #include <unistd.h>  /* gethostname */
> 
> int main(int argc, char **argv)
> {
>   int rank;
>   char hostname[256];
> 
>   MPI_Init(&argc, &argv);
>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>   gethostname(hostname, sizeof(hostname) - 1);
>   hostname[sizeof(hostname) - 1] = '\0';  /* terminate even if truncated */
>   printf("Hello world!  I am process number: %d on host %s\n", rank, hostname);
>   MPI_Finalize();
>   return 0;
> }
> 
> 
> On either CPU, I can successfully compile and run, but when trying to run the 
> program using two CPUS it fails with this output:
> 
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> 
> 
> With no additional information or errors, what can I do to find 
> out what is wrong?
> 
> 
> 
> I have read the FAQ and followed the instructions.  I can ssh into the slave 
> without entering a password and have the libraries installed on both machines.
> 
> The only pertinent thing I could find is this FAQ entry 
> http://www.open-mpi.org/faq/?category=running#missing-prereqs  but I do not 
> know if it applies since I have installed Open MPI from the Ubuntu 
> repositories and assume the libraries are correctly set up. 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 
> -- 
> David Zhang
> University of California, San Diego

