Hello everyone,

I am trying to run a single program on 32 cores spread across 4
computers (8 cores each) using mpich. For now I am testing on just 2
computers. The program and mpich are installed on both. I have created an
ssh key and can log in to the other computer over ssh without a password.
I have come across 2 problems. First, when I try to connect using
 mpirun -np 3 --host a hostname
(where a is the IP of the computer I am trying to connect to), I receive
the error
 unable to connect from "localhost.localdomain" to "localhost.localdomain"

This indicates that my computer, "localhost.localdomain", is trying to
connect to another "localhost.localdomain". How can I change this so that
it connects from my IP to the other computer's IP?
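
A minimal sketch of how the name resolution could be checked on each
machine, assuming Python 3 is available. The helper name
resolves_to_loopback is hypothetical, and a hostname that maps to a
loopback address is only one possible cause of the error above, not a
confirmed diagnosis.

```python
import ipaddress
import socket


def resolves_to_loopback(name: str) -> bool:
    """Return True if `name` resolves to a loopback address such as 127.0.0.1."""
    address = socket.gethostbyname(name)  # IPv4 lookup via the system resolver
    return ipaddress.ip_address(address).is_loopback


if __name__ == "__main__":
    # Show what this machine's own hostname resolves to.
    name = socket.gethostname()
    print(name, "->", socket.gethostbyname(name))
    if resolves_to_loopback(name):
        print("hostname maps to a loopback address on this machine")
```

Running this on both computers shows whether each one reports the same
name and whether that name points back at the machine itself.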

Second, I tried using a host file instead, following the Hydra process
manager wiki. I created a file named hosts containing just the IP of the
computer I am trying to connect to. When I run the command
mpiexec -f hosts -n 4 ./applic

I get this error
[mpiexec@localhost.localdomain] HYDU_parse_hostfile
(./utils/args/args.c:323): unable to open host file: hosts

along with other errors (unable to parse hostfile, match handler, etc.). I
assume these all stem from mpiexec being unable to read the host file. Is
there a specific place where I should save my hosts file? I have it saved
directly on my Desktop. I have also tried giving the full path to the
file, but I still get the same error.
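
A minimal sketch of how the host file path could be verified before it is
passed to mpiexec -f, assuming Python 3 is available. The helper name
check_hostfile is hypothetical. It only confirms that the file exists and
is readable from the current working directory; it does not validate the
contents.

```python
import os


def check_hostfile(path: str) -> str:
    """Return the absolute path of a readable host file, or raise an error."""
    # Expand "~" and resolve relative paths against the current directory.
    full = os.path.abspath(os.path.expanduser(path))
    if not os.path.isfile(full):
        raise FileNotFoundError(f"no host file at {full} (cwd is {os.getcwd()})")
    if not os.access(full, os.R_OK):
        raise PermissionError(f"host file {full} is not readable")
    return full


if __name__ == "__main__":
    # "hosts" is the file name used in the command above.
    print(check_hostfile("hosts"))
```

If this raises FileNotFoundError when run from the same directory as the
mpiexec command, the relative name hosts does not point at the file on the
Desktop.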

For the first problem, I have read that I need to edit the /etc/hosts
file manually, using sudo, to add the IP of the computer I am trying to
connect to. Thank you in advance.

Sincerely,
Sam
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
