Dear Gus Correa,

Thank you in advance for your detailed answer.
I was busy checking your steps. But unfortunately still I have the problem.

1) Yes, I have sudo access to server, when I want to run the test only my
two instances are active.

2) There is no problem with running hello program simultaneously on two
instances, but someone told me these programs cannot check some factors.

Instances are pure installation of ubuntu server 12.04, by the way I
disabled "ufw". There are two notes here, openmpi uses ssh and I can
connect with no password from master to slave. And one more odd thing i​s
that​ the order is important in myhosts file, ie, allways the second
machine abort the process, even when I am in the master and master is
second in the file, it reports that master aborted.

3,4) I checked it, actually, I did everything from the first step. Just
installing Atlas and OpenMPI from packages with 64 switch to configure.

5) I used -np 4 with hello, is this sufficient?​

6) Yes, I checked auto-tuning (without input file) too.

One thing that I noticed is that a "vnet" created for each instance on the
main server. I ran these two commands:
mirun -np 2 --hostfile myhosts --mca btl_tcp_if_include eth0,lo hpcc
mirun -np 2 --hostfile myhosts --mca btl_tcp_if_exclude vnet0,vnet1 hpcc

in this case I didn't receive anything, ie, no error nor anything in output
file, I waited for hours but nothing happened. can these vnets cause the
problem?

Really Thank you for your consideration,
Best Regards,
Reza

Reply via email to