Hi Jeff,
Thanks for the firewall tip. I tried it while allowing all tcp traffic
and got an interesting and perplexing result. Here's what's interesting
(BTW, I got rid of "LogLevel DEBUG3" from .ssh/config on this run):
[tsakai@ip-10-203-21-132 ~]$
[tsakai@ip-10-203-21-132 ~]$ mpirun --app
Hi,
Am 10.02.2011 um 22:03 schrieb Tena Sakai:
> Hi Reuti,
>
> Thanks for suggesting "LogLevel DEBUG3." I did so and the complete
> session is captured in the attached file.
>
> What I did is very similar to what I have done before: verify
> that ssh works and then run the mpirun command. In my somewhat
Your prior mails were about ssh issues, but this one sounds like you might have
firewall issues.
That is, the "orted" command attempts to open a TCP socket back to mpirun for
various command and control reasons. If it is blocked from doing so by a
firewall, Open MPI won't run. In general, you
Hi Reuti,
Thanks for suggesting "LogLevel DEBUG3." I did so and the complete
session is captured in the attached file.
What I did is very similar to what I have done before: verify
that ssh works and then run the mpirun command. In my somewhat lengthy
session log, there are two responses from "LogLevel DEBUG3"
Running the following program on two nodes connected by Infiniband and
OMPI 1.5.1 (openib BTL), one task per node:
#include <stdio.h>  /* the archive stripped the bracketed header names; stdio.h assumed */
#include <mpi.h>

int main(int argc, char** argv) {
    int rank, size, i, *buf1, *buf2;
    MPI_Request* reqs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MP
I forgot to mention that this was tested with 3 or 4 ranks, connected via
TCP.
-- Jeremiah Willcock
On Thu, 10 Feb 2011, Jeremiah Willcock wrote:
Here is a small test case that hits the bug on 1.4.1:
#include <mpi.h>

int arr[1142];
int main(int argc, char** argv) {
    int rank, my_size;
    MPI_Init(&
Here is a small test case that hits the bug on 1.4.1:
#include <mpi.h>

int arr[1142];
int main(int argc, char** argv) {
    int rank, my_size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    my_size = (rank == 1) ? 1142 : 1088;
    MPI_Bcast(arr, my_size, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Finalize();  /* the archived snippet ends at the MPI_Bcast; standard ending restored */
    return 0;
}
Hi,
Am 10.02.2011 um 19:11 schrieb Tena Sakai:
>> your local machine is Linux like, but the execution hosts
>> are Macs? I saw the /Users/tsakai/... in your output.
>
> No, my environment is entirely Linux. The path to my home
> directory on one host (blitzen) has been known as /Users/tsakai,
>
Hi David,
Thank you for your reply.
> just to be sure:
> having the ssh public keys in the other computer's authorized_keys file.
> ssh keys generated without passphrases
Yes, as evidenced by my session dialogue, invoking ssh manually is
not a problem. I cannot use the mpirun command (which I believe
us
Hi Reuti,
> your local machine is Linux like, but the execution hosts
> are Macs? I saw the /Users/tsakai/... in your output.
No, my environment is entirely Linux. The path to my home
directory on one host (blitzen) has been known as /Users/tsakai,
despite being an NFS mount from vixen (which is
Hello everyone,
I have a network of systems connected over a LAN, with each computer running
Ubuntu. Open MPI 1.4.x is installed on one machine and the installation is
mounted on the other nodes through the Network File System (NFS). The source
program and compiled file (a.out) are present in the mounted directory
FYI, I am having trouble finding a small test case that will trigger this
on 1.5; I'm either getting deadlocks or MPI_ERR_TRUNCATE, so it could have
been fixed. What are the triggering rules for different broadcast
algorithms? It could be that only certain sizes or only certain BTLs
trigger it.
Nifty! Yes, I agree that that's a poor error message. It's probably
(unfortunately) being propagated up from the underlying point-to-point system,
where an ERR_IN_STATUS would actually make sense.
I'll file a ticket about this. Thanks for the heads up.
On Feb 9, 2011, at 4:49 PM, Jeremiah Willcock wrote:
I typically see these kinds of errors when there's an Open MPI version mismatch
between the nodes, and/or if there are slightly different flavors of Linux
installed on each node (i.e., you're technically in a heterogeneous situation,
but you're trying to run a single application binary). Can you
Hello
> I've a program that always works fine, but I'm trying it on a new cluster
> and it fails when I execute it on more than one machine.
> I mean, if I execute it alone on each host, everything works fine.
> radic@santacruz:~/gaps/caso3-i1$ mpirun -np 3 ../test parcorto.txt
>
> But when I execute
>
Hi,
your local machine is Linux-like, but the execution hosts are Macs? I saw the
/Users/tsakai/... in your output.
a) executing a command on them is also working, e.g.: ssh
domU-12-31-39-07-35-21 ls
Am 10.02.2011 um 07:08 schrieb Tena Sakai:
> Hi,
>
> I have made a bit of progress(?)...
> I
I don't really know what the problem is. It seems like you're doing things
correctly. I'm almost sure you've done all of the following, but just to be
sure:
having the ssh public keys in the other computer's authorized_keys file.
ssh keys generated without passphrases
On Wed, Feb 9, 2011 at 10:08 PM,
Hi,
I have made a bit of progress(?)...
I made a config file in my .ssh directory on the cloud. It looks like:
# machine A
Host domU-12-31-39-07-35-21.compute-1.internal
    HostName domU-12-31-39-07-35-21
    BatchMode yes
    IdentityFile /home/tsakai/.ssh/tsakai
    ChallengeResponseAuthentication no