I resolved the name resolution issue and re-ran it but it still hangs at
the send-receive calls.
I ran it using:
/usr/local/bin/mpirun --mca btl_tcp_port_min_v4 36900 -mca
btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca
OMPI_mca_mpi_preconnect_all 1 -np 2 -hetero -H localhost,10.11.14.2
Greg Fischer wrote:
(I apologize in advance for the simplistic/newbie
question.)
I'm performing an ALLREDUCE operation on a multi-dimensional array.
This operation is the biggest bottleneck in the code, and I'm wondering
if there's a way to do it more efficiently than what I'm doing now.
H
(I apologize in advance for the simplistic/newbie question.)
I'm performing an ALLREDUCE operation on a multi-dimensional array. This
operation is the biggest bottleneck in the code, and I'm wondering if
there's a way to do it more efficiently than what I'm doing now. Here's a
representative exa
I tried pinging by hostname, but it fails. Could that be the issue?
Both machines are on the same subnet. Let me check whether I can
resolve this.
> Hmm,
>
> On another angle, could this be a name resolution issue? Perhaps
> apex-backpack
> isn't able to resolve fuji.local
Jonathan Dursi wrote:
So to summarize:
OpenMPI 1.3.2 + gcc4.4.0
Test problem with periodic (left neighbour of proc 0 is proc N-1)
Sendrecv()s:
Default always hangs in Sendrecv after random number of iterations
Turning off sm (-mca btl self,tcp) not observed to hang
Using -mca btl_sm_n
I can do that. Please send me the URL; I can rebuild OMPI and see what the
output looks like.
> Ok, I think we're outta options here :-( -- our debugging output is
> not sufficient to tell us what is going wrong. If I make a mercurial
> repo with some extra debugging output in it, can you check i
Hmm,
On another angle, could this be a name resolution issue? Perhaps apex-backpack
isn't able to resolve fuji.local and vice versa. Can you ping between the two of
them using their hostnames rather than their IPs?
-Joshua Bernstein
Senior Software Engineer
Penguin Computing
Pallab Datta wr
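The hostname check suggested above can also be scripted. The following is a minimal sketch, not part of the original thread: the helper name `resolve` is hypothetical, and it only asks the local resolver (hosts file, DNS, and so on) for IPv4 addresses, which is roughly what `ping <hostname>` needs before it can send anything.

```python
import socket

def resolve(hostname):
    """Return the sorted IPv4 addresses a hostname resolves to, or [] if lookup fails.

    Hypothetical helper for illustration; it uses only the standard
    library resolver and does not contact the remote host itself.
    """
    try:
        infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    except socket.gaierror:
        # The resolver could not map the name to an address.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address string. Deduplicate across socket types.
    return sorted({info[4][0] for info in infos})
```

Running `resolve("fuji.local")` on apex-backpack and `resolve("apex-backpack")` on the Mac would show whether each side can look up the other; an empty list on either side points at name resolution rather than at the MPI layer.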
Ok, I think we're outta options here :-( -- our debugging output is
not sufficient to tell us what is going wrong. If I make a mercurial
repo with some extra debugging output in it, can you check it out and
build it? That way we can run it and add relevant printf's in the
right places to
Yes, it came up when I turned on verbose mode, i.e. the debug output.
Yes, I knew it's privileged, so that's why I explicitly asked it to connect to
a higher port, but it still blocks there. :(
> On Sep 24, 2009, at 12:54 PM, Pallab Datta wrote:
>
>> Yes I had tried that initially it (apex-backpack) was
On Sep 24, 2009, at 12:54 PM, Pallab Datta wrote:
Yes, I had tried that. Initially it (apex-backpack) was trying to
connect to the Mac (10.11.14.203) on port 4, which is too low. So
that's why I made the port range higher.
Port 4? OMPI should never connect at port 4; it's privileged. W
Yes, I had tried that. Initially it (apex-backpack) was trying to connect
to the Mac (10.11.14.203) on port 4, which is too low. So that's why I
made the port range higher.
> Have you tried running without limiting the port range?
>
> On Sep 24, 2009, at 12:39 PM, Pallab Datta wrote:
>
>> Hi Al
Have you tried running without limiting the port range?
On Sep 24, 2009, at 12:39 PM, Pallab Datta wrote:
Hi All,
Yes, I can ping and ssh from apex-backpack to my Mac (fuji.local).
I set the wireless broadcast address to be the same on both ends
(10.11.14.255), but the problem still persists.
I
Hi All,
Yes, I can ping and ssh from apex-backpack to my Mac (fuji.local).
I set the wireless broadcast address to be the same on both ends
(10.11.14.255), but the problem still persists.
I have tried other wireless adapters as well, but no luck so far.
Please let me know what can be done.
regar
(putting this back on the list where others can reply as well, and if
we solve it, the solution will be google-ized)
According to your debug output:
[apex-backpack:31956] btl: tcp: attempting to connect() to address
10.11.14.203 on port 9360
It *is* trying to connect to the right IP address
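To test independently of MPI whether a plain TCP connection to that address and port can be opened at all, something like the sketch below may help. It is an illustration only: the helper name `can_connect` and the 3-second timeout are assumptions, and a successful result only means that something was listening on that port at the moment of the test.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; it performs one blocking
    connect() with a timeout and closes the socket immediately.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `can_connect("10.11.14.203", 9360)` run from apex-backpack while the remote process is still waiting would indicate whether a firewall or routing problem is blocking the connection before Open MPI is even involved.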