Replying to my own post, I'd like to add some info:
After making the master thread put more of a premium on receiving the
missing messages, the problem went away. Both tasks now appear to keep
up on the messages sent from the other. However, after about a minute
and ~1.5e6 messages exchanged, both
Hi Igor
As I recall, this eventually traced back to a change in slurm at some point. I
believe the latest interpretation is in line with your suggestion. I believe we
didn't change it because nobody seemed to care very much, but I have no
objection to including it in the next release.
Thanks
r
As I said, please do a quick search on the "user" mailing list. There are
numerous discussions there about how to do this. Here is another one that dealt
with getting thru the Amazon firewall:
http://www.open-mpi.org/community/lists/users/2011/02/15646.php
On Nov 30, 2011, at 1:58 PM, Jaison P
Ralph Castain open-mpi.org> writes:
>
> This has come up before - I would suggest doing a quick search of "ec2" on our
> user list. Here is one solution:
> On Jun 14, 2011, at 10:50 AM, Barnet Wagman wrote:
> > I've put together a simple system for running OMPI on EC2 (Amazon's cloud
> > computing service)
Jeff Squyres cisco.com> writes:
>
> On Nov 30, 2011, at 6:03 AM, Jaison Paul wrote:
>
> > Yes, we have set up .ssh file on remote EC2 hosts. Is there anything else
> > that we should be taking care of when dealing with EC2?
>
> I have heard that Open MPI's TCP latency on EC2 is horrid. I actual
Fair enough. Thanks anyway!
On Nov 30, 2011, at 3:39 PM, Tom Rosmond wrote:
> Jeff,
>
> I'm afraid trying to produce a reproducer of this problem wouldn't be
> worth the effort. It is a legacy code that I wasn't involved in
> developing and will soon be discarded, so I can't justify spending t
Jeff,
I'm afraid trying to produce a reproducer of this problem wouldn't be
worth the effort. It is a legacy code that I wasn't involved in
developing and will soon be discarded, so I can't justify spending time
trying to understand its behavior better. The bottom line is that it
works correctly
On Nov 30, 2011, at 3:02 PM, Jaison Paul wrote:
> We are not setting up --mca btl_tcp_if_include / --mca oob_tcp_if_include at
> all at the moment. What will be the best setup to access EC2 hosts over the
> internet for --mca btl_tcp_if_include / --mca oob_tcp_if_include? I don't
> understand --mca
Yes, but I'd like to see a reproducer that requires setting the
sync_barrier_before=5. Your reproducers allowed much higher values, IIRC.
I'm curious to know what makes that code require such a low value (i.e., 5)...
On Nov 30, 2011, at 1:50 PM, Ralph Castain wrote:
> FWIW: we already have a
Oh - and another one at orte/test/mpi/reduce-hang.c
On Nov 30, 2011, at 11:50 AM, Ralph Castain wrote:
> FWIW: we already have a reproducer from prior work I did chasing this down a
> couple of years ago. See orte/test/mpi/bcast_loop.c
>
>
> On Nov 29, 2011, at 9:35 AM, Jeff Squyres wrote:
>
FWIW: we already have a reproducer from prior work I did chasing this down a
couple of years ago. See orte/test/mpi/bcast_loop.c
On Nov 29, 2011, at 9:35 AM, Jeff Squyres wrote:
> That's quite weird/surprising that you would need to set it down to *5* --
> that's really low.
>
> Can you share
Hi all,
I'm seeing performance issues I don't understand in my multithreaded
MPI code, and I was hoping someone could shed some light on this.
The code structure is as follows: A computational domain is decomposed
into MPI tasks. Each MPI task has a "master thread" that receives
messages from the
This has come up before - I would suggest doing a quick search of "ec2" on our
user list. Here is one solution:
On Jun 14, 2011, at 10:50 AM, Barnet Wagman wrote:
> I've put together a simple system for running OMPI on EC2 (Amazon's cloud
> computing service). If you're interested, see
>
> h
On Nov 30, 2011, at 6:03 AM, Jaison Paul wrote:
> Yes, we have set up .ssh file on remote EC2 hosts. Is there anything else
> that we should be taking care of when dealing with EC2?
I have heard that Open MPI's TCP latency on EC2 is horrid. I actually talked
with some Amazon / EC2 folks about
Ralph Castain open-mpi.org> writes:
>
>
> On Nov 24, 2011, at 2:00 AM, Reuti wrote:
>
Thanks a lot to Ralph and Reuti.
Actually we are trying to use EC2 nodes as compute nodes and my local PC as host
node.
Happy to know that it is OK to use user@somehost.com
We used that but failed. Woul