Re: [OMPI users] Asymmetric performance with nonblocking, multithreaded communications

2011-11-30 Thread Patrik Jonsson
Replying to my own post, I'd like to add some info: After making the master thread put more of a premium on receiving the missing messages, the problem went away. Both tasks now appear to keep up on the messages sent from the other. However, after about a minute and ~1.5e6 messages exchanged, both

Re: [OMPI users] Open MPI and SLURM_CPUS_PER_TASK

2011-11-30 Thread Ralph Castain
Hi Igor As I recall, this eventually traced back to a change in slurm at some point. I believe the latest interpretation is in line with your suggestion. I believe we didn't change it because nobody seemed to care very much, but I have no objection to including it in the next release. Thanks r

Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Ralph Castain
As I said, please do a quick search on the "user" mailing list. There are numerous discussions there about how to do this. Here is another one that dealt with getting thru the Amazon firewall: http://www.open-mpi.org/community/lists/users/2011/02/15646.php On Nov 30, 2011, at 1:58 PM, Jaison P

Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Jaison Paul
Ralph Castain writes: > > This has come up before - I would suggest doing a quick search of "ec2" on our user list. Here is one solution: > On Jun 14, 2011, at 10:50 AM, Barnet Wagman wrote: > I've put together a simple system for running OMPI on EC2 (Amazon's cloud computing service)

Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Jaison Paul
Jeff Squyres writes: > > On Nov 30, 2011, at 6:03 AM, Jaison Paul wrote: > > > Yes, we have set up .ssh file on remote EC2 hosts. Is there anything else that we should be taking care of when > dealing with EC2? > > I have heard that Open MPI's TCP latency on EC2 is horrid. I actual

Re: [OMPI users] Program hangs in mpi_bcast

2011-11-30 Thread Jeff Squyres
Fair enough. Thanks anyway! On Nov 30, 2011, at 3:39 PM, Tom Rosmond wrote: > Jeff, > > I'm afraid trying to produce a reproducer of this problem wouldn't be > worth the effort. It is a legacy code that I wasn't involved in > developing and will soon be discarded, so I can't justify spending t

Re: [OMPI users] Program hangs in mpi_bcast

2011-11-30 Thread Tom Rosmond
Jeff, I'm afraid trying to produce a reproducer of this problem wouldn't be worth the effort. It is a legacy code that I wasn't involved in developing and will soon be discarded, so I can't justify spending time trying to understand its behavior better. The bottom line is that it works correctly

Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Jeff Squyres
On Nov 30, 2011, at 3:02 PM, Jaison Paul wrote: > We are not setting up --mca btl_tcp_if_include / --mca oob_tcp_if_include at > all > at the moment. What will be the best setup to access EC2 hosts over internet > for > --mca btl_tcp_if_include / --mca oob_tcp_if_include? I don't understand --mca

Re: [OMPI users] Program hangs in mpi_bcast

2011-11-30 Thread Jeff Squyres
Yes, but I'd like to see a reproducer that requires setting the sync_barrier_before=5. Your reproducers allowed much higher values, IIRC. I'm curious to know what makes that code require such a low value (i.e., 5)... On Nov 30, 2011, at 1:50 PM, Ralph Castain wrote: > FWIW: we already have a

Re: [OMPI users] Program hangs in mpi_bcast

2011-11-30 Thread Ralph Castain
Oh - and another one at orte/test/mpi/reduce-hang.c On Nov 30, 2011, at 11:50 AM, Ralph Castain wrote: > FWIW: we already have a reproducer from prior work I did chasing this down a > couple of years ago. See orte/test/mpi/bcast_loop.c > > > On Nov 29, 2011, at 9:35 AM, Jeff Squyres wrote: >

Re: [OMPI users] Program hangs in mpi_bcast

2011-11-30 Thread Ralph Castain
FWIW: we already have a reproducer from prior work I did chasing this down a couple of years ago. See orte/test/mpi/bcast_loop.c On Nov 29, 2011, at 9:35 AM, Jeff Squyres wrote: > That's quite weird/surprising that you would need to set it down to *5* -- > that's really low. > > Can you share

[OMPI users] Asymmetric performance with nonblocking, multithreaded communications

2011-11-30 Thread Patrik Jonsson
Hi all, I'm seeing performance issues I don't understand in my multithreaded MPI code, and I was hoping someone could shed some light on this. The code structure is as follows: A computational domain is decomposed into MPI tasks. Each MPI task has a "master thread" that receives messages from the

Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Ralph Castain
This has come up before - I would suggest doing a quick search of "ec2" on our user list. Here is one solution: On Jun 14, 2011, at 10:50 AM, Barnet Wagman wrote: > I've put together a simple system for running OMPI on EC2 (Amazon's cloud > computing service). If you're interested, see > > h

Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Jeff Squyres
On Nov 30, 2011, at 6:03 AM, Jaison Paul wrote: > Yes, we have set up .ssh file on remote EC2 hosts. Is there anything else > that we should be taking care of when dealing with EC2? I have heard that Open MPI's TCP latency on EC2 is horrid. I actually talked with some Amazon / EC2 folks about

Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Jaison Paul
Ralph Castain writes: > > > On Nov 24, 2011, at 2:00 AM, Reuti wrote: > Thanks a lot to Ralph and Reuti. Actually, we are trying to use EC2 nodes as compute nodes and my local PC as the host node. Happy to know that it is OK to use user@somehost.com. We used that but failed. Woul