Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-10 Thread Steve Kargl
On Fri, Apr 10, 2009 at 06:13:43PM -0400, Jeff Squyres wrote: > On Apr 10, 2009, at 5:30 PM, Steve Kargl wrote: > > >Thanks for looking into this issue. As a side note, FreeBSD 7.1 > >and higher have the cpuset_getaffinity/cpuset_setaffinity system > >calls. I suspect that at some point openmpi c

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-10 Thread Jeff Squyres
On Apr 10, 2009, at 5:30 PM, Steve Kargl wrote: Thanks for looking into this issue. As a side note, FreeBSD 7.1 and higher have the cpuset_getaffinity/cpuset_setaffinity system calls. I suspect that at some point openmpi can have an opal/mca/paffinity/freebsd directory with an appropriate set of

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-10 Thread Jeff Squyres
On Apr 10, 2009, at 6:10 PM, Steve Kargl wrote: > I'll fix. I don't know if it'll make the cut for 1.3.2 or not. I applied your patch to openmpi-1.3.2a1r20942. It built fine, and running my test indicates that it fixes the problem. Excellent. :-) -- Jeff Squyres Cisco Systems

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-10 Thread Steve Kargl
On Fri, Apr 10, 2009 at 05:10:29PM -0400, Jeff Squyres wrote: > On Apr 7, 2009, at 4:25 PM, Mostyn Lewis wrote: > > >Does OpenMPI know about the number of CPUS per node for FreeBSD? > > > > This is exactly the right question: apparently it does not. > > Specifically, it looks like we have a bad

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-10 Thread Jeff Squyres
On Apr 7, 2009, at 4:25 PM, Mostyn Lewis wrote: Does OpenMPI know about the number of CPUS per node for FreeBSD? This is exactly the right question: apparently it does not. Specifically, it looks like we have a bad configure test in the "posix" paffinity component which triggers it to not

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Steve Kargl
On Tue, Apr 07, 2009 at 02:23:45PM -0600, Ralph Castain wrote: > It isn't in a file - unless you specify it, OMPI will set it > automatically based on the number of procs on the node vs. what OMPI > thinks are the number of available processors. The question is: why > does OMPI not correctly

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Mostyn Lewis
Does OpenMPI know about the number of CPUS per node for FreeBSD? DM On Tue, 7 Apr 2009, Ralph Castain wrote: I would really suggest looking at George's note first as I think you are chasing your tail here. It sounds like the most likely problem is that OMPI thinks you are oversubscribed and i

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Ralph Castain
It isn't in a file - unless you specify it, OMPI will set it automatically based on the number of procs on the node vs. what OMPI thinks are the number of available processors. The question is: why does OMPI not correctly know the number of processors on your machine? I don't remember now,

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Steve Kargl
On Tue, Apr 07, 2009 at 01:40:13PM -0600, Ralph Castain wrote: > I would really suggest looking at George's note first as I think you > are chasing your tail here. It sounds like the most likely problem is > that OMPI thinks you are oversubscribed and is setting sched_yield > accordingly. whi

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Ralph Castain
I would really suggest looking at George's note first as I think you are chasing your tail here. It sounds like the most likely problem is that OMPI thinks you are oversubscribed and is setting sched_yield accordingly. which would fully account for these diffs. Note that the methods for set

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Steve Kargl
On Tue, Apr 07, 2009 at 03:18:31PM -0400, George Bosilca wrote: > Steve, > > I spotted a strange value for the mpi_yield_when_idle MCA parameter. 1 > means your processor is oversubscribed, and this triggers a call to > sched_yield after each check on the SM. Are you running the job > oversub

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Steve Kargl
On Tue, Apr 07, 2009 at 12:00:55PM -0700, Mostyn Lewis wrote: > Steve, > > Did you rebuild 1.2.9? As I see you have static libraries, maybe there's > a lurking pthread or something else that may have changed over time? > > DM Yes. I downloaded 1.2.9, 1.3, and 1.3.1, all within minutes of each

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Ethan Mallove
Hi Steve, I see improvements in 1.3.1 as compared to 1.2.9 in NetPIPE results. The Open MPI installations below were compiled with the same compiler and configure options, and run on the same cluster with the same MCA parameters. (Note, ClusterTools 8.2 is essentially 1.3.1r20828.) http://www

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread George Bosilca
Steve, I spotted a strange value for the mpi_yield_when_idle MCA parameter. 1 means your processor is oversubscribed, and this triggers a call to sched_yield after each check on the SM. Are you running the job oversubscribed? If not, it looks like somehow we don't correctly identify that th

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Mostyn Lewis
Steve, Did you rebuild 1.2.9? As I see you have static libraries, maybe there's a lurking pthread or something else that may have changed over time? DM On Tue, 7 Apr 2009, Steve Kargl wrote: On Tue, Apr 07, 2009 at 09:10:21AM -0700, Eugene Loh wrote: Steve Kargl wrote: I can rebuild 1.2.9

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Ralph Castain
[node20.cimu.org:90002] btl_sm_bandwidth=900 (default value) [node20.cimu.org:90002] btl_sm_latency=100 (default value) All these params do is influence the selection logic for deciding which BTL to use to send the data. Since you directed OMPI to only use sm, they are irrelevant. On Apr
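For reference, MCA parameters like these can be inspected and overridden from the command line. An illustrative 1.3-era invocation (a sketch; exact parameter lists and output vary by Open MPI version):

```shell
# List the sm BTL's parameters and their current values
ompi_info --param btl sm

# Force the sm BTL and disable yielding regardless of what OMPI detects
mpirun --mca btl sm,self --mca mpi_yield_when_idle 0 -np 2 ./a.out
```

As Ralph notes, though, the bandwidth/latency parameters only steer BTL selection; with sm forced, they change nothing.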

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Steve Kargl
On Tue, Apr 07, 2009 at 09:10:21AM -0700, Eugene Loh wrote: > Steve Kargl wrote: > > >I can rebuild 1.2.9 and 1.3.1. Is there any particular configure > >options that I should enable/disable? > > I hope someone else will chime in here, because I'm somewhat out of > ideas. All I'm saying is tha

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Steve Kargl
On Tue, Apr 07, 2009 at 08:39:20AM -0700, Eugene Loh wrote: > Iain Bason wrote: > > >But maybe Steve should try 1.3.2 instead? Does that have your > >improvements in it? > > 1.3.2 has the single-queue implementation and automatic sizing of the sm > mmap file, both intended to fix problems at

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Steve Kargl
On Tue, Apr 07, 2009 at 08:00:39AM -0700, Eugene Loh wrote: > Iain Bason wrote: > > >There are a bunch of changes in the shared memory module between 1.2.9 > >and 1.3.1. One significant change is the introduction of the "sendi" > >internal interface. I believe George Bosilca did the initial >

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Peter Kjellstrom
On Tuesday 07 April 2009, Eugene Loh wrote: > Iain Bason wrote: > > But maybe Steve should try 1.3.2 instead? Does that have your > > improvements in it? > > 1.3.2 has the single-queue implementation and automatic sizing of the sm > mmap file, both intended to fix problems at large np. At np=2, y

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Eugene Loh
Steve Kargl wrote: I can rebuild 1.2.9 and 1.3.1. Is there any particular configure options that I should enable/disable? I hope someone else will chime in here, because I'm somewhat out of ideas. All I'm saying is that 10-usec latencies on sm with 1.3.0 or 1.3.1 are out of line with what o

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Eugene Loh
Iain Bason wrote: But maybe Steve should try 1.3.2 instead? Does that have your improvements in it? 1.3.2 has the single-queue implementation and automatic sizing of the sm mmap file, both intended to fix problems at large np. At np=2, you shouldn't expect to see much difference. And th

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Iain Bason
On Apr 7, 2009, at 11:00 AM, Eugene Loh wrote: Iain Bason wrote: There are a bunch of changes in the shared memory module between 1.2.9 and 1.3.1. One significant change is the introduction of the "sendi" internal interface. I believe George Bosilca did the initial implementation. Thi

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Eugene Loh
Iain Bason wrote: There are a bunch of changes in the shared memory module between 1.2.9 and 1.3.1. One significant change is the introduction of the "sendi" internal interface. I believe George Bosilca did the initial implementation. This is just a wild guess, but maybe there is somethin

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-07 Thread Iain Bason
There are a bunch of changes in the shared memory module between 1.2.9 and 1.3.1. One significant change is the introduction of the "sendi" internal interface. I believe George Bosilca did the initial implementation. This is just a wild guess, but maybe there is something about sendi that i

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-06 Thread Steve Kargl
On Mon, Apr 06, 2009 at 02:04:16PM -0700, Eugene Loh wrote: > Steve Kargl wrote: > > >I recently upgraded OpenMPI from 1.2.9 to 1.3 and then 1.3.1. > >One of my colleagues reported a dramatic drop in performance > >with one of his applications. My investigation shows a factor > >of 10 drop in com

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-06 Thread Eugene Loh
Steve Kargl wrote: I recently upgraded OpenMPI from 1.2.9 to 1.3 and then 1.3.1. One of my colleagues reported a dramatic drop in performance with one of his applications. My investigation shows a factor of 10 drop in communication over the memory bus. I've placed a figure that illustrates the