Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-10 Thread Carsten Kutzner
Hi Graham, thanks for fixing it so fast! I have attached a 128 CPU (= 32 nodes * 4 CPUs) slog file that tests the Open MPI tuned all-to-all for a message size of 4096 floats (16384 bytes), where the execution times vary between 0.12 and 0.43 seconds. Summary (25-run average, timer resolution 0.01) …
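For reference, the kind of timing loop such a summary implies — repeated MPI_Alltoall calls with per-run minimum, maximum and average — can be sketched as below. This is a minimal harness under assumed buffer handling, not Carsten's actual benchmark; only the 4096-float message size and the 25-run count come from the message above.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NFLOATS 4096   /* 16384 bytes per message, as in the test above */
    #define NRUNS     25   /* the 25-run average quoted in the summary */

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        float *sbuf = malloc((size_t)NFLOATS * nprocs * sizeof(float));
        float *rbuf = malloc((size_t)NFLOATS * nprocs * sizeof(float));

        double tmin = 1e30, tmax = 0.0, tsum = 0.0;
        for (int run = 0; run < NRUNS; run++) {
            MPI_Barrier(MPI_COMM_WORLD);           /* start all ranks together */
            double t0 = MPI_Wtime();
            MPI_Alltoall(sbuf, NFLOATS, MPI_FLOAT,
                         rbuf, NFLOATS, MPI_FLOAT, MPI_COMM_WORLD);
            double dt = MPI_Wtime() - t0;
            tsum += dt;
            if (dt < tmin) tmin = dt;
            if (dt > tmax) tmax = dt;
        }
        if (rank == 0)
            printf("min %.3f  max %.3f  avg %.3f s\n",
                   tmin, tmax, tsum / NRUNS);

        free(sbuf); free(rbuf);
        MPI_Finalize();
        return 0;
    }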

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-07 Thread Graham E Fagg
Hi Carsten, oops, sorry! There was a memory bug created by me misusing one of my own collective topo functions, which I think was corrupting the MPE logging buffers (and who knows what else). Anyway, it should be fixed in the next nightly build/tarball. G. On Fri, 6 Jan 2006, Carsten Kutzner wrote: …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-06 Thread Carsten Kutzner
On Fri, 6 Jan 2006, Graham E Fagg wrote: > > Looks like the problem is somewhere in the tuned collectives? > > Unfortunately I need a logfile with exactly those :( > > > > Carsten > > I hope not. Carsten, can you send me your configure line (not the whole > log) and any other things you set in your .mca conf file …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-06 Thread Jeff Squyres
On Jan 6, 2006, at 8:13 AM, Carsten Kutzner wrote: Looks like the problem is somewhere in the tuned collectives? Unfortunately I need a logfile with exactly those :( FWIW, we just activated these tuned collectives on the trunk (which will eventually become the 1.1.x series; the tuned collectives …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-06 Thread Graham E Fagg
Looks like the problem is somewhere in the tuned collectives? Unfortunately I need a logfile with exactly those :( Carsten I hope not. Carsten, can you send me your configure line (not the whole log) and any other things you set in your .mca conf file. Is this with the changed (custom) decision …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-06 Thread Carsten Kutzner
On Wed, 4 Jan 2006, Jeff Squyres wrote: > On Jan 4, 2006, at 2:08 PM, Anthony Chan wrote: > > >> Either my program quits without writing the logfile (and without > >> complaining) or it crashes in MPI_Finalize. I get the message > >> "33 additional processes aborted (not shown)". > > > > This is not an MPE error message …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-04 Thread Jeff Squyres
On Jan 4, 2006, at 2:08 PM, Anthony Chan wrote: Either my program quits without writing the logfile (and without complaining) or it crashes in MPI_Finalize. I get the message "33 additional processes aborted (not shown)". This is not an MPE error message. If the logging crashes in MPI_Finalize …
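For readers following the MPE side of the thread: with manual instrumentation, the logfile is only written by MPE_Finish_log, which runs just before (or, with the liblmpe wrappers, inside) MPI_Finalize — so memory corruption earlier in the run can surface exactly as described, a crash at finalize time with no logfile. A minimal skeleton, assuming MPE2's logging API (mpe.h, linked with -lmpe); the state name and filename are made up for illustration:

    #include <mpi.h>
    #include <mpe.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        MPE_Init_log();

        /* describe one state and bracket the region of interest */
        int ev_start = MPE_Log_get_event_number();
        int ev_end   = MPE_Log_get_event_number();
        MPE_Describe_state(ev_start, ev_end, "alltoall", "red");

        MPE_Log_event(ev_start, 0, NULL);
        /* ... the communication under test goes here ... */
        MPE_Log_event(ev_end, 0, NULL);

        /* the logfile is written here; if earlier communication has
           corrupted memory, this call or the following MPI_Finalize is
           where a crash without a logfile would show up */
        MPE_Finish_log("alltoall_test");

        MPI_Finalize();
        return 0;
    }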

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-04 Thread Anthony Chan
On Wed, 4 Jan 2006, Carsten Kutzner wrote: > On Tue, 3 Jan 2006, Anthony Chan wrote: > > > MPE/MPE2 logging (or clog/clog2) does not impose any limitation on the > > number of processes. Could you explain what difficulty or error > > message you encountered when using >32 processes? > > Either my program quits without writing the logfile …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-04 Thread Graham E Fagg
Thanks Carsten, I have started updating my jumpshot so will let you know as soon as I have some ideas on what's going on. G. P.S. I am going offline now for 2 days while travelling. On Wed, 4 Jan 2006, Carsten Kutzner wrote: Hi Graham, here are the all-to-all test results with the modification …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-04 Thread Carsten Kutzner
Hi Graham, here are the all-to-all test results with the modification to the decision routine you suggested yesterday. Now the routine behaves nicely for 128 and 256 float messages on 128 CPUs! For the other sizes one probably wants to keep the original algorithm, since it is faster there. However …
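The "decision routine" being modified here is the piece of the tuned component that picks an all-to-all algorithm from the message size and communicator size. Schematically it is a threshold check like the sketch below — names and cutoffs are illustrative only, not Open MPI's actual code; the 512/1024-byte range on 128 CPUs is taken from the result above (128 and 256 floats):

    #include <stddef.h>

    /* Illustrative size-based algorithm selection, in the spirit of the
     * tuned component's decision functions. Not Open MPI's actual values. */
    enum alltoall_alg { ALG_ORIGINAL, ALG_MODIFIED };

    static enum alltoall_alg
    choose_alltoall(size_t msg_bytes, int comm_size)
    {
        /* 128 and 256 floats = 512 and 1024 bytes: the range where the
         * modified routine won on 128 CPUs in the results above */
        if (comm_size >= 128 && msg_bytes >= 512 && msg_bytes <= 1024)
            return ALG_MODIFIED;
        return ALG_ORIGINAL;   /* keep the original algorithm elsewhere */
    }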

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-04 Thread Carsten Kutzner
On Tue, 3 Jan 2006, Anthony Chan wrote: > MPE/MPE2 logging (or clog/clog2) does not impose any limitation on the > number of processes. Could you explain what difficulty or error > message you encountered when using >32 processes? Either my program quits without writing the logfile (and without complaining) or it crashes in MPI_Finalize …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-04 Thread Carsten Kutzner
Hi Peter, We have HP ProCurve 2848 GigE switches here (48 port). The problem is more severe the more nodes (= ports) are involved. It starts to show up at 16 ports for a limited range of message sizes and gets really bad for 32 nodes. The switch has a 96 Gbit/s backplane and should therefore be able to handle all 48 ports at full-duplex wire speed (48 ports x 2 x 1 Gbit/s = 96 Gbit/s) …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-04 Thread Peter Kjellström
Hello Carsten, Have you considered the possibility that this is the effect of a non-optimal Ethernet switch? I don't know how many nodes you need to reproduce it on, or if you even have physical access (and opportunity), but popping in another decent 16-port switch for a test run might be interesting …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-03 Thread Anthony Chan
On Tue, 3 Jan 2006, Carsten Kutzner wrote: > On Tue, 3 Jan 2006, Graham E Fagg wrote: > > > Do you have any tools such as Vampir (or its Intel equivalent) available > > to get a time line graph? (even jumpshot of one of the bad cases such as > > the 128/32 for 256 floats below would help). > > Hi Graham, I have attached an slog file …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-03 Thread Carsten Kutzner
On Tue, 3 Jan 2006, Graham E Fagg wrote: > Do you have any tools such as Vampir (or its Intel equivalent) available > to get a time line graph? (even jumpshot of one of the bad cases such as > the 128/32 for 256 floats below would help). Hi Graham, I have attached an slog file of an all-to-all …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-03 Thread Graham E Fagg
Hello Carsten, happy new year to you too. On Tue, 3 Jan 2006, Carsten Kutzner wrote: Hi Graham, sorry for the long delay, I was on Christmas holidays. I wish a Happy New Year! (Uh, I think the previous email did not arrive in my postbox (?)) But yes, I am resending it after this reply …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2006-01-03 Thread Carsten Kutzner
Hi Graham, sorry for the long delay, I was on Christmas holidays. I wish a Happy New Year! On Fri, 23 Dec 2005, Graham E Fagg wrote: > > > I have also tried the tuned alltoalls and they are really great!! Only for > > very few message sizes in the case of 4 CPUs on a node one of my alltoalls > > performed better …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2005-12-23 Thread Graham E Fagg
Hi Carsten, I have also tried the tuned alltoalls and they are really great!! Only for very few message sizes in the case of 4 CPUs on a node one of my alltoalls performed better. Are these tuned collectives ready to be used for production runs? We are actively testing them on larger systems …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2005-12-23 Thread Carsten Kutzner
On Tue, 20 Dec 2005, George Bosilca wrote: > On Dec 20, 2005, at 3:19 AM, Carsten Kutzner wrote: > > >> I don't see how you deduce that adding barriers increases the > >> congestion? It increases the latency for the all-to-all, but for me > > > > When I do an all-to-all a lot of times, I see that the time for a single all-to-all varies a lot …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2005-12-20 Thread George Bosilca
On Dec 20, 2005, at 3:19 AM, Carsten Kutzner wrote: I don't see how you deduce that adding barriers increases the congestion? It increases the latency for the all-to-all, but for me When I do an all-to-all a lot of times, I see that the time for a single all-to-all varies a lot. My time measurements …
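The barriers under discussion are presumably placed between the phases of an ordered all-to-all (Carsten's scheme appears in the original post at the bottom of this thread, with a sketch there): each barrier adds latency, but it stops fast ranks from starting phase i+1 while slow ranks are still finishing phase i, which would put two senders on one switch port at once. A hedged sketch of that variant, not code from the thread:

    #include <mpi.h>

    /* Phase-synchronized ordered all-to-all: the barrier adds latency to
     * every phase but keeps the phases aligned across ranks. */
    void ordered_alltoall_sync(float *sbuf, float *rbuf, int count,
                               MPI_Comm comm)
    {
        int rank, nprocs;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &nprocs);
        for (int i = 0; i < nprocs; i++) {
            int dst = (rank + i) % nprocs;
            int src = (rank - i + nprocs) % nprocs;
            MPI_Sendrecv(sbuf + dst * count, count, MPI_FLOAT, dst, 0,
                         rbuf + src * count, count, MPI_FLOAT, src, 0,
                         comm, MPI_STATUS_IGNORE);
            MPI_Barrier(comm);  /* resynchronize before the next phase */
        }
    }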

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2005-12-20 Thread Carsten Kutzner
On Mon, 19 Dec 2005, George Bosilca wrote: > Carsten, > > In the Open MPI source code directory there is a collective component > called tuned (ompi/mca/coll/tuned). This component is not enabled by > default right now, but usually it gives better performance than the > basic one. You should give it a try …

Re: [O-MPI users] Performance of all-to-all on Gbit Ethernet

2005-12-19 Thread George Bosilca
Carsten, In the Open MPI source code directory there is a collective component called tuned (ompi/mca/coll/tuned). This component is not enabled by default right now, but usually it gives better performance than the basic one. You should give it a try (go inside and remove the .ompi_ignore …

[O-MPI users] Performance of all-to-all on Gbit Ethernet

2005-12-19 Thread Carsten Kutzner
Hello, I am desperately trying to get better all-to-all performance on Gbit Ethernet (flow control is enabled). I have been playing around with several all-to-all schemes and have been able to reduce congestion by communicating in an ordered fashion. E.g. the simplest scheme looks like for (i=0; i …
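The loop is cut off in the archive; a plausible reconstruction of such an ordered scheme — in phase i every rank sends to (rank + i) % nprocs and receives from (rank - i + nprocs) % nprocs, so each receive port sees exactly one sender per phase — might look like the following. This is a sketch of the idea, not Carsten's actual code:

    #include <mpi.h>

    /* Ordered all-to-all: nprocs phases, each a pairwise exchange chosen
     * so that no two ranks target the same receiver in the same phase. */
    void ordered_alltoall(float *sbuf, float *rbuf, int count, MPI_Comm comm)
    {
        int rank, nprocs;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &nprocs);
        for (int i = 0; i < nprocs; i++) {
            int dst = (rank + i) % nprocs;            /* send target, phase i */
            int src = (rank - i + nprocs) % nprocs;   /* matching source */
            MPI_Sendrecv(sbuf + dst * count, count, MPI_FLOAT, dst, 0,
                         rbuf + src * count, count, MPI_FLOAT, src, 0,
                         comm, MPI_STATUS_IGNORE);
        }
    }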