Did you remember to set --bind-to-core or --bind-to-socket on the mpirun command line? Without one of them, the processes run unbound and the OS is free to migrate them between cores, which can make a significant difference to performance.
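If you want to double-check whether binding actually took effect, one option is to have each launched process print its own CPU affinity mask. Below is a minimal sketch of that idea, not anything that ships with Open MPI. It assumes Linux (os.sched_getaffinity is not available elsewhere), and the helper name describe_binding and the file name check_binding.py are just placeholders I made up. Note that a cpuset or batch system can also shrink the mask, so "bound" here only means "restricted to fewer CPUs than the node has".

```python
import os


def describe_binding(allowed_cpus, total_cpus):
    """Summarize an affinity mask (hypothetical helper, for illustration only).

    Reports 'unbound' when the mask covers every CPU on the node,
    otherwise lists the CPUs the process is restricted to.
    """
    allowed = sorted(allowed_cpus)
    if len(allowed) >= total_cpus:
        return "unbound (may run on any of %d CPUs)" % total_cpus
    return "bound to CPU(s) %s" % ",".join(str(c) for c in allowed)


if __name__ == "__main__":
    # os.sched_getaffinity is Linux-only; pid 0 means "the calling process".
    if hasattr(os, "sched_getaffinity"):
        mask = os.sched_getaffinity(0)
        print(os.getpid(), describe_binding(mask, os.cpu_count()))
    else:
        print("sched_getaffinity is not available on this platform")
```

Launched in place of the benchmark, e.g. "mpirun -np 2 --bind-to-core python3 check_binding.py", each process should report a restricted CPU list if binding worked, since child processes inherit the affinity mask set by the launcher. With --bind-to-socket I would expect the list to contain all cores of one socket rather than a single core.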

On Jul 9, 2010, at 3:15 AM, Andreas Schäfer wrote:

> Maybe I should add that for tests I ran the benchmarks with two MPI
> processes: for InfiniBand one process per node and for shared memory
> both processes were located on one node.