[OMPI users] OpenMPI run with the SGE launcher, orte PE clarification
Hi All,

We have installed OpenMPI on our cluster, and I can see from "ompi_info" that gridengine support (FAQ:22) is there. Now we are creating the PE as mentioned in the FAQ:

% qconf -sp orte
start_proc_args    /bin/true
stop_proc_args     /bin/true
...

I just want to know: has anybody successfully run in SGE using this PE? From my mpich PE I can see the start/stop arguments shown below:

start_proc_args    /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args     /opt/gridengine/mpi/stopmpi.sh

thanks,
-bala-
Re: [OMPI users] OpenMPI run with the SGE launcher, orte PE clarification
On 3/28/07, Bala wrote:
> % qconf -sp orte
> start_proc_args    /bin/true
> stop_proc_args     /bin/true
>
> just want to know anybody successfully running in SGE using this PE??

Hi,

yes, I have a working installation of Open MPI 1.2 with SGE 6.0u9.

> from my mpich PE I can see start/stop arguments as shown below
> start_proc_args    /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
> stop_proc_args     /opt/gridengine/mpi/stopmpi.sh

Take a look at these scripts; they don't do anything fancy, they just prepare the machinefile and the rsh wrapper for mpich. Open MPI does the right thing by itself by looking at SGE's environment variables.

Regards,
Götz Waschk - DESY Zeuthen
--
AL I:40: Do what thou wilt shall be the whole of the Law.
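For reference, a complete "orte" PE along the lines of the Open MPI FAQ might look like the sketch below; the slot count, user lists and allocation rule are only illustrative, the parts that matter for the tight integration are start/stop_proc_args pointing at /bin/true, control_slaves TRUE and job_is_first_task FALSE:

% qconf -sp orte
pe_name           orte
slots             999
user_lists        NONE
xuser_lists       NONE
start_proc_args   /bin/true
stop_proc_args    /bin/true
allocation_rule   $fill_up
control_slaves    TRUE
job_is_first_task FALSE
urgency_slots     min

A job would then be submitted with something like "qsub -pe orte 32 job.sh"; inside job.sh a plain "mpirun -np $NSLOTS ./app" is enough, since Open MPI 1.2 reads the PE_HOSTFILE provided by SGE and starts its daemons over qrsh.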
[OMPI users] Odd behavior with slots=4
I'm seeing curious performance when using Open MPI 1.2 to run Amber 9 on my Xserve Xeon 5100 cluster. Each cluster node is a dual-socket, dual-core system. The cluster is also running Myrinet 2000 with MX. I'm just running some tests with one of Amber's benchmarks.

It seems that my hostfiles affect the performance of the application. I tried variations of the hostfile to see what would happen. I did a straight mpirun with no MCA options set, using "mpirun -np 32":

variation 1: hostname            real 0m35.391s
variation 2: hostname slots=4    real 0m45.698s
variation 3: hostname slots=2    real 0m38.761s

It seems that the best performance I achieve is with variation 1, using only the hostname and the command "mpirun --hostfile hostfile -np 32 ...". It's shockingly about 13% better than if I use the hostfile with the syntax "hostname slots=4". I also tried variations of MCA options in my mpirun command; here are the times:

straight mpirun with no MCA options     real 0m45.698s
with "-mca mpi_yield_when_idle 0"       real 0m44.912s
with "-mca mtl mx -mca pml cm"          real 0m45.002s

Warner Yuen
Scientific Computing Consultant
Apple Computer
email: wy...@apple.com
Tel: 408.718.2859
Fax: 408.715.0133
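For what it's worth, the hostfile variants boil down to something like this (node names are made up, and 32 ranks were started in every run):

# variation 1: name only, Open MPI assumes one slot per host
node01
node02
...

# variation 2: four slots per host
node01 slots=4
node02 slots=4
...

launched in each case with:

mpirun --hostfile hostfile -np 32 <amber binary>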
[OMPI users] Measuring MPI message size used by application
Hi,

What is the best way of getting statistics on the size of MPI messages being sent/received by my OpenMPI-using application? I'm guessing MPE is one route, but is there anything built into OpenMPI that will give me this specific statistic?

Thanks,

-stephen
--
Stephen Mulcahy, Applepie Solutions Ltd, Innovation in Business Center,
GMIT, Dublin Rd, Galway, Ireland.  http://www.aplpi.com
Re: [OMPI users] Odd behavior with slots=4
On 3/28/07, Warner Yuen wrote:
> variation 1: hostname            real 0m35.391s
> variation 2: hostname slots=4    real 0m45.698s
> variation 3: hostname slots=2    real 0m38.761s

Hi Warner,

how many nodes does your cluster have? I assume it is using only one process per node by default. With slots=4 you might hit the Xeon bottleneck.

Regards,
Götz Waschk
--
AL I:40: Do what thou wilt shall be the whole of the Law.
Re: [OMPI users] Measuring MPI message size used by application
Stephen,

There are a huge number of MPI profiling tools out there. My preference is for something small and fast, where the output is in human-readable text format (and not fancy graphics). The tool I'm talking about is called mpiP (http://mpip.sourceforge.net/). It's not Open MPI specific, but it's really simple to use.

george.

On Mar 28, 2007, at 10:10 AM, stephen mulcahy wrote:
> Hi,
> What is the best way of getting statistics on the size of MPI messages
> being sent/received by my OpenMPI-using application? I'm guessing MPE is
> one route but is there anything built into OpenMPI that will give me this
> specific statistic?
> Thanks,
> -stephen
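If a full tool feels like overkill, the same statistic can also be gathered by hand through MPI's standard profiling (PMPI) interface. The sketch below is only an illustration of the idea, not code from mpiP or Open MPI: it shadows MPI_Send, adds up the bytes, and prints a per-rank total at MPI_Finalize. A real wrapper would also cover MPI_Isend, MPI_Recv, the collectives, and so on; the void* prototype matches the MPI-2-era headers shipped with Open MPI 1.2.

/* sendstats.c - count bytes pushed through MPI_Send on each rank */
#include <mpi.h>
#include <stdio.h>

static long long bytes_sent = 0;
static long long send_calls = 0;

int MPI_Send(void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    int elem_size;
    PMPI_Type_size(datatype, &elem_size);          /* bytes per element */
    bytes_sent += (long long)count * elem_size;
    send_calls++;
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}

int MPI_Finalize(void)
{
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d: %lld MPI_Send calls, %lld bytes total\n",
           rank, send_calls, bytes_sent);
    return PMPI_Finalize();
}

Compiling this file into the application (e.g. "mpicc app.c sendstats.c") makes the wrappers shadow the library's MPI_Send/MPI_Finalize, which then fall through to the PMPI_ entry points.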
Re: [OMPI users] Odd behavior with slots=4
There are multiple answers possible here. One is related to the over-subscription of your cluster, but I expect that there are at least 4 cores per node if you want to use the slots=4 option.

The real question is: what is the communication pattern in this benchmark, and how does it match the distribution of the processes you use? As a matter of fact, when you have XX processes per node and all of them try to send a message to a remote process (here remote means on another node), they have to share the physical Myrinet link, which of course leads to lower global performance as XX increases (from 1, to 2 and then 4). And this is true regardless of how you use the MX driver (via the Open MPI MTL or BTL).

Open MPI provides two options that allow you to distribute the processes based on different criteria. Try -bynode and -byslot to see if this affects the overall performance.

Thanks,
george.

On Mar 28, 2007, at 9:56 AM, Warner Yuen wrote:
> Curious performance when using OpenMPI 1.2 to run Amber 9 on my Xserve
> Xeon 5100 cluster. Each cluster node is a dual-socket, dual-core system.
> The cluster is also running with Myrinet 2000 with MX.
> [...]
> variation 1: hostname            real 0m35.391s
> variation 2: hostname slots=4    real 0m45.698s
> variation 3: hostname slots=2    real 0m38.761s
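For example (the hostfile contents and binary name below are just placeholders), with a hostfile listing every node as "nodeNN slots=4":

mpirun --hostfile hostfile -np 32 -byslot ./sander.MPI   # the default: fill all 4 slots of a node before moving to the next
mpirun --hostfile hostfile -np 32 -bynode ./sander.MPI   # round-robin: consecutive ranks land on different nodes

With a nearest-neighbour communication pattern the two placements can give quite different ratios of intra-node to Myrinet traffic, which is one way a gap like the 13% above could show up.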