Tim, 

Thanks for your message. I was, however, not clear on your suggestions and 
would appreciate it if you could clarify.

You say," So, if you want a sane comparison but aren't willing to study the 
compiler manuals, you might use (if your source code doesn't violate the 
aliasing rules) mpiicpc -prec-div -prec-sqrt -ansi-alias  and at least (if your 
linux compiler is g++) mpiCC -O2 possibly with some of the other options I 
mentioned earlier."
###From your response above, I understand that for Intel I should use 
"mpiicpc -prec-div -prec-sqrt -ansi-alias" and for OpenMPI "mpiCC -O2". I am 
not certain which of the other options you mention I should add.
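
For reference, this is roughly how I would put the two compile lines together 
(assuming, just for illustration, that my source file is xxx.cpp and I want an 
executable named xxx):

    Intel MPI:  mpiicpc -prec-div -prec-sqrt -ansi-alias -o xxx xxx.cpp
    OpenMPI:    mpiCC -O2 -o xxx xxx.cpp

Please correct me if that is not what you intended.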

###Also, I presently use a hostfile when submitting my mpirun job. Each node 
has four slots, and my hostfile contains "nodename slots=4". My compile 
command is "mpiCC -o xxx.cpp <filename>".
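
To be concrete, my current setup looks roughly like this (I am assuming Open 
MPI's --hostfile option is the right way to pass my hostfile to mpirun):

    hostfile:     nodename slots=4
    run command:  mpirun -np 4 --hostfile <hostfile> <filename>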

You also say, "If you have as ancient a g++ as your indication of FC3 implies, 
it really isn't fair to compare it with a currently supported compiler."
###Do you suggest upgrading the current installation of g++? Would that help?

###How do I ensure that all 4 slots are active when I submit an "mpirun -np 4 
<filename>" command? When I run "top", I see that all 4 slots are active; I 
noticed the same on the Intel machine, that is, it also showed four slots 
active.
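
Would something like the following be the right way to confirm where the four 
processes actually end up (I am assuming here that the --report-bindings 
option of Open MPI's mpirun is available in my 1.4.3 installation)?

    mpirun -np 4 --report-bindings --hostfile <hostfile> <filename>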

Thank you,
Ashwin



-----Original Message-----
From: users-boun...@open-mpi.org on behalf of Tim Prince
Sent: Tue 7/12/2011 9:21 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] OpenMPI vs Intel Efficiency question

On 7/12/2011 7:45 PM, Mohan, Ashwin wrote:
> Hi,
>
> I noticed that the exact same code took 50% more time to run on OpenMPI
> than Intel. I use the following syntax to compile and run:
> Intel MPI Compiler: (Redhat Fedora Core release 3 (Heidelberg), Kernel
> version: Linux 2.6.9-1.667smp x86_64)**
>
>       mpiicpc -o xxxx.cpp <filename> -lmpi
>
> OpenMPI 1.4.3: (Centos 5.5 w/ python 2.4.3, Kernel version: Linux
> 2.6.18-194.el5 x86_64)**
>
>               mpiCC xxxx.cpp -o <filename>
>
> MPI run command:
>
>               mpirun -np 4 <filename>
>
>
> **Other hardware specs**
>
>      processor       : 0
>      vendor_id       : GenuineIntel
>      cpu family      : 15
>      model           : 3
>      model name      : Intel(R) Xeon(TM) CPU 3.60GHz
>      stepping        : 4
>      cpu MHz         : 3591.062
>      cache size      : 1024 KB
>      physical id     : 0
>      siblings        : 2
>      core id         : 0
>      cpu cores       : 1
>      apicid          : 0
>      fpu             : yes
>      fpu_exception   : yes
>      cpuid level     : 5
>      wp              : yes
>      flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
>                        pge mca cmov pat pse36 clflush dts acpi mmx fxsr
>                        sse sse2 ss ht tm syscall lm constant_tsc pni
>                        monitor ds_cpl est tm2 cid xtpr
>      bogomips        : 7182.12
>      clflush size    : 64
>      cache_alignment : 128
>      address sizes   : 36 bits physical, 48 bits virtual
>      power management:
>
> Can the issue of efficiency be deciphered from the above info?
>
> Do the compiler flags have an effect on the efficiency of the
> simulation? If so, what flags may be useful to include for Open MPI?
The default options for icpc are roughly equivalent to the quite 
aggressive choice
g++ -fno-strict-aliasing -ffast-math -fno-cx-limited-range -O3 
-funroll-loops --param max-unroll-times=2
while you apparently used default -O0 for your mpiCC (if it is g++), 
neither of which is a very good initial choice for performance analysis. 
So, if you want a sane comparison but aren't willing to study the 
compiler manuals, you might use (if your source code doesn't violate the 
aliasing rules)
mpiicpc -prec-div -prec-sqrt -ansi-alias
and at least
(if your linux compiler is g++)
mpiCC -O2
possibly with some of the other options I mentioned earlier.
If you have as ancient a g++ as your indication of FC3 implies, it 
really isn't fair to compare it with a currently supported compiler.

Then, Intel MPI, by default, would avoid using HyperThreading, even 
though you have it enabled on your CPU, so, I suppose, if you are 
running on a single core, it will be rotating among your 4 MPI processes 
1 at a time.  The early Intel HyperThread CPUs typically took 15% longer 
to run MPI jobs when running 2 processes per core.
>
> Will including MPICH2 increase efficiency in running simulations using
> OpenMPI?
>
You have to choose a single MPI.  Having MPICH2 installed shouldn't 
affect performance of OpenMPI or Intel MPI, except to break your 
installation if you don't keep things sorted out.
OpenMPI and Intel MPI normally perform very close, if using equivalent 
settings, when working within the environments for which both are suited.
-- 
Tim Prince
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

