On Wed, Jun 6, 2018 at 11:10 PM Zan Lynx <zl...@acm.org> wrote:
>
> On 06/06/2018 10:22 AM, Dmitry Mikushin wrote:
> > The opinion you've mentioned is common in the scientific community.
> > However, on closer inspection it often turns out that the set of GCC
> > compiler options used simply does not correspond to that "fast" version
> > of Intel. For instance, when you do "-O3" for Intel it actually
> > corresponds to (at least) "-O3 -ffast-math -march=native" of GCC.
> > Omitting "-ffast-math" obviously introduces a significant performance gap.
> >
>
> Please note that if your compute cluster uses different models of CPU,
> be extremely careful with -march=native.
>
> I've been bitten by it in VMs, several times. Unless you always run on
> the same system that did the build, you are running a risk of illegal
> instructions.

Yes.  Note this is where ICC has an advantage because it supports
automagically doing runtime versioning based on the CPU instruction
set for vectorized loops.  We only support that in an awkward
explicit way (the manual talks about this in the 'Function Multiversioning'
section).
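
For illustration, a minimal sketch of the explicit approach using the
target_clones function attribute. The function name saxpy, the MULTIVERSION
macro, and the chosen ISA list are made up for this example; it assumes
GCC 6 or later on x86-64 Linux with glibc (ifunc support), and the macro
expands to nothing elsewhere so the code still builds as a single version.

```c
#include <stddef.h>

/* Assumption: GCC 6+ on x86-64 Linux with glibc, which provides the ifunc
   support that target_clones relies on.  On any other toolchain or target
   the macro is empty and only one plain version is compiled. */
#if defined(__GNUC__) && !defined(__clang__) && \
    defined(__x86_64__) && defined(__linux__)
#define MULTIVERSION __attribute__((target_clones("avx2", "sse4.2", "default")))
#else
#define MULTIVERSION
#endif

/* Hypothetical kernel: y[i] += a * x[i].  With the attribute active, GCC
   emits one clone per listed target plus a resolver that selects a clone
   at load time based on the CPU the program is actually running on. */
MULTIVERSION
void saxpy(float *restrict y, const float *restrict x, float a, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

The rest of the program can then be built for the lowest common instruction
set of the cluster (that is, without -march=native), and only the annotated
hot loops get per-CPU variants.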

But in the end it's just a "detail" that can be worked around with
a little inconvenience ;)  (I've yet to see a heterogeneous cluster
where the instruction set differences make a performance difference
over choosing the lowest common one)

Richard.

> --
>                 Knowledge is Power -- Power Corrupts
>                         Study Hard -- Be Evil
