Steve Ellcey <sell...@marvell.com> wrote on Saturday, February 16, 2019 at 1:53 AM:

> On Fri, 2019-02-15 at 17:48 +0800, Jun Ma wrote:
> >
> > ICC does much more than GCC under IPO, especially memory layout
> > optimizations. See https://software.intel.com/en-us/node/522667.
> > ICC is more aggressive about array transposition, structure splitting,
> > and field reordering. However, these optimizations were removed
> > from GCC a long time ago.
> > As for the lbm_r case, IIRC the most time-consuming part is a loop
> > whose memory accesses have a stride of 20.  ICC will optimize the array
> > (maybe a structure?) and vectorize the loop under IPO.
> >
> > Thanks
> > Jun
>
> Interesting.  I tried using '-qno-opt-mem-layout-trans' on ICC
> along with '-Ofast -ipo' and that had no effect on the speed.  I also
> tried '-no-vec' and that had no effect either.  The only thing that
> slowed down ICC was '-ip-no-inlining' or '-fno-inline'.  I see that
> '-Ofast -ipo' resulted in everything (except libc functions) getting
> inlined into the main program when using ICC.  GCC did not do that, but
> if I forced it to by using the always_inline attribute, GCC could
> inline everything into main the way ICC does.  But that did not speed
> up the GCC executable.
>
> Steve Ellcey
> sell...@marvell.com

You can use '-qopt-report' to see which optimizations have been applied by
ICC.
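
To make the memory layout point concrete, here is a rough sketch in C of
what an array transposition does to a stride-20 access. This is only an
illustration: it is not the lbm source and not ICC's actual transformation.
The names FIELDS, sum_field_strided, sum_field_unit and transpose_layout
are hypothetical, and the value 20 is taken from the stride mentioned in
the quoted message.

```c
#include <stddef.h>

/* Assumed number of values per cell; taken from the "stride is 20" remark. */
#define FIELDS 20

/* Interleaved layout: field f of cell i lives at grid[i * FIELDS + f].
 * Walking one field across all cells touches every 20th element. */
double sum_field_strided(const double *grid, size_t ncells, size_t f)
{
    double s = 0.0;
    for (size_t i = 0; i < ncells; i++)
        s += grid[i * FIELDS + f];      /* stride-20 access */
    return s;
}

/* Transposed layout: field f of cell i lives at grid_t[f * ncells + i].
 * The same walk is now contiguous, which is easier to vectorize. */
double sum_field_unit(const double *grid_t, size_t ncells, size_t f)
{
    double s = 0.0;
    for (size_t i = 0; i < ncells; i++)
        s += grid_t[f * ncells + i];    /* unit-stride access */
    return s;
}

/* Copy from the interleaved layout into the transposed one. */
void transpose_layout(const double *grid, double *grid_t, size_t ncells)
{
    for (size_t i = 0; i < ncells; i++)
        for (size_t f = 0; f < FIELDS; f++)
            grid_t[f * ncells + i] = grid[i * FIELDS + f];
}
```

Both sum functions return the same value for the same data; only the
access pattern differs. A compiler that performs this kind of layout
change does so without the explicit copy shown here.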

Thanks
Jun
