On Fri, Feb 15, 2019 at 9:02 AM Ian Lance Taylor <i...@golang.org> wrote:
> On Fri, Feb 15, 2019 at 4:46 AM Hi-Angel <hiangel...@gmail.com> wrote:
> >
> > I never could understand, why field reordering was removed from GCC? I
> > mean, I know that it's prohibited in C and C++, but, sure, GCC can
> > detect whether it possibly can influence application behavior, and if
> > not, just do the reorder.
> >
> > The veto is important to C/C++ as programming languages, but not to
> > machine code that is being generated from them. As long as app can't
> > detect that its fields were reordered through means defined by C/C++,
> > field reordering by compiler is fine, isn't it?
>
> In my opinion field reordering is very hard for the compiler to do
> correctly and trivial for a human programmer to do correctly. So in
> practice the best approach is for the compiler, or some other tool, to
> say "you should reorder the fields here." As far as I can see, the
> only real reason to implement field reordering in a compiler is for
> benchmark cracking, since benchmarks typically don't let you modify
> the source code. It's not a useful optimization in practice other
> than for benchmarks.
>

Hasn't GNAT sorted Ada elements in records (e.g. structures) by size
since near its initial addition to GCC in the mid-90s? This results in
the largest elements up front and minimizes the need for alignment gaps.
I know Ada is traditionally more strongly typed than C/C++, but if it
can be done for Ada programs reliably, why could it not be reliable in C?

>
> (Array transformations and struct splitting, on the other hand, can be
> useful.)
>

--joel

> Ian
>
> >
> > On Fri, 15 Feb 2019 at 12:49, Jun Ma <majun4950...@gmail.com> wrote:
> > >
> > > Bin.Cheng <amker.ch...@gmail.com> wrote on Fri, Feb 15, 2019 at 5:12 PM:
> > >
> > > > On Fri, Feb 15, 2019 at 3:30 AM Steve Ellcey <sell...@marvell.com> wrote:
> > > > >
> > > > > I have a question about SPEC CPU 2017 and what GCC can and cannot do
> > > > > with -flto. As part of some SPEC analysis I am doing I found that with
> > > > > -Ofast, ICC and GCC were not that far apart (especially spec int rate,
> > > > > spec fp rate was a slightly larger difference).
> > > > >
> > > > > But when I added -ipo to the ICC command and -flto to the GCC command,
> > > > > the difference got larger. In particular the 519.lbm_r was more than
> > > > > twice as fast with ICC and -ipo, but -flto did not help GCC at all.
> > > > >
> > > > > There are other tests that also show this type of improvement with -ipo
> > > > > like 538.imagick_r, 544.nab_r, 525.x264_r, 531.deepsjeng_r, and
> > > > > 548.exchange2_r, but none are as dramatic as 519.lbm_r. Anyone have
> > > > > any idea on what ICC is doing that GCC is missing? Is GCC just not
> > > > > aggressive enough with its inlining?
> > > >
> > > > IIRC Jun did some investigation before? CCing.
> > > >
> > > > Thanks,
> > > > bin
> > > > >
> > > > > Steve Ellcey
> > > > > sell...@marvell.com
> > >
> > > ICC is doing much more than GCC in ipo, especially memory layout
> > > optimizations. See https://software.intel.com/en-us/node/522667.
> > > ICC is more aggressive in array transposition/structure splitting
> > > /field reordering. However, these optimizations were removed
> > > from GCC a long time ago.
> > > As for case lbm_r, IIRC a loop with memory access whose stride is 20 is
> > > most time-consuming. ICC will optimize the array (maybe structure?)
> > > and vectorize the loop under ipo.
> > >
> > > Thanks
> > > Jun