https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77546

            Bug ID: 77546
           Summary: [5 to 6 regression] C++ software renderer performance
                    drop
           Product: gcc
           Version: 6.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tulipawn at gmail dot com
  Target Milestone: ---

Created attachment 39595
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39595&action=edit
Annotated assembly files

I've found a performance regression between GCC 5 and GCC 6 in the linked C++
software renderer on aarch64. Running `configure && make && make bench` with
each compiler yields:

GCC5 Average value: 37.912720
GCC6 Average value: 35.691360
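
For scale, the relative difference between the two averages can be computed
directly. This is only a sketch: it assumes a higher "Average value" means
better performance, which the benchmark output does not state explicitly, and
the helper name `relative_drop` is made up for illustration.

```python
def relative_drop(before: float, after: float) -> float:
    """Return the fractional decrease going from `before` to `after`."""
    return (before - after) / before


# Averages copied from the `make bench` output above.
gcc5_avg = 37.912720
gcc6_avg = 35.691360

drop = relative_drop(gcc5_avg, gcc6_avg)
print(f"GCC6 average is {drop * 100:.2f}% lower than GCC5")
```

Under that assumption the GCC6 build comes out roughly 5.9% behind the GCC5
build.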

CPU: ARM Cortex-A53, speed 1536 MHz (estimated)

GCC5

samples  %        linenr info                 image name               app name
                symbol name
2760258  67.5931  Screen.cc:76                renderer                 renderer
                unsigned int IlluminatePixel<FatPointPhongAndSoftShadowed,
TriangleCarrier<FatPointPhongAndSoftShadowed> >(FatPointPhongAndSoftShadowed
const&, TriangleCarrier<FatPointPhongAndSoftShadowed> const&, Scene const&,
SDL_Surface*)
682926   16.7235  Rasterizers.cc:249          renderer                 renderer
                RasterizeScene<FatPointPhongAndSoftShadowed>::DrawTriangles(int,
int) const [clone ._omp_fn.7] [clone .lto_priv.84]
155381    3.8050  (no location information)   libgomp.so.1.0.0         renderer
                /usr/lib/aarch64-linux-gnu/libgomp.so.1.0.0
111829    2.7385  Screen.h:349                renderer                 renderer
                void DrawPixelBasic<4>(int, int, unsigned int)
105202    2.5762  (no location information)   libSDL-1.2.so.0.11.4     renderer
                /usr/lib/aarch64-linux-gnu/libSDL-1.2.so.0.11.4
89315     2.1871  ScanConverter.h:37          renderer                 renderer
                ScanConverter<FatPointPhongAndSoftShadowed,
Screen::AccessProjectionX<FatPointPhongAndSoftShadowed>, 600>::ScanlineAdd(int,
FatPointPhongAndSoftShadowed const&)
80560     1.9727  ScanConverter.h:96          renderer                 renderer
                ScanConverter<FatPointPhongAndSoftShadowed,
Screen::AccessProjectionX<FatPointPhongAndSoftShadowed>, 600>::InnerLoop(int,
int, FatPointPhongAndSoftShadowed const&, FatPointPhongAndSoftShadowed const&)
54032     1.3231  memset.S:56                 libc-2.23.so             renderer
                memset
19974     0.4891  (no location information)   no-vmlinux               renderer
                /no-vmlinux
13407     0.3283  Algebra.h:28                renderer                 renderer
                Matrix3::multiplyRightWith(Vector3 const&) const
2771      0.0679  Light.cc:101                renderer                 renderer
                DrawSceneInShadowBuffer::DrawTriangles(int, int) const [clone
._omp_fn.0] [clone .lto_priv.71]
2253      0.0552  vector.tcc:452              renderer                 renderer
                std::vector<FatPointPhongAndSoftShadowed,
std::allocator<FatPointPhongAndSoftShadowed>
>::_M_fill_insert(__gnu_cxx::__normal_iterator<FatPointPhongAndSoftShadowed*,
std::vector<FatPointPhongAndSoftShadowed,
std::allocator<FatPointPhongAndSoftShadowed> > >, unsigned long,
FatPointPhongAndSoftShadowed const&)
783       0.0192  renderer.cc:169             renderer                 renderer
                main


GCC6

samples  %        linenr info                 image name               app name
                symbol name
2914396  66.8866  Screen.cc:76                renderer                 renderer
                unsigned int IlluminatePixel<FatPointPhongAndSoftShadowed,
TriangleCarrier<FatPointPhongAndSoftShadowed> >(FatPointPhongAndSoftShadowed
const&, TriangleCarrier<FatPointPhongAndSoftShadowed> const&, Scene const&,
SDL_Surface*)
708202   16.2535  Rasterizers.cc:249          renderer                 renderer
                RasterizeScene<FatPointPhongAndSoftShadowed>::DrawTriangles(int,
int) const [clone ._omp_fn.7] [clone .lto_priv.61]
157237    3.6087  (no location information)   libgomp.so.1.0.0         renderer
                /usr/lib/aarch64-linux-gnu/libgomp.so.1.0.0
106485    2.4439  Screen.h:349                renderer                 renderer
                void DrawPixelBasic<4>(int, int, unsigned int)
104818    2.4056  (no location information)   libSDL-1.2.so.0.11.4     renderer
                /usr/lib/aarch64-linux-gnu/libSDL-1.2.so.0.11.4
95259     2.1862  Types.h:61                  renderer                 renderer
                Vector3::length()
90035     2.0663  ScanConverter.h:37          renderer                 renderer
                ScanConverter<FatPointPhongAndSoftShadowed,
Screen::AccessProjectionX<FatPointPhongAndSoftShadowed>, 600>::ScanlineAdd(int,
FatPointPhongAndSoftShadowed const&)
78224     1.7953  ScanConverter.h:90          renderer                 renderer
                ScanConverter<FatPointPhongAndSoftShadowed,
Screen::AccessProjectionX<FatPointPhongAndSoftShadowed>, 600>::InnerLoop(int,
int, FatPointPhongAndSoftShadowed const&, FatPointPhongAndSoftShadowed const&)
56357     1.2934  memset.S:56                 libc-2.23.so             renderer
                memset
21294     0.4887  (no location information)   no-vmlinux               renderer
                /no-vmlinux
14197     0.3258  Algebra.h:28                renderer                 renderer
                Matrix3::multiplyRightWith(Vector3 const&) const
2842      0.0652  Light.cc:101                renderer                 renderer
                DrawSceneInShadowBuffer::DrawTriangles(int, int) const [clone
._omp_fn.0] [clone .lto_priv.71]
2159      0.0495  vector.tcc:540              renderer                 renderer
                std::vector<FatPointPhongAndSoftShadowed,
std::allocator<FatPointPhongAndSoftShadowed> >::_M_default_append(unsigned
long) [clone .part.131] [clone .lto_priv.57]
756       0.0174  renderer.cc:169             renderer                 renderer
                main
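
One way to compare the two profiles is to list the symbols that appear in only
one of them. The sketch below does that for the renderer's own symbols (library
rows omitted), with sample counts copied from the listings above. The symbol
names are abbreviated by hand and the helper `only_in` is hypothetical. Whether
a symbol showing up in only one profile reflects an inlining difference is an
assumption that would need checking against the attached assembly.

```python
# Samples per symbol, copied from the two profiles above.
# Names are shortened by hand; clone/LTO suffixes are dropped.
gcc5 = {
    "IlluminatePixel": 2760258,
    "RasterizeScene::DrawTriangles": 682926,
    "DrawPixelBasic<4>": 111829,
    "ScanConverter::ScanlineAdd": 89315,
    "ScanConverter::InnerLoop": 80560,
    "Matrix3::multiplyRightWith": 13407,
    "DrawSceneInShadowBuffer::DrawTriangles": 2771,
    "std::vector::_M_fill_insert": 2253,
    "main": 783,
}
gcc6 = {
    "IlluminatePixel": 2914396,
    "RasterizeScene::DrawTriangles": 708202,
    "DrawPixelBasic<4>": 106485,
    "Vector3::length": 95259,
    "ScanConverter::ScanlineAdd": 90035,
    "ScanConverter::InnerLoop": 78224,
    "Matrix3::multiplyRightWith": 14197,
    "DrawSceneInShadowBuffer::DrawTriangles": 2842,
    "std::vector::_M_default_append": 2159,
    "main": 756,
}


def only_in(a: dict, b: dict) -> list:
    """Return the symbols present in profile `a` but absent from profile `b`."""
    return sorted(set(a) - set(b))


print("only in GCC5:", only_in(gcc5, gcc6))
print("only in GCC6:", only_in(gcc6, gcc5))
```

With this data, `Vector3::length` (95259 samples, 2.1862%) is listed only for
GCC6, alongside the different `std::vector` helper each build reports.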


FLAGS used:
-Ofast -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
