On Tue, Oct 22, 2013 at 10:21:32AM -0400, David Edelsohn wrote:
> On Mon, Oct 21, 2013 at 10:42 PM, Vladimir Makarov <vmaka...@redhat.com> wrote:
>
> >> I would say lets add -mlra, but make the default OFF for the time being.
> >> We can always switch the default later.
> >
> > Sure, if you know some LRA problems it should not be on default.  Moreover,
> > if we still have the problems when releasing gcc4.9, I think we should
> > exclude any possibility for a user to use LRA for ppc.  I don't want to have
> > GGC-4.9 users blaming LRA.
> >
> > But adding LRA to PPC on the trunk (switched OFF by default) earlier could
> > help me a lot to work on the issues.
>
> My main concern was disrupting Mike.  If Mike is comfortable with
> adding LRA disabled by default, it is okay with me.
>
> The patch mostly adds lra_in_progress, which will not have any effect
> while LRA remains disabled.
>
> My one question about the patch is:
>
> -  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,??&r")
> +  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,&r")
>
> which may cause register preferencing problems for bswap when LRA is not used.
>
> The rest of the patch is okay.
>
> Thanks, David

Yeah, I can see a whole round of tuning issues, plus adding lra_in_progress
everywhere reload_in_progress is used.  Because of the Advance Toolchain, RHEL,
and SLES, we will still need to deal with the original register allocator.

Vlad, this is part of a message I had sent David, and I thought you were on the
CC list about LRA.

I haven't looked in detail at what the changes are at this point.  I did do some
builds and comparisons.  It looks like there are definitely problems with
32-bit Fortran and decimal floating point (and likely long double using IBM's
double double format).  If somebody has some cycles, it may be useful to dig
into why we get these failures.

Note, I have some sort of configuration problem in running dealII, so it isn't
run right now:

Spec 2006, 64-bit, 3 runs, picking the middle, power7 options:

Benchmark                   Type   Percent
400.perlbench               int     96.74%
401.bzip2                   int    100.09%
403.gcc                     int     99.94%
429.mcf                     int     99.21%
445.gobmk                   int     99.33%
456.hmmer                   int     98.34%
458.sjeng                   int     99.68%
462.libquantum              int    101.48%
464.h264ref                 int    101.40%
471.omnetpp                 int    100.28%
473.astar                   int    100.09%
483.xalancbmk               int     98.28%
410.bwaves                  fp      98.11%
416.gamess                  fp     101.31%
433.milc                    fp      99.43%
434.zeusmp                  fp     103.53%
435.gromacs                 fp     109.63%
436.cactusADM               fp      99.53%
437.leslie3d                fp     101.23%
444.namd                    fp     103.42%
447.dealII                  fp      ------
450.soplex                  fp      99.14%
453.povray                  fp      99.66%
454.calculix                fp      97.17%
459.GemsFDTD                fp     100.88%
465.tonto                   fp     101.18%
470.lbm                     fp      99.83%
481.wrf                     fp      93.38%
482.sphinx3                 fp     100.82%

Spec INT                    int     99.57%
Spec FP except 447.dealII   fp     100.43%

Perlbench, calculix, and wrf are slower.  Zeusmp, gromacs, and namd are faster.

Unfortunately, the profiling tools on my system seem to abort when I run 32-bit
benchmarks, so I haven't gotten the numbers recently (nor had time to get the
tools team to look at it).

In terms of 32-bit builds, 3 benchmarks don't build with LRA: gamess, dealII
(note that in 64-bit, dealII builds, it just doesn't run correctly), and wrf.
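
As a rough sketch of how the Percent column in the SPEC table above can be derived and read: the message does not spell out the formula, so this assumes each figure is the LRA build's median-of-three score divided by the baseline build's, with values below 100% meaning LRA is slower. The function names, the sample run scores, and the 2.5% noise tolerance are hypothetical; the percentages in `PERCENT` are copied from the table.

```python
from statistics import median

# Percentages copied from the 64-bit SPEC 2006 table (447.dealII omitted: not run).
PERCENT = {
    "400.perlbench": 96.74, "401.bzip2": 100.09, "403.gcc": 99.94,
    "429.mcf": 99.21, "445.gobmk": 99.33, "456.hmmer": 98.34,
    "458.sjeng": 99.68, "462.libquantum": 101.48, "464.h264ref": 101.40,
    "471.omnetpp": 100.28, "473.astar": 100.09, "483.xalancbmk": 98.28,
    "410.bwaves": 98.11, "416.gamess": 101.31, "433.milc": 99.43,
    "434.zeusmp": 103.53, "435.gromacs": 109.63, "436.cactusADM": 99.53,
    "437.leslie3d": 101.23, "444.namd": 103.42, "450.soplex": 99.14,
    "453.povray": 99.66, "454.calculix": 97.17, "459.GemsFDTD": 100.88,
    "465.tonto": 101.18, "470.lbm": 99.83, "481.wrf": 93.38,
    "482.sphinx3": 100.82,
}


def percent_of_baseline(baseline_runs, lra_runs):
    """Median LRA score as a percentage of the median baseline score.

    Assumption: higher scores are better, so a result below 100 means
    the LRA build is slower than the baseline build.
    """
    return 100.0 * median(lra_runs) / median(baseline_runs)


def outliers(percent, tolerance=2.5):
    """Return (slower, faster) benchmark names outside +/- tolerance percent.

    The tolerance is a hypothetical noise band, not something the
    original message defines.
    """
    slower = sorted(n for n, p in percent.items() if p < 100.0 - tolerance)
    faster = sorted(n for n, p in percent.items() if p > 100.0 + tolerance)
    return slower, faster


if __name__ == "__main__":
    # Hypothetical scores for three runs of one benchmark on each compiler.
    print(round(percent_of_baseline([10.0, 10.2, 9.9], [9.7, 9.8, 9.6]), 2))
    slower, faster = outliers(PERCENT)
    print("slower:", slower)
    print("faster:", faster)
```

With the assumed 2.5% tolerance, the sketch singles out perlbench, calculix, and wrf as slower and zeusmp, gromacs, and namd as faster, the same six benchmarks called out in the message.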

In gamess, I see:

/home/meissner/fsf-install-ppc64/gcc-4_9-lra/bin/gfortran -c -o ormas1.fppized.o -g -save-temps=obj -ffast-math -Ofast -mveclibabi=mass -mcpu=power7 -mrecip=rsqrt -fpeel-loops -funroll-loops -ftree-vectorize -fvect-cost-model -fno-aggressive-loop-optimizations -mlra -m32 ormas1.fppized.f
ormas1.fppized.f: In function 'maktabs':
ormas1.fppized.f:2281:0: internal compiler error: in check_rtl, at lra.c:2036
       END
 ^
0x105a08ef check_rtl
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/lra.c:2036
0x105a2bcb lra(_IO_FILE*)
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/lra.c:2432
0x10552933 do_reload
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4686
0x10552933 rest_of_handle_reload
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4815
0x10552933 execute
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4844
Please submit a full bug report,

In dealII we see:

/home/meissner/fsf-install-ppc64/gcc-4_9-lra/bin/g++ -c -o sparse_matrix_ez.float.o -DSPEC_CPU -DNDEBUG -Iinclude -DBOOST_DISABLE_THREADS -Ddeal_II_dimension=3 -g -save-temps=obj -ffast-math -Ofast -mveclibabi=mass -mcpu=power7 -mrecip=rsqrt -fpeel-loops -funroll-loops -ftree-vectorize -fvect-cost-model -fno-aggressive-loop-optimizations -mlra -m32 -DSPEC_CPU_LINUX -include cstddef sparse_matrix_ez.float.cc
quadrature_lib.cc: In constructor 'QGauss<dim>::QGauss(unsigned int) [with int dim = 1]':
quadrature_lib.cc:95:1: internal compiler error: in check_rtl, at lra.c:2036
 }
 ^
0x1073272f check_rtl
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/lra.c:2036
0x10734a0b lra(_IO_FILE*)
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/lra.c:2432
0x106e4773 do_reload
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4686
0x106e4773 rest_of_handle_reload
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4815
0x106e4773 execute
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4844
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
specmake: *** [quadrature_lib.o] Error 1
specmake: *** Waiting for unfinished jobs....
polynomial.cc: In member function 'Polynomials::Polynomial<number> Polynomials::Polynomial<number>::derivative() const [with number = long double]':
polynomial.cc:282:3: internal compiler error: in check_rtl, at lra.c:2036
   }
   ^
0x1073272f check_rtl
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/lra.c:2036
0x10734a0b lra(_IO_FILE*)
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/lra.c:2432
0x106e4773 do_reload
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4686
0x106e4773 rest_of_handle_reload
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4815
0x106e4773 execute
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4844
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
specmake: *** [polynomial.o] Error 1

In wrf, we get:

/home/meissner/fsf-install-ppc64/gcc-4_9-lra/bin/gfortran -c -o module_radiation_driver.fppized.o -I. -I./netcdf/include -g -save-temps=obj -ffast-math -Ofast -mveclibabi=mass -mcpu=power7 -mrecip=rsqrt -fpeel-loops -funroll-loops -ftree-vectorize -fvect-cost-model -fno-aggressive-loop-optimizations -mlra -m32 module_radiation_driver.fppized.f90
module_diffusion_em.fppized.f90: In function 'cal_deform_and_div':
module_diffusion_em.fppized.f90:829:0: internal compiler error: in check_rtl, at lra.c:2036
 END SUBROUTINE cal_deform_and_div
 ^
0x105a08ef check_rtl
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/lra.c:2036
0x105a2bcb lra(_IO_FILE*)
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/lra.c:2432
0x10552933 do_reload
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4686
0x10552933 rest_of_handle_reload
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4815
0x10552933 execute
        /home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4844
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
specmake: *** [module_diffusion_em.fppized.o] Error 1
specmake: *** Waiting for unfinished jobs....
Error with make 'specmake -j40 build': check file '/home/meissner/spec-build/spec-2006-base-dev49-power7-vsx-svn203459-lra-shared-at7.0-32bit/benchspec/CPU2006/481.wrf/build/build_base_dev49-power7-vsx-32bit.0000/make.err'
Command returned exit code 2
Error with make!
*** Error building 481.wrf

I checked the LRA changes into a branch, and it is based off of subversion id
203569.

    svn+ssh://gcc.gnu.org/svn/gcc/branches/ibm/gcc-4_9-lra

Let's see, in terms of make check regressions:

Unexpected tests for gcc -m64:

Test                                            | gcc-4 #1 | trunk #2
=============================================== | ======== | ========
gcc.target/powerpc/p8vector-ldst.c              | fail     | ---
gcc.target/powerpc/pr57744.c                    | fail     | ---

Unexpected tests for gcc -m32:

Test                                            | gcc-4 #1 | trunk #2
=============================================== | ======== | ========
c-c++-common/dfp/cast.c                         | fail     | ---
c-c++-common/dfp/convert-bfp-10.c               | fail     | ---
c-c++-common/dfp/convert-bfp-11.c               | fail     | ---
c-c++-common/dfp/convert-bfp-2.c                | fail     | ---
c-c++-common/dfp/convert-bfp-3.c                | fail     | ---
c-c++-common/dfp/convert-bfp-4.c                | fail     | ---
c-c++-common/dfp/convert-bfp-5.c                | fail     | ---
c-c++-common/dfp/convert-bfp-6.c                | fail     | ---
c-c++-common/dfp/convert-bfp-7.c                | fail     | ---
c-c++-common/dfp/convert-bfp.c                  | fail     | ---
c-c++-common/dfp/inf-1.c                        | fail     | ---
gcc.target/powerpc/bswap64-4.c                  | fail     | ---
gcc.target/powerpc/p8vector-ldst.c              | fail     | ---
gcc.target/powerpc/pr53199.c                    | fail     | ---

Unexpected tests for g++ -m32:

Test                              | gcc-4 #1 | trunk #2
================================= | ======== | ========
c-c++-common/dfp/cast.c           | fail     | ---
c-c++-common/dfp/convert-bfp-10.c | fail     | ---
c-c++-common/dfp/convert-bfp-11.c | fail     | ---
c-c++-common/dfp/convert-bfp-2.c  | fail     | ---
c-c++-common/dfp/convert-bfp-3.c  | fail     | ---
c-c++-common/dfp/convert-bfp-4.c  | fail     | ---
c-c++-common/dfp/convert-bfp-5.c  | fail     | ---
c-c++-common/dfp/convert-bfp-6.c  | fail     | ---
c-c++-common/dfp/convert-bfp-7.c  | fail     | ---
c-c++-common/dfp/convert-bfp.c    | fail     | ---
c-c++-common/dfp/inf-1.c          | fail     | ---

Unexpected tests for gfortran -m32:

Test                                                        | gcc-4 #1 | trunk #2
=========================================================== | ======== | ========
gfortran.dg/PR19872.f                                       | fail     | ---
gfortran.dg/advance_1.f90                                   | fail     | ---
gfortran.dg/advance_4.f90                                   | fail     | ---
gfortran.dg/advance_5.f90                                   | fail     | ---
gfortran.dg/advance_6.f90                                   | fail     | ---
gfortran.dg/append_1.f90                                    | fail     | ---
gfortran.dg/associated_2.f90                                | fail     | ---
gfortran.dg/assumed_rank_1.f90                              | fail     | ---
gfortran.dg/assumed_rank_2.f90                              | fail     | ---
gfortran.dg/assumed_rank_7.f90                              | fail     | ---
gfortran.dg/assumed_type_2.f90                              | fail     | ---
gfortran.dg/backspace_10.f90                                | fail     | ---
gfortran.dg/backspace_2.f                                   | fail     | ---
gfortran.dg/backspace_8.f                                   | fail     | ---
gfortran.dg/backspace_9.f                                   | fail     | ---
gfortran.dg/bound_2.f90                                     | fail     | ---
gfortran.dg/bound_7.f90                                     | fail     | ---
gfortran.dg/char_cshift_1.f90                               | fail     | ---
gfortran.dg/char_cshift_2.f90                               | fail     | ---
gfortran.dg/char_cshift_3.f90                               | fail     | ---
gfortran.dg/char_eoshift_1.f90                              | fail     | ---
gfortran.dg/char_eoshift_2.f90                              | fail     | ---
gfortran.dg/char_eoshift_3.f90                              | fail     | ---
gfortran.dg/char_eoshift_4.f90                              | fail     | ---
gfortran.dg/char_eoshift_5.f90                              | fail     | ---
gfortran.dg/char_length_8.f90                               | fail     | ---
gfortran.dg/chmod_1.f90                                     | fail     | ---
gfortran.dg/chmod_2.f90                                     | fail     | ---
gfortran.dg/chmod_3.f90                                     | fail     | ---
gfortran.dg/comma.f                                         | fail     | ---
gfortran.dg/convert_2.f90                                   | fail     | ---
gfortran.dg/convert_implied_open.f90                        | fail     | ---
gfortran.dg/cr_lf.f90                                       | fail     | ---
gfortran.dg/cshift_bounds_1.f90                             | fail     | ---
gfortran.dg/cshift_bounds_2.f90                             | fail     | ---
gfortran.dg/cshift_bounds_3.f90                             | fail     | ---
gfortran.dg/cshift_bounds_4.f90                             | fail     | ---
gfortran.dg/cshift_nan_1.f90                                | fail     | ---
gfortran.dg/defined_assignment_9.f90                        | fail     | ---
gfortran.dg/dev_null.F90                                    | fail     | ---
gfortran.dg/direct_io_1.f90                                 | fail     | ---
gfortran.dg/direct_io_11.f90                                | fail     | ---
gfortran.dg/direct_io_12.f90                                | fail     | ---
gfortran.dg/direct_io_2.f90                                 | fail     | ---
gfortran.dg/direct_io_3.f90                                 | fail     | ---
gfortran.dg/direct_io_5.f90                                 | fail     | ---
gfortran.dg/direct_io_8.f90                                 | fail     | ---
gfortran.dg/endfile.f90                                     | fail     | ---
gfortran.dg/endfile_2.f90                                   | fail     | ---
gfortran.dg/eof_4.f90                                       | fail     | ---
gfortran.dg/eoshift.f90                                     | fail     | ---
gfortran.dg/eoshift_bounds_1.f90                            | fail     | ---
gfortran.dg/error_format.f90                                | fail     | ---
gfortran.dg/f2003_inquire_1.f03                             | fail     | ---
gfortran.dg/f2003_io_1.f03                                  | fail     | ---
gfortran.dg/f2003_io_5.f03                                  | fail     | ---
gfortran.dg/f2003_io_7.f03                                  | fail     | ---
gfortran.dg/fmt_cache_1.f                                   | fail     | ---
gfortran.dg/fmt_error_4.f90                                 | fail     | ---
gfortran.dg/fmt_error_5.f90                                 | fail     | ---
gfortran.dg/fmt_t_5.f90                                     | fail     | ---
gfortran.dg/fmt_t_7.f                                       | fail     | ---
gfortran.dg/ftell_3.f90                                     | fail     | ---
gfortran.dg/hollerith4.f90                                  | fail     | ---
gfortran.dg/inquire_10.f90                                  | fail     | ---
gfortran.dg/inquire_13.f90                                  | fail     | ---
gfortran.dg/inquire_15.f90                                  | fail     | ---
gfortran.dg/inquire_9.f90                                   | fail     | ---
gfortran.dg/inquire_size.f90                                | fail     | ---
gfortran.dg/iomsg_1.f90                                     | fail     | ---
gfortran.dg/iostat_2.f90                                    | fail     | ---
gfortran.dg/list_read_10.f90                                | fail     | ---
gfortran.dg/list_read_11.f90                                | fail     | ---
gfortran.dg/list_read_6.f90                                 | fail     | ---
gfortran.dg/list_read_7.f90                                 | fail     | ---
gfortran.dg/list_read_9.f90                                 | fail     | ---
gfortran.dg/matmul_1.f90                                    | fail     | ---
gfortran.dg/matmul_5.f90                                    | fail     | ---
gfortran.dg/maxloc_bounds_1.f90                             | fail     | ---
gfortran.dg/maxloc_bounds_2.f90                             | fail     | ---
gfortran.dg/maxloc_bounds_3.f90                             | fail     | ---
gfortran.dg/maxloc_bounds_6.f90                             | fail     | ---
gfortran.dg/maxloc_bounds_8.f90                             | fail     | ---
gfortran.dg/namelist_44.f90                                 | fail     | ---
gfortran.dg/namelist_45.f90                                 | fail     | ---
gfortran.dg/namelist_46.f90                                 | fail     | ---
gfortran.dg/namelist_66.f90                                 | fail     | ---
gfortran.dg/namelist_72.f                                   | fail     | ---
gfortran.dg/namelist_82.f90                                 | fail     | ---
gfortran.dg/negative_automatic_size.f90                     | fail     | ---
gfortran.dg/negative_unit.f                                 | fail     | ---
gfortran.dg/negative_unit_int8.f                            | fail     | ---
gfortran.dg/newunit_1.f90                                   | fail     | ---
gfortran.dg/newunit_3.f90                                   | fail     | ---
gfortran.dg/open_access_append_1.f90                        | fail     | ---
gfortran.dg/open_errors.f90                                 | fail     | ---
gfortran.dg/open_negative_unit_1.f90                        | fail     | ---
gfortran.dg/open_new.f90                                    | fail     | ---
gfortran.dg/open_readonly_1.f90                             | fail     | ---
gfortran.dg/open_status_1.f90                               | fail     | ---
gfortran.dg/open_status_2.f90                               | fail     | ---
gfortran.dg/open_status_3.f90                               | fail     | ---
gfortran.dg/optional_dim_2.f90                              | fail     | ---
gfortran.dg/optional_dim_3.f90                              | fail     | ---
gfortran.dg/overwrite_1.f                                   | fail     | ---
gfortran.dg/pointer_assign_8.f90                            | fail     | ---
gfortran.dg/pr16597.f90                                     | fail     | ---
gfortran.dg/pr16935.f90                                     | fail     | ---
gfortran.dg/pr20954.f                                       | fail     | ---
gfortran.dg/pr39865.f90                                     | fail     | ---
gfortran.dg/pr46804.f90                                     | fail     | ---
gfortran.dg/pr47878.f90                                     | fail     | ---
gfortran.dg/read_comma.f                                    | fail     | ---
gfortran.dg/read_eof_4.f90                                  | fail     | ---
gfortran.dg/read_eof_8.f90                                  | fail     | ---
gfortran.dg/read_eof_all.f90                                | fail     | ---
gfortran.dg/read_list_eof_1.f90                             | fail     | ---
gfortran.dg/read_many_1.f                                   | fail     | ---
gfortran.dg/read_no_eor.f90                                 | fail     | ---
gfortran.dg/readwrite_unf_direct_eor_1.f90                  | fail     | ---
gfortran.dg/realloc_on_assign_11.f90                        | fail     | ---
gfortran.dg/realloc_on_assign_7.f03                         | fail     | ---
gfortran.dg/record_marker_1.f90                             | fail     | ---
gfortran.dg/record_marker_3.f90                             | fail     | ---
gfortran.dg/runtime_warning_1.f90                           | fail     | ---
gfortran.dg/selected_char_kind_1.f90                        | fail     | ---
gfortran.dg/selected_char_kind_4.f90                        | fail     | ---
gfortran.dg/shift-alloc.f90                                 | fail     | ---
gfortran.dg/shift-kind_2.f90                                | fail     | ---
gfortran.dg/stat_1.f90                                      | fail     | ---
gfortran.dg/stat_2.f90                                      | fail     | ---
gfortran.dg/streamio_1.f90                                  | fail     | ---
gfortran.dg/streamio_10.f90                                 | fail     | ---
gfortran.dg/streamio_12.f90                                 | fail     | ---
gfortran.dg/streamio_14.f90                                 | fail     | ---
gfortran.dg/streamio_15.f90                                 | fail     | ---
gfortran.dg/streamio_16.f90                                 | fail     | ---
gfortran.dg/streamio_2.f90                                  | fail     | ---
gfortran.dg/streamio_3.f90                                  | fail     | ---
gfortran.dg/streamio_4.f90                                  | fail     | ---
gfortran.dg/streamio_5.f90                                  | fail     | ---
gfortran.dg/streamio_6.f90                                  | fail     | ---
gfortran.dg/streamio_7.f90                                  | fail     | ---
gfortran.dg/streamio_8.f90                                  | fail     | ---
gfortran.dg/streamio_9.f90                                  | fail     | ---
gfortran.dg/tl_editing.f90                                  | fail     | ---
gfortran.dg/unf_io_convert_1.f90                            | fail     | ---
gfortran.dg/unf_io_convert_2.f90                            | fail     | ---
gfortran.dg/unf_io_convert_3.f90                            | fail     | ---
gfortran.dg/unf_io_convert_4.f90                            | fail     | ---
gfortran.dg/unf_read_corrupted_1.f90                        | fail     | ---
gfortran.dg/unf_short_record_1.f90                          | fail     | ---
gfortran.dg/unformatted_subrecord_1.f90                     | fail     | ---
gfortran.dg/unpack_bounds_1.f90                             | fail     | ---
gfortran.dg/unpack_bounds_2.f90                             | fail     | ---
gfortran.dg/unpack_bounds_3.f90                             | fail     | ---
gfortran.dg/widechar_intrinsics_10.f90                      | fail     | ---
gfortran.dg/widechar_intrinsics_5.f90                       | fail     | ---
gfortran.dg/write_back.f                                    | fail     | ---
gfortran.dg/write_check.f90                                 | fail     | ---
gfortran.dg/write_check3.f90                                | fail     | ---
gfortran.dg/write_direct_eor.f90                            | fail     | ---
gfortran.dg/write_rewind_1.f                                | fail     | ---
gfortran.dg/write_rewind_2.f                                | fail     | ---
gfortran.dg/write_to_null.F90                               | fail     | ---
gfortran.dg/x_slash_2.f                                     | fail     | ---
gfortran.dg/zero_sized_1.f90                                | fail     | ---
gfortran.fortran-torture/execute/backspace.f90              | fail     | ---
gfortran.fortran-torture/execute/direct_io.f90              | fail     | ---
gfortran.fortran-torture/execute/inquire_1.f90              | fail     | ---
gfortran.fortran-torture/execute/inquire_2.f90              | fail     | ---
gfortran.fortran-torture/execute/inquire_3.f90              | fail     | ---
gfortran.fortran-torture/execute/inquire_4.f90              | fail     | ---
gfortran.fortran-torture/execute/inquire_5.f90              | fail     | ---
gfortran.fortran-torture/execute/intrinsic_associated.f90   | fail     | ---
gfortran.fortran-torture/execute/intrinsic_associated_2.f90 | fail     | ---
gfortran.fortran-torture/execute/intrinsic_cshift.f90       | fail     | ---
gfortran.fortran-torture/execute/intrinsic_eoshift.f90      | fail     | ---
gfortran.fortran-torture/execute/intrinsic_size.f90         | fail     | ---
gfortran.fortran-torture/execute/list_read_1.f90            | fail     | ---
gfortran.fortran-torture/execute/open_replace.f90           | fail     | ---
gfortran.fortran-torture/execute/seq_io.f90                 | fail     | ---
gfortran.fortran-torture/execute/slash_edit.f90             | fail     | ---
gfortran.fortran-torture/execute/unopened_unit_1.f90        | fail     | ---

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797