Re: cloog/isl version update for gcc 4.8
On Fri, Dec 21, 2012 at 3:49 PM, Jack Howarth wrote:
>    Tobi,
>       Can you update the isl and cloog tarballs in the gcc infrastructure directory
> to the new isl 0.11.1 and cloog 0.18.0 releases from...
>
> ftp://ftp.linux.student.kuleuven.be/pub/people/skimo/isl//isl-0.11.1.tar.bz2
> http://www.bastoul.net/cloog/pages/download/cloog-0.18.0.tar.gz
>
> It looks like config/isl.m4 needs to be modified to understand MAJOR, MINOR,
> REVISION for the isl 0.11.1 version numbering.

Btw, using cloog 0.18.0 doesn't work in-tree because they no longer package
include/cloog/version.h in the tarball (but only include/cloog/version.h.in).
That would (finally) be a reason to move the configury to gcc/ where we then
also can use link checks.

Richard.

> Index: configure.ac
> ===================================================================
> --- configure.ac        (revision 194661)
> +++ configure.ac        (working copy)
> @@ -1607,7 +1607,7 @@ if test "x$with_isl" != "xno" &&
>      dnl with user input.
>      ISL_INIT_FLAGS
>      dnl The minimal version of ISL required for Graphite.
> -    ISL_CHECK_VERSION(0,10)
> +    ISL_CHECK_VERSION(0,11,1)
>      dnl Only execute fail-action, if ISL has been requested.
>      ISL_IF_FAILED([
>        AC_MSG_ERROR([Unable to find a usable ISL.  See config.log for details.])])
> @@ -1621,7 +1621,7 @@ if test "x$with_isl" != "xno" &&
>    dnl
>    dnl If we use CLooG-Legacy, the provided version information is
>    dnl ignored.
> -  CLOOG_CHECK_VERSION(0,17,0)
> +  CLOOG_CHECK_VERSION(0,18,0)
>
>    dnl Only execute fail-action, if CLooG has been requested.
>    CLOOG_IF_FAILED([
> Index: config/isl.m4
> ===================================================================
> --- config/isl.m4       (revision 194661)
> +++ config/isl.m4       (working copy)
> @@ -89,13 +89,13 @@ AC_DEFUN([ISL_REQUESTED],
>    ]
>  )
>
> -# _ISL_CHECK_CT_PROG(MAJOR, MINOR)
> +# _ISL_CHECK_CT_PROG(MAJOR, MINOR, REVISION)
>  #
>  # Helper for verifying ISL compile time version.
>  m4_define([_ISL_CHECK_CT_PROG],[AC_LANG_PROGRAM(
>    [#include <isl/version.h>
>     #include <string.h>],
> -  [if (strncmp (isl_version (), "isl-$1.$2", strlen ("isl-$1.$2")) != 0)
> +  [if (strncmp (isl_version (), "isl-$1.$2.$3", strlen ("isl-$1.$2.$3")) != 0)
>       return 1;
>    ])])
>
> @@ -115,9 +115,9 @@ AC_DEFUN([ISL_CHECK_VERSION],
>    LIBS="${_isl_saved_LIBS} -lisl"
>    echo $CFLAGS
>
> -  AC_CACHE_CHECK([for version $1.$2 of ISL],
> +  AC_CACHE_CHECK([for version $1.$2.$3 of ISL],
>      [gcc_cv_isl],
> -    [AC_RUN_IFELSE([_ISL_CHECK_CT_PROG($1,$2)],
> +    [AC_RUN_IFELSE([_ISL_CHECK_CT_PROG($1,$2,$3)],
>       [gcc_cv_isl=yes],
>       [gcc_cv_isl=no],
>       [gcc_cv_isl=yes])])
>
> seems to work fine.  Thanks in advance.
>              Jack
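For readers unfamiliar with these configure macros: after the patch, _ISL_CHECK_CT_PROG(0,11,1) expands into a small C program that AC_RUN_IFELSE compiles, links against -lisl, and runs at configure time. A rough standalone sketch of that check (not the literal generated test, and assuming an installed isl whose headers and library are on the search paths) looks like this:

#include <isl/version.h>
#include <string.h>

int
main (void)
{
  /* isl_version () returns a string that starts with "isl-<version>",
     e.g. beginning with "isl-0.11.1" for the 0.11.1 release; only the
     "isl-MAJOR.MINOR.REVISION" prefix is compared.  */
  if (strncmp (isl_version (), "isl-0.11.1", strlen ("isl-0.11.1")) != 0)
    return 1;   /* wrong version -> gcc_cv_isl=no  */
  return 0;     /* match         -> gcc_cv_isl=yes */
}

Building it by hand would be roughly "gcc check-isl.c -lisl" (plus -I/-L flags pointing at the isl installation); a non-zero exit status is what makes the configure check fail.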
Re: Adding Rounding Mode to Operations Opcodes in Gimple and RTL
> Yes, doing much related to rounding modes really requires making the
> compiler respect them properly for -frounding-math.  That's not quite
> calls being optimization barriers in general, just for floating point.
>
> * General calls may set, clear or test exceptions, or manipulate the
> rounding mode (as may asms, depending on their inputs / outputs /
> clobbers).
>
> * Floating-point operations have the rounding mode as input.  They may set
> (but not clear or test) floating-point exception flags.
>
> * Thus in general floating-point operations may not be moved across most
> calls (or relevant asms), or values from one side of a call reused for the
> same operation with the same inputs appearing on the other side of the
> call.
>
> * Statements such as "(void) (a * b);" can't be eliminated because they
> may raise exceptions.  (That's purely about exceptions, not rounding
> modes.)

I think we may need some fake variables to reflect the current rounding
mode and exception flags.  These variables would then be updated by the
statements you list above - i.e. we'd need to build def-use links for them
based on those statements.  I was also thinking of another approach:
adding some attribute to the call statement itself, but that could be
difficult to take into account in optimizations that work on def-use
chains rather than iterating over every statement.  As far as I
understand, CCP is an example of such an optimization.  Is that correct,
or am I missing something?

> Personally I'd think a natural starting point on the compiler side would
> be to write a reasonably thorough and systematic testsuite for such
> issues.  That would cover all operations, for all floating-point types
> (including ones such as __float128 and __float80), and conversions between
> all pairs of floating-point types and either way between each
> floating-point type and each integer type (including __int128 / unsigned
> __int128), with operands being any of (constants, non-volatile variables
> initialized with constants, volatile variables, vectors) and results being
> (discarded, stored in non-volatile variables, stored in volatile
> variables), in all the rounding modes, testing both results and exceptions
> and confirming proper results when an operation is repeated after changes
> of rounding mode or clearing exceptions.

We mostly have problems when there is an 'interaction' between different
rounding modes - so a ton of tests checking the correctness of a single
operation in a specific rounding mode won't catch them.  We could place
all such tests in one file/function so that the compiler transforms it the
way it does now, and then we'd catch the failure - but in that case we
don't need many tests.

So, generally I like the idea of having tests covering all the cases and
then fixing them one by one, but I don't see what these tests would be
beyond the ones from the trackers - it seems useless to have a bunch of
tests, each of which contains a single operation and compares the result,
even if we have a version of such a test for every datatype and rounding
mode.  For now I see one general problem (that calls aren't treated as
something that could change the result of FP operations), and it
definitely needs a test, but I don't see any need for many tests here.

---
Thanks,
Michael

On 10 January 2013 22:04, Joseph S. Myers wrote:
> On Thu, 10 Jan 2013, Michael Zolotukhin wrote:
>
>> Thanks for the responses!
>> I'll think about your warnings and decide whether I could afford such
>> effort or not, but anyway, I wanted to start from GCC, not glibc.
>> Am I getting it right, that before any other works we need to fix PR
>> 34678 (that's the correct number, thanks Mark!), making all passes take
>> into account that calls could change rounding modes / raise exceptions,
>> i.e. make all calls optimization barriers?  At least, when no
>> 'aggressive' options were passed to the compiler.
>
> There are various overlapping bugs in Bugzilla for issues where
> -frounding-math -ftrapping-math fail to implement all of FENV_ACCESS (I
> think of exceptions and rounding modes support together, since they have
> many of the same issues and both are covered by FENV_ACCESS, although it
> may be possible to fix only a subset of the issues, e.g. just rounding
> modes without exceptions).  To what extent the issues really duplicate
> each other isn't entirely clear; I'd advise looking at all the testcases
> in all relevant PRs (both open (576 20785 27682 29186 30568 34678), and
> others marked as duplicates of open ones), even if you then end up only
> working on a subset of the problems.
>
> Yes, doing much related to rounding modes really requires making the
> compiler respect them properly for -frounding-math.  That's not quite
> calls being optimization barriers in general, just for floating point.
>
> * General calls may set, clear or test exceptions, or manipulate the
> rounding mode (as may asms, depending on their inputs / outputs /
> clobbers).
>
> * Floating-point operations
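To make the reuse-across-a-call problem discussed in this thread concrete, here is a minimal sketch (not a testcase from the thread) of code that goes wrong if a floating-point operation is CSEd across a call that changes the rounding mode. It assumes a C99 <fenv.h> with fesetround, a target that defines FE_UPWARD and FE_DOWNWARD, and compilation with -frounding-math (link with -lm on glibc):

#include <fenv.h>
#include <stdio.h>

int
main (int argc, char **argv)
{
  (void) argv;
  double a = argc;          /* runtime values, so the divisions cannot */
  double b = 3 * argc;      /* be folded away at compile time          */

  fesetround (FE_DOWNWARD);
  double lo = a / b;        /* 1/3 rounded towards -infinity */

  fesetround (FE_UPWARD);
  double hi = a / b;        /* same operands: must be recomputed, not
                               reused from lo across the fesetround call */

  fesetround (FE_TONEAREST);

  /* If rounding modes are respected, hi is one ulp above lo; if the
     second division is CSEd away across the call, the two are equal.  */
  printf ("%s\n", hi > lo ? "ok" : "BROKEN");
  return hi > lo ? 0 : 1;
}

Under the semantics described above, the fesetround calls must act as barriers for the divisions; a compiler that reuses the first division's value prints "BROKEN".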
Re: stabs support in binutils, gcc, and gdb
Doug Evans wrote:
> On Thu, Jan 3, 2013 at 9:52 AM, nick clifton wrote:
> >> Switching to DWARF causes our build products directory (which contains
> >> *NONE* of the intermediate files) to swell from 1.2 GB to 11.5 GB.
> >> Ouch!  The DWARF ELF files are 8-12 times the size of the STABS ELF
> >> files.
> >>
> >> If the DWARF files were, say, a factor of 2 the size of the STABS files,
> >> I could probably sell people on switching to DWARF; but, a factor of 8
> >> to 12 is too much.
> >
> > Have you tried using a DWARF compression tool like dwz ?
> >
> > http://gcc.gnu.org/ml/gcc/2012-04/msg00686.html
> >
> > Or maybe the --compress-debug-sections option to objcopy ?
>
> Yeah, that would be really useful data to have.
>
> Plus, there's also -gdwarf-4 -fdebug-types-section.
>
> So while plain dwarf may be 8-12x of stabs, progress has been made,
> and we shouldn't base decisions on incomplete analyses.
>
> If we had data to refute (or substantiate) claims that dwarf was
> *still* X% larger than stabs and people were still avoiding dwarf
> because of it, that would be really useful.

The DWARF debug info alone is more than 8-12 times larger than the STABS
debug info alone.  For our product, the DWARF ELF file is 8-12 times
larger than the STABS ELF file; but part of each file is the text + data +
symbol table + various ELF headers, so the debugging information itself
swelled by a larger factor.

Some numbers, picking d90a.elf because it is first alphabetically.  [As to
what d90a.elf is -- that's unimportant; but it's the kernel for one of the
boards in one of our hardware products.]

With STABS, it's 83,945,437 bytes.  If I strip it, it's 34,411,472 bytes.
SIZE reports that the text is 26,073,758 bytes and that the data is
8,259,394 bytes, for a total of 34,333,152.  So, the stripped size is
78,320 bytes larger than text+data.

From objdump:

  77 .stab       01f40700  0208deb8  2**2
                 CONTENTS, READONLY, DEBUGGING
  78 .stabstr    00e0b6bc  03fce5b8  2**0
                 CONTENTS, READONLY, DEBUGGING

So, the two STABS sections come to a total of 47,496,636 bytes.
(Stripped size 34,411,472) + (size of .stab & .stabstr) is 2,037,329 bytes
shy of the unstripped size.  Presumably symbols.

DWARF 4: total file size 967,579,501 bytes.  Ouch!
Stripped: 34,411,440 bytes, which is 32 bytes smaller than in the STABS
case.

Continuing...  Adding up the various debugging sections I get:

  931,076,638 bytes for the .debug* sections
       52,977 bytes for the .stab and .stabstr sections (not sure where
               they came from -- maybe libgcc?  Origin is unimportant for
               the present purposes.)

Ignoring the 52,977 bytes of stabs stuff, that's 931076638 / 47496636 ~= 19.6.

Using dwz reduced the ELF file size by approximately 1% when using DWARF 3
or DWARF 4.  With DWARF 2 the file is about 10% bigger and dwz reduces it
by about 10% -- i.e., to about the same file size as when using DWARF 3 or 4.

Using objcopy --compress-debug-sections reduced the overall ELF file size
to approximately 3.4 times that of the STABS file -- definitely better
than the 11.5 ratio when not using it.

Summarizing:

  STABS:
    total file size:     83,945,437
    text+data:           34,333,152
    debugging:           47,496,636
    other:                2,115,649

  DWARF:
    total file size:    967,579,501
    text+data:           34,333,120  (don't know why it is 32 bytes smaller)
    DWARF debugging:    931,076,638
    STABS debugging:         52,977
    other:                2,116,766

  file size ratio:   967,579,501 / 83,945,437 = 11.5
  debug size ratio:  931,076,638 / 47,496,636 = 19.6

(It would actually be slightly worse if the remaining ~50K of STABS was
converted to DWARF.)

If I use objcopy --compress-debug-sections to compress the DWARF debug
info (but don't use it on the STABS debug info), then the file size ratio
is 3.4.

While 3.4 is certainly better than 11.5, unless I can come up with a
solution where the ratio is less than 2, I'm not currently planning on
trying to convince them to switch to DWARF.

David
Re: Adding Rounding Mode to Operations Opcodes in Gimple and RTL
On Fri, 11 Jan 2013, Michael Zolotukhin wrote:

> > Personally I'd think a natural starting point on the compiler side would
> > be to write a reasonably thorough and systematic testsuite for such
> > issues.  That would cover all operations, for all floating-point types
> > (including ones such as __float128 and __float80), and conversions between
> > all pairs of floating-point types and either way between each
> > floating-point type and each integer type (including __int128 / unsigned
> > __int128), with operands being any of (constants, non-volatile variables
> > initialized with constants, volatile variables, vectors) and results being
> > (discarded, stored in non-volatile variables, stored in volatile
> > variables), in all the rounding modes, testing both results and exceptions
> > and confirming proper results when an operation is repeated after changes
> > of rounding mode or clearing exceptions.
>
> We mostly have problems when there is an 'interaction' between different
> rounding modes - so a ton of tests checking the correctness of a single
> operation in a specific rounding mode won't catch them.  We could place
> all such tests in one file/function so that the compiler transforms it
> the way it does now, and then we'd catch the failure - but in that case
> we don't need many tests.

Tests should generally be small to make it easier for people to track down
the failures.  As you note, interactions are relevant - but that means
tests would do an operation in one rounding mode, check results, repeat in
another rounding mode, check results (which would catch the compiler
wrongly reusing the first results), repeat again for each mode.  Tests for
each separate operation and type can still be separate.

> So, generally I like the idea of having tests covering all the cases and
> then fixing them one by one, but I don't see what these tests would be
> beyond the ones from the trackers - it seems useless to have a bunch of
> tests, each of which contains a single operation and compares the result,
> even if we have a version of such a test for every datatype and rounding
> mode.

I'm thinking in terms of full FENV_ACCESS test coverage, for both
exceptions and rounding modes, where there are many more things that can
go wrong for single operations (such as the operation being wrongly
discarded because the result isn't used, even though the exceptions are
tested, or a libgcc implementation of a function raising excess
exceptions).  But even just for rounding modes, there are still various
uses for systematically covering different permutations.

* Tests should test both building with -frounding-math, without the
FENV_ACCESS pragma, and with the pragma but without that option, when the
pragma is implemented.

* There's clearly some risk that implementations of __float128 using
soft-fp have bugs in how they interact with hardware exceptions and
rounding modes.  These are part of libgcc; there should be test coverage
for such issues to provide confidence that GCC is handling exceptions and
rounding modes correctly.  This also helps detect soft-fp bugs generally.

* Some architectures may well have rounding mode bugs in operations
defined in their .md files.  E.g., conversion of integer 0 to
floating-point in round-downwards mode on older 32-bit powerpc wrongly
produces -0.0 instead of +0.0.  One purpose of tests for an issue with
significant machine dependencies is to allow people testing on an
architecture other than that originally used to develop the feature to
tell whether there are architecture-specific bugs.  There are reasonably
thorough tests of conversions between floating-point and integers
(gcc.dg/torture/fp-int-convert-*) in the testsuite, which caught several
bugs when added (especially as regards conversions to/from TImode), and
sometimes continue to do so - but they only cover round-to-nearest.

* Maybe a .md file wrongly enables vector operations without -ffast-math
even though they do not handle all floating-point cases correctly.  Since
this is a case where a risk of problems is reasonably predictable (it's
common for processors to define vector instructions in ways that do not
have the full IEEE semantics with rounding modes, exceptions, subnormals
etc., which means they shouldn't be used for vectorization on such
processors without appropriate -ffast-math options), verifying that vector
operations (GNU C generic vectors) handle floating-point correctly is also
desirable.

Thus, while adding testcases from specific bugs would ensure that those
very specific tests remained fixed, I don't think it would provide much
confidence that the overall FENV_ACCESS implementation is at all reliable,
only that a limited subset of bugs that people had actually reported had
been fixed (especially, areas such as conversions from TImode to float,
that people less frequently use, would be at high risk of remaining bugs),
and it wouldn't be of much use for someone trying to do t
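A rough sketch of the test shape described above - one operation, repeated in each rounding mode, checking both the result and the raised exceptions - might look like the following. This is illustrative, not an existing testsuite file; it assumes <fenv.h> provides fesetround, feclearexcept and fetestexcept plus the four standard rounding-mode macros, and the expected values are the correctly rounded doubles for 1/3:

#include <fenv.h>
#include <stdlib.h>

static void
check_div (int mode, double expected)
{
  fesetround (mode);
  feclearexcept (FE_ALL_EXCEPT);

  volatile double x = 1.0, y = 3.0;   /* volatile, so the division is not
                                         folded away at compile time */
  double r = x / y;

  if (r != expected)
    abort ();                         /* wrong rounding for this mode */
  if (!fetestexcept (FE_INEXACT))
    abort ();                         /* operation discarded or folded */
}

int
main (void)
{
  /* 1/3 is inexact; rounding upwards gives the next double above the
     truncated value, while the other three modes truncate here.  */
  check_div (FE_TONEAREST,  0x1.5555555555555p-2);
  check_div (FE_UPWARD,     0x1.5555555555556p-2);
  check_div (FE_DOWNWARD,   0x1.5555555555555p-2);
  check_div (FE_TOWARDZERO, 0x1.5555555555555p-2);
  fesetround (FE_TONEAREST);
  return 0;
}

Variants of the same skeleton could then cover each operation, type and operand/result combination from the list above, built with -frounding-math now and with the FENV_ACCESS pragma once it is implemented.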
Re: stabs support in binutils, gcc, and gdb
On Fri, Jan 11, 2013 at 6:52 AM, David Taylor wrote:
> Doug Evans wrote:
>> So while plain dwarf may be 8-12x of stabs, progress has been made,
>> and we shouldn't base decisions on incomplete analyses.
>
> ...
>
> If I use objcopy --compress-debug-sections to compress the DWARF debug
> info (but don't use it on the STABS debug info), then the file size
> ratio is 3.4.
>
> While 3.4 is certainly better than 11.5, unless I can come up with a
> solution where the ratio is less than 2, I'm not currently planning on
> trying to convince them to switch to DWARF.

The 3.4 number is the number I was interested in.
Thanks for computing it.

There are other things that can reduce the amount of dwarf, but the
size reduction can depend on the app of course.
I'm thinking of dwz and .debug_types.
I wonder what 3.4 changes to with those applied.
gcc-4.6-20130111 is now available
Snapshot gcc-4.6-20130111 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.6-20130111/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch revision 195115

You'll find:

 gcc-4.6-20130111.tar.bz2             Complete GCC

  MD5=ab53d89e99340f2bf85f27b8d9480f43
  SHA1=8d5bcd9f0a58ebd6b15caf52f2845768884511ae

Diffs from 4.6-20130104 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
Re: stabs support in binutils, gcc, and gdb
>> If I use objcopy --compress-debug-sections to compress the DWARF debug
>> info (but don't use it on the STABS debug info), then the file size
>> ratio is 3.4.
>>
>> While 3.4 is certainly better than 11.5, unless I can come up with a
>> solution where the ratio is less than 2, I'm not currently planning on
>> trying to convince them to switch to DWARF.
>
> The 3.4 number is the number I was interested in.
> Thanks for computing it.

It's not really fair to compare compressed DWARF with uncompressed stabs,
is it?

> There are other things that can reduce the amount of dwarf, but the
> size reduction can depend on the app of course.
> I'm thinking of dwz and .debug_types.
> I wonder what 3.4 changes to with those applied.

David already said that dwz didn't help much, so that implies that
.debug_types won't help much either -- dwz should have removed any
duplicate type information that .debug_types would have removed.

I'm not going to argue that a ratio of 11.5 isn't kind of embarrassing for
DWARF, but I'd like to point out that you're not making an
apples-to-apples comparison.  DWARF expresses a lot more about what's
going on in your program than stabs does, and it's reasonable to expect it
to be larger as a result.

I compiled a very small C++ source file with nothing more than a simple
class definition and a main that instantiates an instance of the class.
Compiled with stabs, the .o file is 3552 bytes with 1843 bytes of stabs
info.  Compiled with DWARF-4, the .o file is 3576 bytes with 668 bytes of
DWARF.  For this file, the two formats are encoding roughly the same
information, and DWARF is actually more efficient.

Next, I compiled a 5000-line C++ source file at both -O0 and -O2.  Here's
the comparison at -O0:

  stabs:   2,179,240 total             562,931 debug
  dwarf:   4,624,816 total (2.1x)    1,965,448 debug (3.5x)

And at -O2:

  stabs:   1,249,552 total             511,957 debug
  dwarf:   4,612,240 total (3.7x)    2,281,564 debug (4.5x)

In general, DWARF is describing more about where variables live as they
move around during program execution.  There's been lots of recent work
improving GCC's support for debugging optimized code, and that's expensive
to describe.  Notice that when we turn on -O2, we get a lot more DWARF
information, while the stabs info is actually a bit smaller (probably
because -O2 generates less code).  Even at -O0, DWARF is describing more
than stabs is.

I didn't see anything close to the 11.5 ratio that David got, so I'm not
sure what's so exceptional about your case.  I'd be happy to take a look
if you can get me the files somehow.

We're working hard at improving the efficiency of DWARF -- there's a lot
of places where it can be improved, but I doubt the ratio between stabs
and DWARF will ever be much lower than ~3x, simply because there's so much
more information contained in the DWARF.  That extra information leads to
a better debugging experience, but it's a tradeoff.  If stabs gives you a
good-enough experience and the size of DWARF is unbearable for you, then
there's no reason to switch.

-cary
Re: stabs support in binutils, gcc, and gdb
On Fri, Jan 11, 2013 at 5:55 PM, Cary Coutant wrote:
>
> Next, I compiled a 5000-line C++ source file at both -O0 and -O2.

I have to assume that David is working with C code, as stabs debugging
for C++ is nearly unusable.

Ian
Re: microblaze unroll loops optimization
Hi all,

I believe the decision to use UNSPEC_CMP and UNSPEC_CMPU for microblaze
compare instructions stems from the conversation in this thread from 2009:

  http://gcc.gnu.org/ml/gcc/2009-12/msg00283.html

This makes sense, because if I use code attributes and iterators to extend
the compare insn as so:

(define_code_iterator any_cmp [gt ge lt le gtu geu ltu leu])

;; <u> expands to an empty string when doing a signed operation and
;; "u" when doing an unsigned operation.
(define_code_attr u [(gt "") (gtu "u")
                     (ge "") (geu "u")
                     (lt "") (ltu "u")
                     (le "") (leu "u")])

(define_insn "cmp_<code>"
  [(set (match_operand:SI 0 "register_operand" "=d")
        (any_cmp:SI (match_operand:SI 1 "register_operand")
                    (match_operand:SI 2 "register_operand")))]
  ""
  "cmp<u>\t%0,%1,%2"
  [(set_attr "type" "arith")
   (set_attr "mode" "SI")
   (set_attr "length" "4")])

then the resulting insns look OK:

(insn 29 27 30 5 (set (reg:SI 67)
        (gt:SI (reg/v:SI 55 [ N ])
            (reg/v:SI 51 [ j ]))) ../core_matrix_min.c:6 63 {cmp_gt}
     (nil))
(jump_insn 30 29 31 5 (set (pc)
        (if_then_else (lt:SI (reg:SI 67)
                (const_int 0 [0]))
            (label_ref 28)
            (pc))) ../core_matrix_min.c:6 72 {branch_zero}
     (expr_list:REG_BR_PROB (const_int 9100 [0x238c])
        (nil)) -> 28)

and the code generated at -O0 is correct.  But then I see that during CSE
optimization the instructions are trashed:

;; starting the processing of deferred insns
deleting insn with uid = 16.
deleting insn with uid = 17.
deleting insn with uid = 28.
deleting insn with uid = 29.
deleting insn with uid = 30.

Looking back at the gcc 4.1.2 implementation, we had a branch_compare
pattern which carried out the compare and the branch in one instruction:

(define_insn "branch_compare"
  [(set (pc)
        (if_then_else (match_operator:SI 0 "cmp_op"
                        [(match_operand:SI 1 "register_operand" "d")
                         (match_operand:SI 2 "register_operand" "d")])
                      (match_operand:SI 3 "pc_or_label_operand" "")
                      (match_operand:SI 4 "pc_or_label_operand" "")))
   (clobber (reg:SI R_TMP))]
  ""
  {
    if (operands[3] != pc_rtx)
      {
        /* normal jump */
        switch (GET_CODE (operands[0]))
          {
          case GT:  return "cmp\tr18,%z1,%z2\;blti%?\tr18,%3 #GT";
          case LE:  return "cmp\tr18,%z1,%z2\;bgei%?\tr18,%3";
          case GE:  return "cmp\tr18,%z2,%z1\;bgei%?\tr18,%3";
          case LT:  return "cmp\tr18,%z2,%z1\;blti%?\tr18,%3 #LT";
          case GTU: return "cmpu\tr18,%z1,%z2\;blti%?\tr18,%3 #GTU";
          case LEU: return "cmpu\tr18,%z1,%z2\;bgei%?\tr18,%3";
          case GEU: return "cmpu\tr18,%z2,%z1\;bgei%?\tr18,%3";
          case LTU: return "cmpu\tr18,%z2,%z1\;blti%?\tr18,%3 #LTU";

Is this method still valid and considered acceptable in a modern gcc?

thanks,
David

On 8 January 2013 15:59, David Holsgrove wrote:
> Loop unrolling (-funroll-loops) for microblaze is ineffectual on the gcc
> 4.6/4.7/4.8 branches.
>
> This previously worked on an out-of-tree gcc 4.1.2, and I believe the
> relevant diff to be the use of UNSPEC_CMP and UNSPEC_CMPU to create two
> unique instructions for signed_compare and unsigned_compare in microblaze's
> machine description, which means that iv_analyze_expr in loop-iv.c is
> unable to understand the expression of the compare instruction.
>
> Details follow below,
>
> thanks,
> David
Re: microblaze unroll loops optimization
On 01/11/2013 06:53 PM, David Holsgrove wrote:
> Hi all,
>
> I believe the decision to use UNSPEC_CMP and UNSPEC_CMPU for microblaze
> compare instructions stems from the conversation in this thread from 2009:
>
>   http://gcc.gnu.org/ml/gcc/2009-12/msg00283.html

Thanks for reminding me.  I'd forgotten that Paolo suggested using UNSPEC.

I still think it's a bit odd.  Other targets use the comparison operators
(e.g., lt, ge, etc.).  Microblaze should as well.

> This makes sense, because if I use code attributes and iterators to extend
> the compare insn as so:
>
> (define_code_iterator any_cmp [gt ge lt le gtu geu ltu leu])

Using UNSPEC and code iterators should be unrelated.

--
Michael Eager      ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077