Re: Volatile qualification on pointer and data
On 21/09/2011 16:57, Paulo J. Matos wrote:
> On 21/09/11 15:21, David Brown wrote:
>> And since this situation would not occur in real code (at least, not
>> code that is expected to do something useful other than test the
>> compiler's code generation), there is no harm in making sub-optimal
>> object code.
>
> Actually, the reason I noticed this is that one of our engineers told us
> that GCC stopped generating instructions for certain operations when he
> moved from GCC 4.5 to GCC 4.6. This code is real code.
>
> Cheers,

If you really have a "static const" object which you need to read as "volatile" for some reason, then I would seriously consider changing the code. With "static const" you are telling the compiler that it knows everything about the use of that object and that its value will never change; with "volatile" you are telling it that the value might change behind the scenes. Obviously you've only posted a code snippet and not your full code, but that sounds self-contradictory to me. Somewhere along the line you are lying to the compiler - that's never a good idea when you want correct and optimal code.

David
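For concreteness, a minimal sketch of the self-contradictory pattern under discussion (hypothetical code, not taken from the thread):

  #include <stdint.h>

  /* Hypothetical reconstruction of the pattern being discussed: a locally
     defined "static const" object read through a volatile-qualified
     pointer.  The compiler knows the object's value can never change, so
     it may still constant-fold the read despite the volatile cast. */
  static const uint16_t foo = 0x1234;

  uint16_t read_foo (void)
  {
    /* The cast asks for a volatile read, but the object itself is a
       compile-time constant - the contradiction described above. */
    return *(volatile const uint16_t *) &foo;
  }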
Re: Volatile qualification on pointer and data
On 21/09/2011 20:50, Georg-Johann Lay wrote:
> David Brown schrieb:
>> On 21/09/2011 15:57, Ian Lance Taylor wrote:
>>> David Brown writes:
>>>> On 21/09/2011 10:21, Paulo J. Matos wrote:
>>>>> On 21/09/11 08:03, David Brown wrote:
>>>>>> Asking to read it by a volatile read does not change the nature
>>>>>> of "foo" - the compiler can still implement it as a compile-time
>>>>>> constant.
>>>>>
>>>>> But since I am accessing the data through the pointer and the
>>>>> pointer qualifies the data as volatile, shouldn't the compiler
>>>>> avoid this kind of optimization for reads through the pointer?
>>>>
>>>> My thought is that the nature of "foo" is independent of how it is
>>>> accessed. On the other hand, some uses of a variable will affect
>>>> its implementation - if you take the address of "foo" and pass that
>>>> on to an external function or data, then the compiler would have to
>>>> generate "foo" in memory (but in read-only memory, and it can still
>>>> assume its value does not change). So I am not sure what the
>>>> "correct" behaviour is here - I merely ask the question.
>>>> Fortunately, this situation is not going to occur in real code.
>>>
>>> I think your description is supported by the standard. However, I
>>> also think that gcc should endeavor to fully honor the volatile
>>> qualifier in all cases, because that is least surprising to the
>>> programmer. This is not a case where we should let optimization
>>> override the programmer's desire; by using volatile, the programmer
>>> has explicitly told us that they do not want any optimization to
>>> occur.
>>
>> ACK. That makes sense - the principle of least surprise. And since
>> this situation would not occur in real code (at least, not code that
>> is expected to do something useful other than test the compiler's
>> code generation), there is no harm in making sub-optimal object code.
>> Are there any warning flags for "programmer doing something
>> technically legal but logically daft" that could be triggered by such
>> cases? :-)
>
> The combination of const and volatile can be reasonable in real world
> code. One example is a special function register (SFR) that is
> read-only but can be altered by hardware.

That is /very/ different - you are talking about an "extern volatile const uint8_t readOnlySFR" declaration, or something effectively like:

  #define readOnlySFR (*(volatile const uint8_t *) 0x1234)

Either way, what you are telling the compiler is that this item is "volatile", and may change its value unbeknownst to the compiler, and that it is "const", meaning that /your/ code may not change its value. That's fine, and consistent. It might sound strange at first - many people think "const" means the data is constant and cannot change, when in fact C has its own peculiar meaning for "const". What can't make sense is a /static/ "volatile const" which is /defined/ locally, rather than just declared.

> Second example is a lookup table that can be changed after building
> the software, e.g. you have some calibration data that has to be drawn
> from the environment (drift of sensors, inductivity of motor windings,
> offset of actuators, etc). In such a case you want to read the data
> from the lookup table in, say, .rodata. By no means do you want the
> compiler to insert/propagate known values from the lookup table to
> immediate operands in instructions. That's exactly what "const
> volatile" does.

Again, you are talking about a "volatile const" declaration of data that is defined externally, and that's okay. Also note that in this case the local static const data is not volatile - it is only accessed as a volatile through a pointer cast.
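To make the distinction concrete, here is a minimal sketch (hypothetical names and addresses, not from the thread) of the two legitimate "volatile const" uses described above, as opposed to a locally defined "static const":

  #include <stdint.h>

  /* Legitimate use 1: a read-only special function register at a fixed
     address (0x1234 is a made-up example address).  "volatile" says the
     hardware may change it; "const" says our code must not write to it. */
  #define STATUS_REG (*(volatile const uint8_t *) 0x1234)

  /* Legitimate use 2: a calibration table declared here but defined
     elsewhere (e.g. patched into .rodata after the build).  The compiler
     must not propagate its values into immediate operands. */
  extern volatile const uint16_t calibration[16];

  uint16_t scale_sample (uint16_t raw, unsigned channel)
  {
    while (STATUS_REG & 0x01)  /* spin until hardware clears the busy bit */
      ;
    return (uint16_t) (raw * calibration[channel & 15]);
  }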
I cannot disable GCC TLS support thoroughly.
Hello,

I configured my gcc with "--disable-tls" for arm-none-eabi, but it can still successfully compile the case below:

  __thread int i;
  int f (void) { return i; }
  void main (int j) { i = j; }

The "dg-require-effective-target tls" directive uses this case to check whether the target supports TLS. How can I configure GCC so that it fails to compile this case, and the dg test framework therefore considers TLS unsupported? Thanks in advance.

BR, Terry
Re: [PLUGIN] Fix PLUGIN_FINISH_TYPE
Romain Geissler a écrit:

> I tried to fix PLUGIN_FINISH_DECL as well to include typedefs in C++.
>
> The following do not currently trigger PLUGIN_FINISH_DECL (or not in
> all cases), but should they?
> - function parameters (in the function prototype)
> - definition (with a function body) of a top-level function (while the
>   exact same function definition enclosed in a class definition will
>   trigger PLUGIN_FINISH_DECL)
> - label declaration
> - constants defined by enums
> - namespace

Indeed, finish_decl is not called in those cases. As to whether the PLUGIN_FINISH_DECL event should be emitted for them, I'd say yes, at least if I believe the description in plugin.def:

  /* After finishing parsing a declaration. */
  DEFEVENT (PLUGIN_FINISH_DECL)

But I'd rather ask what the maintainers think about it. Jason, Diego?

-- Dodji
Re: [PLUGIN] Fix PLUGIN_FINISH_TYPE
On 11-09-22 09:40, Dodji Seketeli wrote:
> Romain Geissler a écrit:
>
>> I tried to fix PLUGIN_FINISH_DECL as well to include typedefs in C++.
>>
>> The following do not currently trigger PLUGIN_FINISH_DECL (or not in
>> all cases), but should they?
>> - function parameters (in the function prototype)
>> - definition (with a function body) of a top-level function (while the
>>   exact same function definition enclosed in a class definition will
>>   trigger PLUGIN_FINISH_DECL)
>> - label declaration
>> - constants defined by enums
>> - namespace
>
> Indeed, finish_decl is not called in those cases. As to whether the
> PLUGIN_FINISH_DECL event should be emitted for them, I'd say yes, at
> least if I believe the description in plugin.def:
>
>   /* After finishing parsing a declaration. */
>   DEFEVENT (PLUGIN_FINISH_DECL)
>
> But I'd rather ask what the maintainers think about it.
>
> Jason, Diego?

Yes, those events should trigger a PLUGIN_FINISH_DECL call.

Diego.
Re: [PLUGIN] Fix PLUGIN_FINISH_TYPE
Le 22 sept. 2011 à 16:18, Diego Novillo a écrit :

> On 11-09-22 09:40, Dodji Seketeli wrote:
>> Romain Geissler a écrit:
>>
>>> I tried to fix PLUGIN_FINISH_DECL as well to include typedefs in C++.
>>>
>>> The following do not currently trigger PLUGIN_FINISH_DECL (or not in
>>> all cases), but should they?
>>> - function parameters (in the function prototype)
>>> - definition (with a function body) of a top-level function (while the
>>>   exact same function definition enclosed in a class definition will
>>>   trigger PLUGIN_FINISH_DECL)
>>> - label declaration
>>> - constants defined by enums
>>> - namespace
>>
>> Indeed, finish_decl is not called in those cases. As to whether the
>> PLUGIN_FINISH_DECL event should be emitted for them, I'd say yes, at
>> least if I believe the description in plugin.def:
>>
>>   /* After finishing parsing a declaration. */
>>   DEFEVENT (PLUGIN_FINISH_DECL)
>>
>> But I'd rather ask what the maintainers think about it.
>>
>> Jason, Diego?
>
> Yes, those events should trigger a PLUGIN_FINISH_DECL call.

OK, I've already implemented it in the C front end. I'll post the whole patch soon.

Romain
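For readers unfamiliar with the event under discussion, here is a minimal sketch (hypothetical and untested, using only documented plugin API calls) of a plugin that registers a PLUGIN_FINISH_DECL callback:

  /* A hypothetical minimal GCC plugin that prints the name of every
     declaration for which the PLUGIN_FINISH_DECL event fires. */
  #include "gcc-plugin.h"
  #include "plugin-version.h"
  #include "tree.h"

  int plugin_is_GPL_compatible;

  /* GCC passes the finished DECL tree as gcc_data. */
  static void
  on_finish_decl (void *gcc_data, void *user_data)
  {
    tree decl = (tree) gcc_data;
    if (DECL_P (decl) && DECL_NAME (decl))
      fprintf (stderr, "finished decl: %s\n",
               IDENTIFIER_POINTER (DECL_NAME (decl)));
  }

  int
  plugin_init (struct plugin_name_args *plugin_info,
               struct plugin_gcc_version *version)
  {
    if (!plugin_default_version_check (version, &gcc_version))
      return 1;
    register_callback (plugin_info->base_name, PLUGIN_FINISH_DECL,
                       on_finish_decl, NULL);
    return 0;
  }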
Re: PowerPC shrink-wrap support 0 of 3
On Thu, Sep 22, 2011 at 12:58:51AM +0930, Alan Modra wrote:
> I spent a little time today looking at why shrink wrap is failing to
> help on PowerPC, and it turns out that the optimization simply doesn't
> trigger that often due to prologue clobbered regs. PowerPC uses r0 as
> a temp in the prologue to save LR to the stack, and unfortunately r0
> seems to often be live across the candidate edge chosen for
> shrink-wrapping, ie. where the prologue will be inserted. I suppose
> it's no surprise that r0 is often live; rs6000.h:REG_ALLOC_ORDER makes
> r0 the first gpr to be used.
>
> As a quick hack, I'm going to try a different REG_ALLOC_ORDER but I
> suspect the real fix will require register renaming in the prologue.

Hi Bernd,

Rearranging the rs6000 register allocation order did in fact help a lot as far as making more opportunities available for shrink-wrap. So did your http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01499.html patch. The two together worked so well that gcc won't bootstrap now. The problem is that shrink wrapping followed by basic block reordering breaks dwarf unwind info, triggering "internal compiler error: in maybe_record_trace_start at dwarf2cfi.c:2243". From your emails on the list, I gather you've seen this yourself.

The bootstrap breakage happens on libmudflap/mf-hooks1.c, compiling __wrap_malloc. Eliding some detail, this function starts off as

  void *__wrap_malloc (size_t c)
  {
    if (__mf_starting_p)
      return __real_malloc (c);

The "if" is bb2, the sibling call bb3, and shrink wrap rather nicely puts the prologue for the rest of the function in bb4. A great example of shrink wrap doing as it should, if you ignore the fact that optimizing for startup isn't so clever. However, bb-reorder inverts the "if" and moves the sibling call past other blocks in the function. That's wrong, because the dwarf unwind info for the prologue is not applicable to the sibling call block: the prologue hasn't been executed for that block. (The unwinder sequentially executes all unwind opcodes from the start of the function to find the unwind state at any instruction address.) Exactly the same sort of problem is generated by your "unconverted_simple_returns" code.

What should I do here? bb-reorder could be disabled for these blocks, but that won't help unconverted_simple_returns. I'm willing to spend some time fixing this, but don't want to start if you already have partial or full solutions.

Another thing I'd like to work on is stopping ifcvt transformations from killing shrink wrap opportunities. We have one in CPU2006 povray Ray_In_Bound that ought to give 5% (figure from shrink wrapping by hand), but currently only gets shrink wrapped with -fno-if-conversion.

--
Alan Modra
Australia Development Lab, IBM
Re: PowerPC shrink-wrap support 0 of 3
On 09/22/11 16:40, Alan Modra wrote:
> The bootstrap breakage happens on libmudflap/mf-hooks1.c, compiling
> __wrap_malloc. Eliding some detail, this function starts off as
>
>   void *__wrap_malloc (size_t c)
>   {
>     if (__mf_starting_p)
>       return __real_malloc (c);
>
> The "if" is bb2, the sibling call bb3, and shrink wrap rather nicely
> puts the prologue for the rest of the function in bb4. A great
> example of shrink wrap doing as it should, if you ignore the fact that
> optimizing for startup isn't so clever. However, bb-reorder inverts
> the "if" and moves the sibling call past other blocks in the function.
> That's wrong, because the dwarf unwind info for the prologue is not
> applicable to the sibling call block: the prologue hasn't been
> executed for that block. (The unwinder sequentially executes all
> unwind opcodes from the start of the function to find the unwind state
> at any instruction address.) Exactly the same sort of problem is
> generated by your "unconverted_simple_returns" code.

dwarf2cfi should be able to figure this out. I'd need to see RTL dumps to get an idea what's going on.

Bernd
Re: I cannot disable GCC TLS support thoroughly.
Terry Guo writes:

> I configured my gcc with "--disable-tls" for arm-none-eabi, but it can
> still successfully compile the case below:
>
>   __thread int i;
>   int f (void) { return i; }
>   void main (int j) { i = j; }
>
> The "dg-require-effective-target tls" directive uses this case to check
> whether the target supports TLS. How can I configure GCC so that it
> fails to compile this case, and the dg test framework therefore
> considers TLS unsupported? Thanks in advance.

When the assembler and/or linker do not support TLS, gcc will emulate it using pthread_key_create. So as far as I know there is no way to thoroughly disable it.

Ian
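For illustration, a rough sketch of how TLS can be emulated on top of pthread_key_create (hypothetical and much simplified; GCC's real emutls runtime is more elaborate than this):

  #include <pthread.h>
  #include <stdlib.h>

  /* Emulated version of "__thread int i": one pthread key per variable,
     with per-thread storage allocated lazily on first access. */
  static pthread_key_t i_key;
  static pthread_once_t i_once = PTHREAD_ONCE_INIT;

  static void
  i_make_key (void)
  {
    pthread_key_create (&i_key, free);  /* free the slot at thread exit */
  }

  /* Every access to the emulated "i" goes through this helper. */
  static int *
  i_addr (void)
  {
    int *p;
    pthread_once (&i_once, i_make_key);
    p = pthread_getspecific (i_key);
    if (p == NULL)
      {
        p = calloc (1, sizeof (int));   /* zero-initialized, like __thread */
        pthread_setspecific (i_key, p);
      }
    return p;
  }

  int f (void) { return *i_addr (); }   /* was: return i; */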
Use of FLAGS_REGNUM clashes with generates insn
Hi,

After the discussion about the use of CCmode in http://gcc.gnu.org/ml/gcc/2011-07/msg00303.html, I am trying to ditch cc0 support and add support for CC_REG. There are two issues making the situation more complicated, both of a similar nature.

My addition instruction sets all the flags, so I have:

  (define_insn "addqi3"
    [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
          (plus:QI (match_operand:QI 1 "nonmemory_operand" "%0")
                   (match_operand:QI 2 "general_operand" "cwmi")))
     (clobber (reg:CC RCC))]
    ""
    "add\\t%0,%f2")

  (define_insn "*addqi3_flags"
    [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
          (plus:QI (match_operand:QI 1 "nonmemory_operand" "%0")
                   (match_operand:QI 2 "general_operand" "cwmi")))
     (set (reg RCC)
          (compare (plus:QI (match_dup 1) (match_dup 2))
                   (const_int 0)))]
    "reload_completed && xap_match_ccmode (insn, CCmode)"
    "add\\t%0,%f2")

There's a problem with this, however. During reload, after register elimination (eliminating the argument pointer in favour of an offset from the stack pointer), GCC tries to output the instruction:

  (set (reg ...) (plus:QI (reg ...) (const_int ...)))

It spills and fails because no rule matches this expression (it's missing the clobber). I fixed this with:

  (define_insn_and_split "addqi3_noclobber"
    [(set (match_operand:QI 0 "register_operand" "=c")
          (plus:QI (match_operand:QI 1 "register_operand")
                   (match_operand:QI 2 "immediate_operand")))]
    "reload_in_progress"
    "#"
    "reload_completed"
    [(set (match_dup 0) (match_dup 1))
     (parallel [(set (match_dup 0)
                     (plus:QI (match_dup 0) (match_dup 2)))
                (clobber (reg:CC RCC))])])

And it works. However, the more complex issue comes with register moves. A register move which ends up as a load or store sets the N and Z flags. I have a movqi expander which expands to a move with the clobber, like so:

  (define_expand "movqi"
    [(parallel [(set (match_operand 0 "nonimmediate_operand")
                     (match_operand 1 "general_operand"))
                (clobber (reg:CC RCC))])]
    ""
  {
    /* One of the ops has to be in a register. */
    if (!register_operand (operands[0], QImode)
        && !(register_operand (operands[1], QImode)
             || const0_rtx == operands[1]))
      operands[1] = copy_to_mode_reg (QImode, operands[1]);
  })

All my (define_insn "*mov...") patterns are likewise tagged with a (clobber (reg:CC RCC)). This generates all kinds of trouble, since GCC internally generates moves without the clobber, and those fail to match. I tried the same trick as above:

  (define_insn_and_split "*movqi_noclobber"
    [(set (match_operand:QI 0 "nonimmediate_operand")
          (match_operand:QI 1 "general_operand"))]
    "!reload_completed"
    "#"
    ""
    [(parallel [(set (match_dup 0) (match_dup 1))
                (clobber (reg:CC RCC))])])

This doesn't fix the problem; it actually produces an internal compiler error. I am definitely not doing this the right way. Any suggestions on how to correctly handle these?

Cheers,

--
PMatos
Re: PowerPC shrink-wrap support 0 of 3
On 09/22/2011 07:47 AM, Bernd Schmidt wrote:
> dwarf2cfi should be able to figure this out. I'd need to see RTL dumps
> to get an idea what's going on.

Indeed. Please CC me, Alan.

r~
Re: GCC 4.7.0 Status Report (2011-09-09)
On 09/09/11 09:09:30, Jakub Jelinek wrote:
> [...] What is the status of lra, reload-2a, pph,
> cilkplus, gupc (I assume at least some of these are 4.8+ material)?

For GUPC, we are targeting GCC 4.8.

thanks,
- Gary
Re: Volatile qualification on pointer and data
On Wed, Sep 21, 2011 at 4:57 PM, Paulo J. Matos wrote:
> On 21/09/11 15:21, David Brown wrote:
>> And since this situation would not occur in real code (at least, not
>> code that is expected to do something useful other than test the
>> compiler's code generation), there is no harm in making sub-optimal
>> object code.
>
> Actually, the reason I noticed this is that one of our engineers told us
> that GCC stopped generating instructions for certain operations when he
> moved from GCC 4.5 to GCC 4.6. This code is real code.

Btw, I think this is an old bug that has been resolved. Did you make sure to test a recent 4.6 branch snapshot or svn head?
gcc-4.5-20110922 is now available
Snapshot gcc-4.5-20110922 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20110922/
and on various mirrors; see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch with the following options:
  svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch revision 179103

You'll find:

  gcc-4.5-20110922.tar.bz2     Complete GCC
    MD5=f733dae8d28bb3f6159a0a39baa59647
    SHA1=ca98324713ccb3b1cae2d4de7fdebef9aa7c2090

Diffs from 4.5-20110915 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: RFC: Improving support for known testsuite failures
On Thu, 8 Sep 2011, Diego Novillo wrote:
> On Thu, Sep 8, 2011 at 04:31, Richard Guenther wrote:
>
>> I think it would be more useful to have a script parse gcc-testresults@
>> postings from the various autotesters and produce a nice webpage
>> with revisions and known FAIL/XPASSes for the target triplets that
>> are tested.
>
> Sure, though that describes a different tool. I'm after a tool that
> will 'exit 0' if the testsuite finished with nominal results.

Not to stop you from (partly) reinventing the wheel, but that's pretty much what contrib/regression/btest-gcc.sh already does, though you have to feed it a baseline and a set of processed .sum files, which could (for a calling script or a modified btest-gcc.sh) live in, say, contrib/target-results/. It handles "duplicate" test names by marking a test as failing if any instance of it has failed. That works well enough.

brgds, H-P
RFC: DWARF Extensions for Separate Debug Info Files ("Fission")
At Google, we've found that the cost of linking applications with debug info is much too high. A large C++ application that might be, say, 200MB without debug info is somewhere around 1GB with debug info, and the total size of the object files that we send to the linker is around 5GB (and that's with compressed debug sections). We've come to the conclusion that the most promising solution is to eliminate the debug info from the link step. I've had direct experience with HP's approach to this, and I've looked into Sun's and Apple's approaches, but none of those three approaches actually separates the debug info from the non-debug info at the object file (.o) level.

I know we're not alone in having concerns about the size of debug info, so we've developed the following proposal to extend the DWARF format and produce separate .o and ".dwo" (DWARF object) files at the compilation step. Our plan is to develop the gcc and gdb changes on new upstream branches. After we get the basics working and have some results to show (assuming it all works out and proves worthwhile), I'll develop this into a formal proposal to the DWARF committee.

I've also posted this proposal on the GCC wiki:
http://gcc.gnu.org/wiki/DebugFission

We've named the project "Fission." I'd appreciate any comments.

-cary

DWARF Extensions for Separate Debug Information Files
September 22, 2011

Problems with Size of the Debug Information
===========================================

Large applications compiled with debug information experience slow link times, possible out-of-memory conditions at link time, and slow gdb startup times. In addition, they can contribute to significant increases in storage requirements, and additional network latency when transferring files in a distributed build environment.

* Out-of-memory conditions: When the total size of the input files is large, the linker may exceed its total memory allocation during the link and may get killed by the operating system. As a rule of thumb, the link job's total memory requirements can be estimated at about 200% of the total size of its input files.

* Slow link times: Link times can be frustrating when recompiling only a small source file or two. Link times may be aggravated when linking on a machine that has insufficient RAM, resulting in excessive page thrashing.

* Slow gdb startup times: The debugger today performs a partial scan of the debug information in order to build its internal tables that allow it to map names and addresses to the debug information. This partial scan was designed to improve startup performance, and avoids a full scan of the debug information, but for large applications, it can still take a minute or more before the debugger is ready for the first command. The debugger now has the ability to save a ".gdb_index" section in the executable, and the gold linker now supports a --gdb-index option to build this index at link time, but both of these options still require the initial partial scan of the debug information.

These conditions are largely a direct result of the amount of debug information generated by the compiler. In a large C++ application compiled with -O2 and -g, the debug information accounts for 87% of the total size of the object files sent as inputs to the link step, and 84% of the total size of the output binary.

Recently, the -Wa,--compress-debug-sections option has been made available.
This option reduces the total size of the object files sent to the linker by more than a third, so that the debug information now accounts for 70-80% of the total size of the object files. The output file is unaffected: the linker decompresses the debug information in order to link it, and outputs the uncompressed result (there is an option to recompress the debug information at link time, but this step would only reduce the size of the output file without improving link time or memory usage).

What's All That Space Being Used For?
=====================================

The debugging information in the relocatable object files sent to the linker consists of a number of separate tables (percentages are for uncompressed debug information relative to the total object file size):

* Debug Information Entries - .debug_info (11%): This table contains the debug info for subprograms and variables defined in the program, and many of the trivial types used.

* Type Units - .debug_types (12%): This table contains the debug info for most of the non-trivial types (e.g., structs and classes, enums, typedefs), keyed by a hashed type signature so that duplicate type definitions can be eliminated by the linker. During the link, about 85% of this data is discarded as duplicate. These sections have the same structure as the .debug_info sections.

* Strings - .debug_str (25%): This table contains strings that are not placed inline in the .debug_info and .debug_types sections. The linker merges the string tables, discarding duplicate strings.
Re: RFC: DWARF Extensions for Separate Debug Info Files ("Fission")
Hi Cary, just one quick clarification -

On Sep 22, 2011, at 5:21 PM, Cary Coutant wrote:
> Previous Implementations of Separate Debug Information
> ======================================================
>
> In the Sun and HP implementations, the debug information in the
> relocatable objects still requires relocation at debug time, and
> the debugger must read the summary information from the
> executable file in order to map symbols and sections to the
> output file when processing and applying the relocations. The
> Apple implementation avoids this cost at debug time, but at the
> cost of having a separate link step for the debug information.

The Apple approach has both the features of the Sun/HP implementation and the ability to create a standalone debug info file. The compiler puts DWARF in the .o file, and the linker adds some records to the executable which help us understand where files/functions/symbols landed in the final executable[1]. If the user runs our gdb or lldb on one of these binaries, the debugger will read the DWARF directly out of the .o files on the fly. Because the linker doesn't need to copy around/update/modify the DWARF, link times are very fast. If the developer decides to debug the program, no extra steps are required - the debugger can be started up and used with the debug info still in the .o files.

Clearly this is only viable if you have the .o files on your computer, so we added a command, "dsymutil", which links the DWARF from the .o files into a single standalone ".dSYM" file. The executable file and the dSYM file share a 128-bit number to ensure that the debug info and the executable match; the debugger will ignore a dSYM with a non-matching UUID for a given executable. A developer will typically create a dSYM when they are sending a copy of the binary to someone and want to provide debug information, when they are archiving a released binary, or when they want to debug it on another machine (where the .o files will not be in place). In practice people create dSYMs rarely - when they are doing iterative development on their computer, all of the DWARF sits in the .o files unprocessed unless they launch a debugger, and link times are fast.

As a minor detail, the dSYM is just another executable binary image on our system (Mach-O file format), sans any of the text or data of the real binary file, with only the debug_info, etc. sections. The name "dSYM" was a little joke based on the CodeWarrior "xSYM" debug info format.

J
Re: RFC: DWARF Extensions for Separate Debug Info Files ("Fission")
On Thu, Sep 22, 2011 at 6:35 PM, Jason Molenda wrote:
> Because the linker doesn't need to copy around/update/modify the DWARF,
> link times are very fast.

AFAIU, the link times are fast only if all the files are local to the developer's machine. They will not be fast (and the .o files *will* need to be copied) if a distributed compilation system (a build farm) is used (as is the case here).

Thanks,
--
Paul Pluzhnikov