[RFA] dwarf2out.c:eliminate_regs() bug
I'm investigating an ICE on the m68k architecture. I'm not quite sure what the right way to fix the bug is, so I welcome any feedback on the analysis below.

Compilation fails on the assert in dwarf2out.c:based_loc_descr():

  /* We only use "frame base" when we're sure we're talking about the
     post-prologue local stack frame.  We do this by *not* running
     register elimination until this point, and recognizing the special
     argument pointer and soft frame pointer rtx's.  */
  if (reg == arg_pointer_rtx || reg == frame_pointer_rtx)
    {
      rtx elim = eliminate_regs (reg, VOIDmode, NULL_RTX);

      if (elim != reg)
        {
          if (GET_CODE (elim) == PLUS)
            {
              offset += INTVAL (XEXP (elim, 1));
              elim = XEXP (elim, 0);
            }
          gcc_assert ((SUPPORTS_STACK_ALIGNMENT
                       && (elim == hard_frame_pointer_rtx
                           || elim == stack_pointer_rtx))
                      || elim == (frame_pointer_needed
                                  ? hard_frame_pointer_rtx
                                  : stack_pointer_rtx));

This code uses eliminate_regs(), which implicitly assumes reload_completed because it uses reg_eliminate[], which in turn assumes that frame_pointer_needed is properly set; that happens in ira.c. However, in some cases this piece of based_loc_descr() can be reached during the inlining pass (see backtrace below). When called before reload, eliminate_regs() may return an inconsistent result, which is why the assert in based_loc_descr() fails. In the particular testcase I'm investigating, frame_pointer_needed is 0 (initial value), but eliminate_regs returns stack_pointer_rtx because it is guided by reg_eliminate information from the previous function, which had frame_pointer_needed set to 1.

Now, how do we fix this? For starters, it seems to be a good idea to assert (reload_in_progress | reload_completed) in eliminate_regs. Then, there are users of eliminate_regs in dbxout.c, dwarf2out.c, and sdbout.c, not counting reload and machine-specific parts. Of the 3 *out.c backends, only dwarf2out.c handles abstract functions, which is what causes it to be called before reload afaik, so the task seems to be fixing the dwarf2out code.
There are two references to eliminate_regs in dwarf2out. The first reference -- in based_loc_descr -- can *probably* be handled by adding reload_completed to the 'if' condition. The second is in compute_frame_pointer_to_fb_displacement. I'm no expert in dwarf2out.c code, but from the looks of it, it seems that compute_..._displacement is only called after reload, so a simple gcc_assert (reload_completed) may be enough there.

One last note, I'm investigating this bug against 4.4 branch as it doesn't trigger on the mainline. Progression search on the mainline showed that failure became latent after this patch (http://gcc.gnu.org/viewcvs?view=revision&revision=147436) to inlining heuristics.

--
Maxim K.
CodeSourcery

The backtrace:
#0  eliminate_regs_1 (x=0xf7d60280, mem_mode=VOIDmode, insn=0x0, may_use_invariant=0 '\0') at gcc/reload1.c:2481
#1  0x0839e9b1 in eliminate_regs (x=0xf7d60280, mem_mode=VOIDmode, insn=0x0) at gcc/reload1.c:2870
#2  0x0821cf66 in based_loc_descr (reg=0xf7d60280, offset=8, initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:9868
#3  0x0821d7a7 in mem_loc_descriptor (rtl=0xf700bd98, mode=SImode, initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10158
#4  0x0821dd55 in loc_descriptor (rtl=0xf700bc90, initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10330
#5  0x0821ddde in loc_descriptor (rtl=0xf700d7a0, initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10349
#6  0x082205d6 in add_location_or_const_value_attribute (die=0xf702ad20, decl=0xf73922d0, attr=DW_AT_location) at gcc/dwarf2out.c:11841
#7  0x08223412 in gen_formal_parameter_die (node=0x0, origin=0xf73922d0, context_die=0xf702ace8) at gcc/dwarf2out.c:13349
#8  0x082273c6 in gen_decl_die (decl=0x0, origin=0xf73922d0, context_die=0xf702ace8) at gcc/dwarf2out.c:15388
#9  0x082268aa in process_scope_var (stmt=0xf7163620, decl=0x0, origin=0xf73922d0, context_die=0xf702ace8) at gcc/dwarf2out.c:14969
#10 0x0822698d in decls_for_scope (stmt=0xf7163620, context_die=0xf702ace8, depth=5) at gcc/dwarf2out.c:14993
#11 0x08225192 in gen_lexical_block_die (stmt=0xf7163620, context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14266
#12 0x082253b5 in gen_inlined_subroutine_die (stmt=0xf7163620, context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14308
#13 0x08226711 in gen_block_die (stmt=0xf7163620, context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14935
#14 0x082269ee in decls_for_scope (stmt=0xf7163038, context_die=0xf702a498, depth=4) at gcc/dwarf2out.c:15005
#15 0x08225192 in gen_lexical_block_die (stmt=0xf7163038, context_die=0xf7026f18, depth=4) at gcc/dwarf2out.c:14266
#16 0x0822672c in gen_block_die (stmt=0xf7163038, c
Re: question about speculative scheduling in gcc
Amker.Cheng wrote:
> Hi :
> I'm puzzled when looking into speculative scheduling in gcc, the 4.2.4
> version.
>
> First, I noticed the document describing IBM haifa instruction
> scheduler(as PowerPC Reference Compiler Optimization Project).
>
> It presents that the instruction motion from bb s(dominated by t)
> to t is speculative when split_blocks(s, t) not empty.
>
> Second, There is SCED_FLAGS like DO_SPECULATION in codes.

These are two different types of speculative optimizations.

> Here goes questions.
> 1, Does the DO_SPECULATION flag control whether do the
> mentioned speculative motion or not?

The DO_SPECULATION flag controls generation of IA64 data and control speculative instructions. It is not used on other architectures.

Speculative instruction moves from the split blocks are controlled by flag_schedule_speculative.

--
Maxim
Re: question about speculative scheduling in gcc
On Sun, Sep 20, 2009 at 3:43 PM, Maxim Kuvyrkov wrote: > Amker.Cheng wrote: >> >> Hi : >> I'm puzzled when looking into speculative scheduling in gcc, the 4.2.4 >> version. >> >> First, I noticed the document describing IBM haifa instruction >> scheduler(as PowerPC Reference Compiler Optimization Project). >> >> It presents that the instruction motion from bb s(dominated by t) >> to t is speculative when split_blocks(s, t) not empty. >> >> Second, There is SCED_FLAGS like DO_SPECULATION in codes. > > These are two different types of speculative optimizations. > >> >> Here goes questions. >> 1, Does the DO_SPECULATION flag constrol whether do the >> mentioned speculative motion or not? > > DO_SPECULATION flag controls generation of IA64 data and control speculative > instructions. It is not used on other architectures. > > Speculative instruction moves from the split blocks are controlled by > flag_schedule_speculative. > > -- > Maxim > Yes! I've just found it's used for IA64 and was merged into gcc in version 4.2.0. Thanks. -- Best Regards.
Re: GCC 4.5 Status Report (2009-09-19)
On Sun, 20 Sep 2009, Dave Korn wrote:

> Richard Guenther wrote:
>
> > The trunk is in Stage 1. Stage 1 will end on Sep 30th. After Stage 1
> > Stage 3 follows with only bugfixes and no new features allowed.
> > Stage 3 will end Nov 30th.

I'll answer your and Jack's questions together.

> I don't think this is the best time to do that. Trunk's been broken most of
> last week and will probably not be buildable for at least several days yet, so
> a big chunk of that warning period is going to be useless to many people.

The effective useless period is extended by not including the 2-3 days necessary for the LTO merge in Stage 1.

> > We've been accumulating quite a number of P1 bugs. Entering Stage 3
> > should allow to improve considerably here in a short time.
>
> So aren't we now likely to lose the first few days of what little remains of
> stage 1 waiting for trunk to start working again, then have a mad rush of
> people falling all over each other to get their new features in in the last
> couple of days? One of which will inevitably break trunk again and block all
> the others and then stage 1 will be over and it'll all be too late?

I am not aware of any big patches that are still pending. Coming up with new, yet unknown things now wouldn't be good timing anyway.

> Ten days isn't even that much warning in the first place; it's less time
> than you'd generally let elapse before pinging a patch. Can I at least raise
> the suggestion that a plan that might work well would be for us to go slush
> for however long - less than a week, I'd suppose - it takes us to get trunk
> really properly stable, and then push back the end of stage 1 by that amount?

Note that Stage 3 isn't as strict as it may sound. Maintainers have quite an amount of flexibility in deciding what is considered a bug and thus a bugfix during Stage 3 (note that Stage 3 is _not_ only for regression fixes). This obviously includes Graphite and LTO as well as target-specific changes.

What you won't see in Stage 3 is rewrites of infrastructure or the addition of new optimization passes. Note that we have been in Stage 1 for about six months now, which is IMHO enough (and unsurprisingly matches the length of Stage 1 of GCC 4.4).

Richard.
Re: [RFA] dwarf2out.c:eliminate_regs() bug
On Sun, Sep 20, 2009 at 9:38 AM, Maxim Kuvyrkov wrote: > I'm investigating an ICE on m68k architecture. I'm not quite sure what is > the right way to fix the bug so I welcome any feedback on the below > analysis. > > Compilation fails on the assert in dwarf2out.c:based_loc_descr(): > > /* We only use "frame base" when we're sure we're talking about the > post-prologue local stack frame. We do this by *not* running > register elimination until this point, and recognizing the special > argument pointer and soft frame pointer rtx's. */ > if (reg == arg_pointer_rtx || reg == frame_pointer_rtx) > { > rtx elim = eliminate_regs (reg, VOIDmode, NULL_RTX); > > if (elim != reg) > { > if (GET_CODE (elim) == PLUS) > { > offset += INTVAL (XEXP (elim, 1)); > elim = XEXP (elim, 0); > } > gcc_assert ((SUPPORTS_STACK_ALIGNMENT > && (elim == hard_frame_pointer_rtx > || elim == stack_pointer_rtx)) > || elim == (frame_pointer_needed > ? hard_frame_pointer_rtx > : stack_pointer_rtx)); > > > This code uses eliminate_regs(), which implicitly assumes reload_completed > as it uses reg_eliminate[], which assumes that frame_pointer_needed is > properly set, which happens in ira.c. However, in some cases this piece of > based_loc_descr() can be reached during inlining pass (see backtrace below). > When called before reload, eliminate_regs() may return an inconsistent > result which is why the assert in based_loc_descr() fails. In the > particular testcase I'm investigating, frame_pointer_needed is 0 (initial > value), but eliminate_regs returns stack_pointer_rtx because it is guided by > reg_eliminate information from the previous function which had > frame_pointer_needed set to 1. > > Now, how do we fix this? For starters, it seems to be a good idea to assert > (reload_in_progress | reload_completed) in eliminate_regs. Then, there are > users of eliminate_regs in dbxout.c, dwarf2out.c, and sdbout.c not counting > reload and machine-specific parts. 
From the 3 *out.c backends, only > dwarf2out.c handles abstract functions, which is what causing it to be > called before reload afaik, so the task seems to be in fixing the dwarf2out > code. > > There are two references to eliminate_regs in dwarf2out. The first > reference -- in based_loc_descr -- can *probably* be handled by adding > reload_completed to the 'if' condition. The second is in > compute_frame_pointer_to_fb_displacement. I'm no expert in dwarf2out.c > code, but from the looks of it, it seems that compute_..._displacement is > only called after reload, so a simple gcc_assert (reload_completed) may be > enough there. > > One last note, I'm investigating this bug against 4.4 branch as it doesn't > trigger on the mainline. Progression search on the mainline showed that > failure became latent after this patch > (http://gcc.gnu.org/viewcvs?view=revision&revision=147436) to inlining > heuristics. I think you should avoid calling eliminate_regs for DECL_ABSTRACT current_function_decl. That should cover the inliner path. Richard. > -- > Maxim K. 
> CodeSourcery > > The backtrace: > #0 eliminate_regs_1 (x=0xf7d60280, mem_mode=VOIDmode, insn=0x0, > may_use_invariant=0 '\0') at gcc/reload1.c:2481 > #1 0x0839e9b1 in eliminate_regs (x=0xf7d60280, mem_mode=VOIDmode, insn=0x0) > at gcc/reload1.c:2870 > #2 0x0821cf66 in based_loc_descr (reg=0xf7d60280, offset=8, > initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:9868 > #3 0x0821d7a7 in mem_loc_descriptor (rtl=0xf700bd98, mode=SImode, > initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10158 > #4 0x0821dd55 in loc_descriptor (rtl=0xf700bc90, > initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10330 > #5 0x0821ddde in loc_descriptor (rtl=0xf700d7a0, > initialized=VAR_INIT_STATUS_INITIALIZED) at gcc/dwarf2out.c:10349 > #6 0x082205d6 in add_location_or_const_value_attribute (die=0xf702ad20, > decl=0xf73922d0, attr=DW_AT_location) at gcc/dwarf2out.c:11841 > #7 0x08223412 in gen_formal_parameter_die (node=0x0, origin=0xf73922d0, > context_die=0xf702ace8) at gcc/dwarf2out.c:13349 > #8 0x082273c6 in gen_decl_die (decl=0x0, origin=0xf73922d0, > context_die=0xf702ace8) at gcc/dwarf2out.c:15388 > #9 0x082268aa in process_scope_var (stmt=0xf7163620, decl=0x0, > origin=0xf73922d0, context_die=0xf702ace8) at gcc/dwarf2out.c:14969 > #10 0x0822698d in decls_for_scope (stmt=0xf7163620, context_die=0xf702ace8, > depth=5) at gcc/dwarf2out.c:14993 > #11 0x08225192 in gen_lexical_block_die (stmt=0xf7163620, > context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14266 > #12 0x082253b5 in gen_inlined_subroutine_die (stmt=0xf7163620, > context_die=0xf702a498, depth=5) at gcc/dwarf2out.c:14308 > #13 0x08226711 in gen_block_die (stmt=0xf7163620, context_die=0xf702a498, > depth=5
C++: variable length arrays and operator new[]
G++ currently accepts the following code:

  char *
  alloc(unsigned a, unsigned b)
  {
    typedef char array[a];
    return &**(new array[b]);
  }

Is this intentional? The equivalent "new char[a][b]" is rejected (as required by the C++ standard).
Re: [RFA] dwarf2out.c:eliminate_regs() bug
Richard Guenther wrote:
> On Sun, Sep 20, 2009 at 9:38 AM, Maxim Kuvyrkov wrote:
...
>> This code uses eliminate_regs(), which implicitly assumes reload_completed
>> as it uses reg_eliminate[], which assumes that frame_pointer_needed is
>> properly set, which happens in ira.c. However, in some cases this piece of
>> based_loc_descr() can be reached during inlining pass (see backtrace below).
>> When called before reload, eliminate_regs() may return an inconsistent
>> result which is why the assert in based_loc_descr() fails. In the
>> particular testcase I'm investigating, frame_pointer_needed is 0 (initial
>> value), but eliminate_regs returns stack_pointer_rtx because it is guided by
>> reg_eliminate information from the previous function which had
>> frame_pointer_needed set to 1.
...
> I think you should avoid calling eliminate_regs for DECL_ABSTRACT
> current_function_decl. That should cover the inliner path.

Thanks for the insight. Do you mean something like the attached patch?

--
Maxim

Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c (revision 261914)
+++ gcc/dwarf2out.c (working copy)
@@ -9862,8 +9862,11 @@ based_loc_descr (rtx reg, HOST_WIDE_INT
   /* We only use "frame base" when we're sure we're talking about the
      post-prologue local stack frame.  We do this by *not* running
      register elimination until this point, and recognizing the special
-     argument pointer and soft frame pointer rtx's.  */
-  if (reg == arg_pointer_rtx || reg == frame_pointer_rtx)
+     argument pointer and soft frame pointer rtx's.
+     We might get here during the inlining pass (DECL_ABSTRACT is true then),
+     so don't try eliminating registers in such a case.  */
+  if (!DECL_ABSTRACT (current_function_decl)
+      && (reg == arg_pointer_rtx || reg == frame_pointer_rtx))
     {
       rtx elim = eliminate_regs (reg, VOIDmode, NULL_RTX);
 
@@ -12224,6 +12227,9 @@ compute_frame_pointer_to_fb_displacement
   offset += ARG_POINTER_CFA_OFFSET (current_function_decl);
 #endif
 
+  /* Make sure we don't try eliminating registers in abstract function.  */
+  gcc_assert (!DECL_ABSTRACT (current_function_decl));
+
   elim = eliminate_regs (reg, VOIDmode, NULL_RTX);
   if (GET_CODE (elim) == PLUS)
     {
Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c (revision 261914)
+++ gcc/reload1.c (working copy)
@@ -2867,6 +2867,7 @@ eliminate_regs_1 (rtx x, enum machine_mo
 rtx
 eliminate_regs (rtx x, enum machine_mode mem_mode, rtx insn)
 {
+  gcc_assert (reload_in_progress || reload_completed);
   return eliminate_regs_1 (x, mem_mode, insn, false);
 }
Re: [RFA] dwarf2out.c:eliminate_regs() bug
On Sun, Sep 20, 2009 at 1:18 PM, Maxim Kuvyrkov wrote:
> Richard Guenther wrote:
>>
>> On Sun, Sep 20, 2009 at 9:38 AM, Maxim Kuvyrkov wrote:
>
> ...
>>>
>>> This code uses eliminate_regs(), which implicitly assumes
>>> reload_completed
>>> as it uses reg_eliminate[], which assumes that frame_pointer_needed is
>>> properly set, which happens in ira.c. However, in some cases this piece
>>> of
>>> based_loc_descr() can be reached during inlining pass (see backtrace
>>> below).
>>> When called before reload, eliminate_regs() may return an inconsistent
>>> result which is why the assert in based_loc_descr() fails. In the
>>> particular testcase I'm investigating, frame_pointer_needed is 0 (initial
>>> value), but eliminate_regs returns stack_pointer_rtx because it is guided
>>> by
>>> reg_eliminate information from the previous function which had
>>> frame_pointer_needed set to 1.
>
> ...
>>
>> I think you should avoid calling eliminate_regs for DECL_ABSTRACT
>> current_function_decl. That should cover the inliner path.
>
> Thanks for the insight. Do you mean something like the attached patch?

Yes, though we should probably try to catch the DECL_ABSTRACT case further up the call chain - there shouldn't be any location lists for abstract function. Thus, see why

  static dw_die_ref
  gen_formal_parameter_die (tree node, tree origin, dw_die_ref context_die)
  ...
      if (! DECL_ABSTRACT (node_or_origin))
        add_location_or_const_value_attribute (parm_die, node_or_origin,
                                               DW_AT_location);

the node_or_origin of the param isn't DECL_ABSTRACT. In the end the above check should have avoided the situation you run into.

Richard.
Re: GCC 4.5 Status Report (2009-09-19)
Richard Guenther wrote: > Note that Stage 3 isn't that strict as it may sound. Maintainers have > quite amount of flexibility deciding what is considered a bug and thus > a bugfix during Stage 3 (note that Stage3 is _not_ only for regression > fixes). This includes obviously Graphite and LTO as well as target > specific changes. > > What you won't see in Stage 3 is rewrites of infrastructure or adding of > new optimization passes. Thanks Richard, that's pretty reassuring. BTW, why don't we call this more-flexible-stage-3 "stage 2" any more? It sounds a lot like the way that's still described on develop.html. cheers, DaveK
Re: GCC 4.5 Status Report (2009-09-19)
On Sun, 20 Sep 2009, Dave Korn wrote: > Richard Guenther wrote: > > > Note that Stage 3 isn't that strict as it may sound. Maintainers have > > quite amount of flexibility deciding what is considered a bug and thus > > a bugfix during Stage 3 (note that Stage3 is _not_ only for regression > > fixes). This includes obviously Graphite and LTO as well as target > > specific changes. > > > > What you won't see in Stage 3 is rewrites of infrastructure or adding of > > new optimization passes. > > Thanks Richard, that's pretty reassuring. > > BTW, why don't we call this more-flexible-stage-3 "stage 2" any more? It > sounds a lot like the way that's still described on develop.html. Because "New functionality may not be introduced during this period." is still true for this stage 3 and "support for a new language construct might be added in a front-end" is also not wanted. Richard.
Re: GCC 4.5 Status Report (2009-09-19)
Richard Guenther wrote: > On Sun, 20 Sep 2009, Dave Korn wrote: >> BTW, why don't we call this more-flexible-stage-3 "stage 2" any more? It >> sounds a lot like the way that's still described on develop.html. > > Because "New functionality may not be introduced during this period." is > still true for this stage 3 and "support for a new language construct > might be added in a front-end" is also not wanted. Ah, thanks. I missed the discussion when stage 2 fell out of use, it would be nice if someone who was there at the time added a note to develop.html - is it an ad-hoc thing that we've just done for a couple of releases because it made sense at the time or was there an SC decision to permanently modify the development plan? cheers, DaveK
[PATCH] Adjust develop.html to reflect recent practice
As commented to my last status report develop.html does not reflect reality anymore. The following tries to adjust it carefully in this respect. Ok for www? Thanks, Richard. 2009-09-20 Richard Guenther * develop.html: Adjust to reflect recent practice. Index: develop.html === RCS file: /cvs/gcc/wwwdocs/htdocs/develop.html,v retrieving revision 1.100 diff -u -r1.100 develop.html --- develop.html4 Aug 2009 22:36:39 - 1.100 +++ develop.html20 Sep 2009 12:32:39 - @@ -38,7 +38,7 @@ better serve the user community by making releases somewhat more frequently, and on a consistent schedule. -In addition, a consistent schedule will make it possible for the +In addition, a consistent schedule will make it possible for a Release Manager to better understand his or her time commitment will be when he or she agrees to take the job. @@ -102,37 +102,35 @@ Schedule -Development on our main branch will proceed in three stages. Each -stage will be two months in length. +Development on our main branch will proceed in three stages. Stage 1 During this period, changes of any nature may be made to the compiler. In particular, major changes may be merged from branches. -In order to avoid chaos, the Release Manager will ask for a list of +Stage 1 and its length is feature driven and its length will be +at least four month. +In order to avoid chaos, the Release Managers will ask for a list of major projects proposed for the coming release cycle before the start -of Stage 1. The Release Manager will attempt to sequence the projects -in such a way as to cause minimal disruption. The Release Manager +of Stage 1. The Release Managers will attempt to sequence the projects +in such a way as to cause minimal disruption. The Release Managers will not reject projects that will be ready for inclusion before the -end of Stage 1. Similarly, the Release Manager has no special power -to accept a particular patch or branch beyond what his or her status -as a maintainer affords. 
The Release Manager's role during Stage 1 is +end of Stage 1. Similarly, the Release Managers have no special power +to accept a particular patch or branch beyond what their status +as maintainers affords. The Release Managers role during Stage 1 is merely to attempt to order the inclusion of major features in an organized manner. Stage 2 -During this period, major changes may not be merged from branches. -However, other smaller improvements may be made. For example, support -for a new language construct might be added in a front-end, or support -for a new variant of an existing microprocessor might be added to a -back-end. +Stage 2 has been abandoned in favor of an extended feature driven +Stage 1 since the development of GCC 4.4. Stage 3 -During this period, the only (non-documentation) changes that may be -made are changes that fix bugs or new ports which do not require changes -to other parts of the compiler. +During this two-month period, the only (non-documentation) changes +that may be made are changes that fix bugs or new ports which do not +require changes to other parts of the compiler. New functionality may not be introduced during this period. Rationale @@ -143,13 +141,14 @@ cycle, so that we have time to fix any problems that result. In order to reach higher standards of quality, we must focus on -fixing bugs; by working exclusively on bug-fixing through Stage 3, we -will have a higher quality source base as we prepare for a release. +fixing bugs; by working exclusively on bug-fixing through Stage 3 +and before branching for the release, we will have a higher quality +source base as we prepare for a release. Although maintaining a development branch, including merging new changes from the mainline, is somewhat burdensome, the absolute worst -case is that such a branch will have to be maintained for four months. 
-During two of those months, the only mainline changes will be bug-fixes, +case is that such a branch will have to be maintained for six months. +During this period, the only mainline changes will be bug-fixes, so it is unlikely that many conflicts will occur. @@ -203,20 +202,17 @@ Schedule -At the conclusion of Stage 3, a release branch will be created. - -On the release branch, the focus will be fixing any regressions +At the conclusion of Stage 3, the trunk will go into release branch +mode which allows documentation and regression fixes only. +During this phase, the focus will be fixing any regressions from the previous release, so that each release is better than the one before. -The release will occur two months after the creation of the branch. -(Stage 1 of the next release cycle will occur in parallel.) If, -however, support for an important platform has regressed significantly -from the previous release or support for a platform with active -maintenance has regres
Bitfields
I wonder if there would be at least theoretical support from the developers for a proposal concerning volatile bitfields:

When a HW register (thus most likely declared as volatile) is defined as a bitfield, as far as I know gcc treats each bitfield assignment as a separate read-modify-write operation. That is, if I have a 32-bit register with 3 fields

  struct s_hw_reg
  {
    int field1 : 10,
        field2 : 10,
        field3 : 12;
  };

then

  reg.field1 = val1;
  reg.field2 = val2;

will be turned into a fetch, mask, or with val1, store, fetch, mask, or with val2, store sequence. I wonder if there could be a special gcc extension, strictly only when a -f option is explicitly passed to the compiler, where the comma operator could be used to tell the compiler to concatenate the operations:

  reg.field1 = val1, reg.field2 = val2;

would then turn into fetch, mask with a combined mask of field1 and field2, or val1, or val2, store.

Since the bit field operations cannot be concatenated that way currently, and quite frequently you want to change multiple fields in a HW register simultaneously (i.e. with a single write), more often than not you have to give up the whole bit field notion and define everything like

  #define MASK1 0xffc0
  #define MASK2 0x003ff000
  #define MASK3 0x0fff

and so on, then you explicitly write the code that fetches, masks with a combined mask, ors with a combined field value set and stores.

A lot of typing could be avoided with the bitfields, not to mention that it would be a lot more elegant, if one could somehow coerce the compiler to be a bit more relaxed regarding bitfield access. Actually 'relaxed' is not a good word, because I would not want the compiler to have free rein in the access: if there's a semicolon at the end of the assignment operator expression, then do it bit by bit, adhering to the standard at its strictest. However, the comma operator, and only that operator, and only if both sides of the comma refer to bit fields within the same word, and only if explicitly asked by a command line switch, would tell the compiler to combine the masking and setting operations within a single fetch - store pair.

Is it a completely brain-dead idea?

Zoltan
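
For reference, the hand-written alternative described above (one fetch, one combined mask, both values or-ed in, one store) could look roughly like the sketch below. The bit positions, the macro names and the helper set_field1_field2 are assumptions made for illustration only; they follow the 10/10/12-bit example, but the real placement of the fields in a register depends on the hardware and on the ABI.

```c
#include <stdint.h>

/* Hypothetical layout for the 32-bit register of the example:
   field1 = bits 31..22, field2 = bits 21..12, field3 = bits 11..0.  */
#define FIELD1_SHIFT 22
#define FIELD2_SHIFT 12
#define FIELD1_MASK  (UINT32_C (0x3ff) << FIELD1_SHIFT)
#define FIELD2_MASK  (UINT32_C (0x3ff) << FIELD2_SHIFT)

/* Update field1 and field2 with a single read and a single write of
   the register; field3 is left untouched.  */
static inline void
set_field1_field2 (volatile uint32_t *reg, uint32_t val1, uint32_t val2)
{
  uint32_t tmp = *reg;                          /* single fetch */
  tmp &= ~(FIELD1_MASK | FIELD2_MASK);          /* combined mask */
  tmp |= (val1 << FIELD1_SHIFT) & FIELD1_MASK;  /* or in val1 */
  tmp |= (val2 << FIELD2_SHIFT) & FIELD2_MASK;  /* or in val2 */
  *reg = tmp;                                   /* single store */
}
```

This is the amount of boilerplate per register that the proposed comma-operator extension is meant to avoid.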
what does the calling for min_insn_conflict_delay mean
Hi : In function new_ready, min_insn_conflict_delay is called as "min_insn_conflict_delay (curr_state, next, next)". But the function's comment says that it returns the minimal delay of issue of the 2nd insn after issuing the 1st in the given state. Why are the last two arguments of the call both "next"? That seems to conflict with the comment. Thanks. -- Best Regards.
Re: [PATCH] Adjust develop.html to reflect recent practice
Richard Guenther wrote: > * develop.html: Adjust to reflect recent practice. TYVM :) cheers, DaveK
Re: Bitfields
> reg.field1 = val1, reg.field2 = val2;
>
> would then turn into fetch, mask with a combined mask of field1 and
> field2, or val1, or val2, store.

You can also do the RMW yourself: declare the register volatile, but not the fields of it, and copy in/out of the register manually.

  volatile struct reg x;
  ...
  {
    struct reg mine = x;
    mine.field1 = true;
    mine.field2 = 0;
    mine.field3++;
    x = mine;
  }

> Is it a completely brain-dead idea?

If I understood it correctly, it would not be standard compliant.

Paolo
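
A self-contained version of the copy-in/copy-out approach might look like the sketch below. The struct layout is taken from the example earlier in the thread; the helper name update_reg and the use of unsigned fields are assumptions for illustration. Whether the two whole-struct copies compile to exactly one word-sized load and one word-sized store is up to the compiler and target, so this is not a guarantee about the generated accesses.

```c
/* Register layout from the example in this thread (unsigned fields
   are an assumption made here to keep the values well-defined).  */
struct reg
{
  unsigned int field1 : 10;
  unsigned int field2 : 10;
  unsigned int field3 : 12;
};

/* Read the volatile register once into a private copy, modify the
   copy with ordinary bit-field assignments, then write it back once.  */
static void
update_reg (volatile struct reg *hw, unsigned int val1, unsigned int val2)
{
  struct reg mine = *hw;  /* one read of the whole register */
  mine.field1 = val1;     /* non-volatile updates on the copy */
  mine.field2 = val2;
  *hw = mine;             /* one write of the whole register */
}
```

Only the register object is volatile here; the local copy is not, so the compiler is free to merge the field updates on it.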
Re: Bitfields
Paolo Bonzini wrote: Is it a completely brain-dead idea? If I understood it correctly, it would not be standard compliant. But it's an extension, so I don't see that is an issue of itself. Paolo
Postreload and STRICT_LOW_PART
Why is postreload converting (set (REGX) (CONST_INT A)) ... (set (REGX) (CONST_INT B)) into (set (STRICT_LOW_PART (REGX)) (CONST_INT B))? That looks like a pessimisation especially if the constants are small, since STRICT_LOW_PART must not touch the high part. Is there a way for the backend to stop postreload from doing this if the constants are in some range? On the m68k, loading a constant in the range -128..127 in SI mode is better than loading it in strict QI mode. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."
Re: [PATCH] Adjust develop.html to reflect recent practice
Richard Guenther wrote: > 2009-09-20 Richard Guenther > > * develop.html: Adjust to reflect recent practice. OK. -- Mark Mitchell CodeSourcery m...@codesourcery.com (650) 331-3385 x713
Re: Postreload and STRICT_LOW_PART
On Sun, Sep 20, 2009 at 6:10 PM, Andreas Schwab wrote: > Why is postreload converting (set (REGX) (CONST_INT A)) ... (set (REGX) > (CONST_INT B)) into (set (STRICT_LOW_PART (REGX)) (CONST_INT B))? That > looks like a pessimisation especially if the constants are small, since > STRICT_LOW_PART must not touch the high part. Is there a way for the > backend to stop postreload from doing this if the constants are in some > range? On the m68k, loading a constant in the range -128..127 in SI > mode is better than loading it in strict QI mode. It's probably an omission of a check if A & ~GET_MODE_MASK (narrow_mode) is equal to zero. And of course a cost check is completely missing. Richard. > Andreas. > > -- > Andreas Schwab, sch...@linux-m68k.org > GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 > "And now for something completely different." >
Re: Postreload and STRICT_LOW_PART
On 09/20/2009 06:31 PM, Richard Guenther wrote:
> On Sun, Sep 20, 2009 at 6:10 PM, Andreas Schwab wrote:
>> Why is postreload converting (set (REGX) (CONST_INT A)) ... (set (REGX)
>> (CONST_INT B)) into (set (STRICT_LOW_PART (REGX)) (CONST_INT B))? That
>> looks like a pessimisation especially if the constants are small, since
>> STRICT_LOW_PART must not touch the high part. Is there a way for the
>> backend to stop postreload from doing this if the constants are in some
>> range? On the m68k, loading a constant in the range -128..127 in SI
>> mode is better than loading it in strict QI mode.
>
> It's probably an omission of a check if A & ~GET_MODE_MASK (narrow_mode)
> is equal to zero.

Actually it is (A^B) & ~GET_MODE_MASK(narrow_mode) that has to be 0.

> And of course a cost check is completely missing.

I think that adding a cost check would be the right thing to do. For size, a QImode move is probably better (it is on i386 for example).

Paolo
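
The validity condition mentioned in this reply can be illustrated outside of GCC with plain integers. In the sketch below, narrow_mask stands in for GET_MODE_MASK (narrow_mode), e.g. 0xff for an 8-bit mode, and the function name low_part_store_is_valid is hypothetical; this is not the postreload code, only the arithmetic behind the check.

```c
#include <stdbool.h>
#include <stdint.h>

/* A register currently holds A.  Writing only its low part (the bits
   covered by narrow_mask) can produce B exactly when A and B already
   agree in every bit outside narrow_mask.  */
static bool
low_part_store_is_valid (uint32_t a, uint32_t b, uint32_t narrow_mask)
{
  return ((a ^ b) & ~narrow_mask) == 0;
}
```

For example, with an 8-bit mask, A = 0 and B = 0xffffffff fail the test even though A & ~mask is zero, because the high 24 bits differ. This says nothing about whether the narrow store is cheaper, which is the separate cost question raised above.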
Cannot get Bit test RTL to cooperate with Combine.
All,

I have been debugging the AVR port to see why we fail to match so many bit test opportunities. When dealing with longer modes I have come across a problem I cannot solve.

Expansion in RTL for a bit test can produce two styles.

STYLE 1

The bit to be tested is NOT the LSB (e.g. if ( longthing & 0x10)); the expanded code contains the test as:

  (and:SI (reg:SI 45 [ lx.1 ])
          (const_int 16 [0x10]))

Bit tests are matched by combine. Combine has no problems with this and eventually creates a matching pattern based on the conversion of the AND to a zero extraction:

  (set (pc)
       (if_then_else (ne (zero_extract:SI (subreg:QI (reg:SI 45 [ lx.1 ]) 0)
                                          (const_int 1 [0x1])
                                          (const_int 4 [0x4]))
                         (const_int 0 [0x0]))
                     (label_ref:HI 133)
                     (pc)))

This will match bit test patterns and produces optimal code. :-)

STYLE 2

The bit to be tested is the LSB (e.g. if ( longthing & 1)); the expanded RTL code uses SUBREG to lower the width (apparently from SImode to word size):

  (and:HI (subreg:HI (reg:SI 45 [ lx.1 ]) 0)
          (const_int 1 [0x1]))

This seems to occur regardless of -f(no)split-wide-types for size > HImode (which is the integer mode).

This RTL becomes a problem for combine. Combine uses subst(), combine_simplify_rtx() and eventually simplify_comparison(), where it attempts to WIDEN the AND and take the lowpart:

  gen_lowpart(HImode,
              (and:SI (reg:SI 45 [ lx.1 ])
                      (const_int 1 [0x1])))

However, gen_lowpart_for_combine() FAILS as it will reject taking the lowpart of an SImode expression because size>UNITS_PER_WORD. So no test pattern can be matched. :-(

Style 2 is hugely problematic. The substitution works fine, but the simplification will always fail - making it apparently impossible to create matching patterns for bit tests of the LSB of SImode or DImode values.

Any clues how I might get around this?

Andy
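
A minimal C test case for the two styles might look like the sketch below. The function names are hypothetical, and the assumption is that "longthing" is a 32-bit long (SImode) as on AVR; the RTL actually produced depends on the compiler version and options.

```c
/* Style 1: the tested bit is not the LSB.  */
int
test_bit4 (long longthing)
{
  if (longthing & 0x10)
    return 1;
  return 0;
}

/* Style 2: the tested bit is the LSB.  */
int
test_bit0 (long longthing)
{
  if (longthing & 1)
    return 1;
  return 0;
}
```

Comparing the combine dumps of the two functions should show whether the LSB case still ends up as a narrowed (and:HI (subreg:HI ...)) test.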
Re: [PATCH] Adjust develop.html to reflect recent practice
Richard Guenther wrote: As commented to my last status report develop.html does not reflect reality anymore. The following tries to adjust it carefully in this respect. Schedule -Development on our main branch will proceed in three stages. Each -stage will be two months in length. +Development on our main branch will proceed in three stages. Just a minor tweak... Since there are only effectively two stages, wouldn't it be better to state two stages here ? Theo.
Re: [PATCH] Adjust develop.html to reflect recent practice
On Sun, 20 Sep 2009, Theodore Papadopoulo wrote: > Richard Guenther wrote: > > As commented to my last status report develop.html does not reflect > > reality anymore. The following tries to adjust it carefully in > > this respect. > > > > Schedule > > -Development on our main branch will proceed in three stages. Each > > -stage will be two months in length. > > +Development on our main branch will proceed in three stages. > Just a minor tweak... > Since there are only effectively two stages, wouldn't it be better to state > two stages here ? Effectively after leaving Stage 3 we enter a "Stage 4" before the release branch is created. So all, two, three and four would be in some way correct ;) Thus I didn't bother to change this detail. Richard.
Re: Bitfields
On Sun, 20 Sep 2009, Zoltán Kócsi wrote: > I wonder if there would be at least a theoretical support by the > developers to a proposal for volatile bitfields: It has been proposed (and not rejected, but not yet implemented) that volatile bit-fields should follow the ARM EABI specification (on all targets); that certainly seems better than inventing something new unless you have a very good reason to prefer the something new on some targets. -- Joseph S. Myers jos...@codesourcery.com
gcc-4.3-20090920 is now available
Snapshot gcc-4.3-20090920 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20090920/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.3 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_3-branch revision 151908 You'll find: gcc-4.3-20090920.tar.bz2 Complete GCC (includes all of below) gcc-core-4.3-20090920.tar.bz2 C front end and core compiler gcc-ada-4.3-20090920.tar.bz2 Ada front end and runtime gcc-fortran-4.3-20090920.tar.bz2 Fortran front end and runtime gcc-g++-4.3-20090920.tar.bz2 C++ front end and runtime gcc-java-4.3-20090920.tar.bz2 Java front end and runtime gcc-objc-4.3-20090920.tar.bz2 Objective-C front end and runtime gcc-testsuite-4.3-20090920.tar.bz2 The GCC testsuite Diffs from 4.3-20090913 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.3 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: Bitfields
On Sun, 20 Sep 2009, Joseph S. Myers wrote: > On Sun, 20 Sep 2009, Zoltán Kócsi wrote: > > > I wonder if there would be at least a theoretical support by the > > developers to a proposal for volatile bitfields: > > It has been proposed (and not rejected, but not yet implemented) that > volatile bit-fields should follow the ARM EABI specification (on all > targets); that certainly seems better than inventing something new unless > you have a very good reason to prefer the something new on some targets. Yes, that discussion was what got me thinking and suggesting this *before* the ARM EABI gets implemented. I am not suggesting implementing something instead of the ARM EABI; I am suggesting implementing something on top of it. The suggested behaviour is also architecture-neutral. It is nothing more than this: if the user expressly asks the compiler to break the standard in a particular way, then the compiler does so. The breaking of the standard is at one single point. The ARM EABI spec clearly states that bitfield operations are never to be combined, not even in the case where consecutive bitfield assignments refer to bitfields located in the same machine word. My suggestion was that if a new command line switch is present, then in the special case of consecutive bitfield assignments being made to fields within the same word and the assignments being separated by the comma operator, the compiler combines those assignments. The rationale of such behaviour is writing low-level code dealing with HW registers. To have a practical example, let's have a SoC chip with multi-function pins. Let's assume that we have a register that has 2 bits for each actual pin and the value of the 2 bits selects the actual function for the pin; a 32 bit register can thus control 16 pins.
Now if you want to, say, assign 4 pins to the SPI interface, without bitfields you would (and indeed do) write something along these lines: temp = *pin_control_reg; temp &= ~(PIN_03_MASK | PIN_04_MASK | PIN_05_MASK | PIN_06_MASK); temp |= PIN_03_MISO | PIN_04_MOSI | PIN_05_SCLK | PIN_06_SSEL; *pin_control_reg = temp; You can't really use bitfields to achieve the above, because if you write pin_control_reg->pin_03 = MISO; pin_control_reg->pin_04 = MOSI; and so on, pin_xx being 2-bit wide bitfields, then according to the ARM EABI spec each statement would be translated to a temp = *pin_control_reg; temp &=...; temp |=...; *pin_control_reg=temp; sequence. What I suggest is that if you write pin_control_reg->pin_03 = MISO, // Note the comma pin_control_reg->pin_04 = MOSI, pin_control_reg->pin_05 = SCLK, pin_control_reg->pin_06 = SSEL; and compile it with a -fcomma-combines-bitfields switch, then you get the equivalent of the first code fragment where you manually combined the masks and the settings and only a single load and a single store were used. If the switch is not given or the consecutive assignments are not separated by commas or the bitfields do not belong to the same word, then the behaviour falls back to the default ARM EABI spec. The advantage of the suggested behaviour is that it would allow the use of the more elegant and expressive bitfields in place of the many hundreds of #define REGNAME_FIELDNAME_MASK and #define REGNAME_FIELDNAME_SHIFT macros that you can currently find in code that deals with HW. The suggestion does not introduce any new functionality or performance advantage, it just provides a way of writing (in my opinion) more readable and more maintainable code than what we have now with all the #defines. The fact that structure members live in their own namespace as opposed to the global #define namespace is an added benefit, of course.
The suggested extension does not break backward compatibility, because the #define stuff would not be affected and the ARM EABI is not yet implemented anyway; it would not break the expected behaviour because it becomes active only when an explicit command line switch is given and has no side-effects outside the single expression where the subexpressions are separated by commas. The change, I believe, would benefit gcc users who deal with HW a lot, i.e. low level embedded system and device driver designers. Outside of that circle the suggested behavior would have only a little performance benefit. Zoltan
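The two coding styles compared in the message above can be put side by side in a small, self-contained C sketch. Everything concrete in it is an assumption made for illustration: the 2-bit function codes, the placement of pin n at bits 2n and 2n+1 in the mask version, the struct pin_control layout, and the function names. The -fcomma-combines-bitfields switch is only the proposal from this thread, so the bit-field version below is ordinary C whose number of loads and stores is up to the compiler.

```c
#include <stdint.h>

/* Hypothetical 2-bit function codes; real values are chip-specific. */
enum pin_function { GPIO = 0, MISO = 1, MOSI = 2, SCLK = 3, SSEL = 1 };

/* Mask/shift style. Assumed layout: pin n lives in bits 2n+1..2n. */
#define PIN_03_MASK (UINT32_C(3) << 6)
#define PIN_04_MASK (UINT32_C(3) << 8)
#define PIN_05_MASK (UINT32_C(3) << 10)
#define PIN_06_MASK (UINT32_C(3) << 12)
#define PIN_03_MISO ((uint32_t)MISO << 6)
#define PIN_04_MOSI ((uint32_t)MOSI << 8)
#define PIN_05_SCLK ((uint32_t)SCLK << 10)
#define PIN_06_SSEL ((uint32_t)SSEL << 12)

/* Manually combined update: exactly one load and one store. */
void assign_spi_pins_masks(volatile uint32_t *pin_control_reg)
{
    uint32_t temp = *pin_control_reg;
    temp &= ~(PIN_03_MASK | PIN_04_MASK | PIN_05_MASK | PIN_06_MASK);
    temp |= PIN_03_MISO | PIN_04_MOSI | PIN_05_SCLK | PIN_06_SSEL;
    *pin_control_reg = temp;
}

/* Bit-field style: sixteen 2-bit fields. How they map onto the bits
   of the underlying word is implementation-defined. */
struct pin_control {
    unsigned pin_00 : 2, pin_01 : 2, pin_02 : 2, pin_03 : 2,
             pin_04 : 2, pin_05 : 2, pin_06 : 2, pin_07 : 2,
             pin_08 : 2, pin_09 : 2, pin_10 : 2, pin_11 : 2,
             pin_12 : 2, pin_13 : 2, pin_14 : 2, pin_15 : 2;
};

/* The comma-separated form from the proposal. Without the proposed
   switch, each assignment may be its own read-modify-write. */
void assign_spi_pins_bitfields(volatile struct pin_control *pin_control_reg)
{
    pin_control_reg->pin_03 = MISO,   /* note the comma */
    pin_control_reg->pin_04 = MOSI,
    pin_control_reg->pin_05 = SCLK,
    pin_control_reg->pin_06 = SSEL;
}
```

Both functions leave the other twelve pins untouched. The difference the proposal cares about is only in how many accesses to the volatile register the compiler emits for the second one.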