Re: No address_cost calls when inlining ?
> I want the version of foo because the store with an address as
> destination is costly on my architecture, which is why I defined
> TARGET_ADDRESS_COST and added a cost when I get this scenario.
> However, in the compilation of this code, it seems that, when the
> function is inlined, the address_cost function does not seem to be
> called anymore. Any ideas why ?

This is (a variant of) PR33699.

Paolo
variadic templates supported in non-c++0x mode
Consider the one-liner C++ 'program':

    template struct pack;

With the trunk, g++ -c [-std=gnu++98] gives:

    warning: variadic templates only available with -std=c++0x or -std=gnu++0x

Should this not be an *error* instead? Variadic templates really should not
be supported in non-c++0x mode.

If not, how can I explicitly disable support for variadic templates? (The
present behaviour frustrates some of our local autoconf tests.)

Should I file a bug report?

Regards, Jan van Dijk.
Re: variadic templates supported in non-c++0x mode
Hi,

> Consider the one-liner C++ 'program':
>
> template struct pack;
>
> With the trunk, g++ -c [-std=gnu++98] gives:
>
> warning: variadic templates only available with -std=c++0x or -std=gnu++0x
>
> Should this not be an *error* instead? Variadic templates really should
> not be supported in non-c++0x mode.
>
> If not, how can I explicitly disable support for variadic templates? (The
> present behaviour frustrates some of our local autoconf tests.)

I'm afraid the behavior you are seeing is by design, and is not specific to
variadic templates. It also happens, for example, for:

    enum class e { };

But I agree it may cause problems. Before filing a PR, let's ask Jason,
maybe he is willing to review for us the rationale...

Paolo.
Re: [patch][4.5] Make regmove cfglayout-safe
Paolo Bonzini wrote:
>> I also wondered about this. I think the original idea is that splits
>> can call into dojump.c.
>
> A more likely possibility is -fnon-call-exceptions.

Of course this is the main cause. But splitting one jump to multiple jumps
is supported and actually even documented. It will happen for example in
this testcase:

    int f(float x)
    {
      if (x != x)
        return 5;
      else
        abort ();
    }

on i386, which produces:

    fucomip %st(0), %st
    jp .L8
    je .L6

It is possible to change this to an expander in the i386 md of course. I
don't think any other backend is relying on it, but I will make a more
thorough check if I end up submitting something like the attached patch.

Paolo

2009-03-10  Paolo Bonzini

	* lower-subreg.c (decompose_multiword_subregs): Extract code...
	* cfgbuild.c (rtl_split_blocks_for_eh): ... here.
	* basic-block.h (rtl_split_blocks_for_eh): Declare it.
	* recog.c (split_insn): Return bool.  Check that the splitter
	produces no barriers and no labels.
	(split_all_insns): Use the result.  Call rtl_split_blocks_for_eh
	instead of find_many_sub_basic_blocks.
	* reload1.c (fixup_abnormal_edges): Use it.
	* passes.c (init_optimization_passes): Move cfglayout mode
	further down.

Index: gcc/passes.c
===================================================================
--- gcc/passes.c	(branch combine-cfglayout)
+++ gcc/passes.c	(working copy)
@@ -757,8 +757,8 @@ init_optimization_passes (void)
       NEXT_PASS (pass_if_after_combine);
       NEXT_PASS (pass_partition_blocks);
       NEXT_PASS (pass_regmove);
-      NEXT_PASS (pass_outof_cfg_layout_mode);
       NEXT_PASS (pass_split_all_insns);
+      NEXT_PASS (pass_outof_cfg_layout_mode);
       NEXT_PASS (pass_lower_subreg2);
       NEXT_PASS (pass_df_initialize_no_opt);
       NEXT_PASS (pass_stack_ptr_mod);
Index: gcc/recog.c
===================================================================
--- gcc/recog.c	(branch combine-cfglayout)
+++ gcc/recog.c	(working copy)
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
 #include "insn-config.h"
 #include "insn-attr.h"
 #include "hard-reg-set.h"
+#include "except.h"
 #include "recog.h"
 #include "regs.h"
 #include "addresses.h"
@@ -71,7 +72,6 @@ get_attr_enabled (rtx insn ATTRIBUTE_UNU
 static void validate_replace_rtx_1 (rtx *, rtx, rtx, rtx, bool);
 static void validate_replace_src_1 (rtx *, void *);
-static rtx split_insn (rtx);
 
 /* Nonzero means allow operands to be volatile.
    This should be 0 if you are generating rtl, such as if you are calling
@@ -2671,19 +2671,23 @@ reg_fits_class_p (rtx operand, enum reg_
 }
 
 /* Split single instruction.  Helper function for split_all_insns and
-   split_all_insns_noflow.  Return last insn in the sequence if successful,
-   or NULL if unsuccessful.  */
+   split_all_insns_noflow.  Return whether new control flow insns
+   were added.  */
 
-static rtx
+static bool
 split_insn (rtx insn)
 {
   /* Split insns here to get max fine-grain parallelism.  */
   rtx first = PREV_INSN (insn);
   rtx last = try_split (PATTERN (insn), insn, 1);
   rtx insn_set, last_set, note;
+  bool new_cfi = false;
+  bool was_cfi;
 
   if (last == insn)
-    return NULL_RTX;
+    return false;
+
+  was_cfi = control_flow_insn_p (insn);
 
   /* If the original instruction was a single set that was known to be
      equivalent to a constant, see if we can say the same about the last
@@ -2706,22 +2710,25 @@ split_insn (rtx insn)
   /* try_split returns the NOTE that INSN became.  */
   SET_INSN_DELETED (insn);
 
-  /* ??? Coddle to md files that generate subregs in post-reload
-     splitters instead of computing the proper hard register.  */
-  if (reload_completed && first != last)
+  while (first != last)
     {
       first = NEXT_INSN (first);
-      for (;;)
+      gcc_assert (!BARRIER_P (first) && !LABEL_P (first));
+
+      /* ??? Coddle to md files that generate subregs in post-reload
+	 splitters instead of computing the proper hard register.  */
+      if (reload_completed && INSN_P (first))
+	cleanup_subreg_operands (first);
+      if ((first != last || !was_cfi)
+	  && control_flow_insn_p (first))
 	{
-	  if (INSN_P (first))
-	    cleanup_subreg_operands (first);
-	  if (first == last)
-	    break;
-	  first = NEXT_INSN (first);
+	  gcc_assert (flag_non_call_exceptions
		      && can_throw_internal (first));
+	  new_cfi = true;
 	}
     }
-  return last;
+  return new_cfi;
 }
 
 /* Split all insns in the function.  If UPD_LIFE, update life info after.  */
@@ -2730,12 +2737,10 @@
 void
 split_all_insns (void)
 {
   sbitmap blocks;
-  bool changed;
   basic_block bb;
 
   blocks = sbitmap_alloc (last_basic_block);
   sbitmap_zero (blocks);
-  changed = false;
 
   FOR_EACH_BB_REVERSE (bb)
     {
@@ -2753,41 +2758,17 @@ split_all_insns (void)
Re: variadic templates supported in non-c++0x mode
2009/3/10 Paolo Carlini:
>>> warning: variadic templates only available with -std=c++0x or -std=gnu++0x
>>
> I'm afraid the behavior you are seeing is by design, and is not specific

In any case the wording of the warning is weird: it says variadic templates
are not available, but then they are accepted with just a warning.

> to variadic templates. It also happens, for example, for:
>
> enum class e { };

What is the warning here?

Cheers, Manuel.
Re: variadic templates supported in non-c++0x mode
Manuel López-Ibáñez wrote:
> In any case the wording of the warning is weird: it says variadic
> templates are not available but then it is accepted with just a
> warning.

I agree.

>> to variadic templates. It also happens, for example, for:
>>
>> enum class e { };
>
> What is the warning here?

Same story Manuel. All those situations are dealt with via the same
maybe_warn_cpp0x.

Paolo.
Re: variadic templates supported in non-c++0x mode
Jan van Dijk wrote:
> Consider the one-liner C++ 'program':
>
> template struct pack;
>
> With the trunk, g++ -c [-std=gnu++98] gives:
>
> warning: variadic templates only available with -std=c++0x or -std=gnu++0x
>
> Should this not be an *error* instead? Variadic templates really should
> not be supported in non-c++0x mode.
>
> If not, how can I explicitly disable support for variadic templates? (The
> present behaviour frustrates some of our local autoconf tests.)
>
> Should I file a bug report?

The problem I fear is that variadic templates are already conveniently
used as an implementation detail in libstdc++. And the warning there is
probably hidden by the "system header" warning removal machinery.

I have already been bitten by this: I have a configuration test program
that passes (with a warning), while an error would be expected in
non-C++0x mode, so the rest of the code uses it, which triggers lots of
warnings. My workaround for this is to test for G++0x mode explicitly in
addition, and #error on it in my test program.

-- 
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/
Re: variadic templates supported in non-c++0x mode
Sylvain Pion wrote:
> The problem I fear is that variadic templates are already conveniently
> used as an implementation detail in libstdc++. And the warning there
> is probably hidden by the "system header" warning removal machinery.

It is, you are right.

Paolo.
Re: variadic templates supported in non-c++0x mode
2009/3/10 Sylvain Pion:
> The problem I fear is that variadic templates are already conveniently
> used as an implementation detail in libstdc++. And the warning there
> is probably hidden by the "system header" warning removal machinery.

But then probably, variadic templates are implemented as a GCC extension
to C++98 and they work fine with -std=c++98 despite what the warning says.
Or don't they?

Cheers, Manuel.
Re: variadic templates supported in non-c++0x mode
Manuel López-Ibáñez wrote:
> 2009/3/10 Sylvain Pion:
>> The problem I fear is that variadic templates are already conveniently
>> used as an implementation detail in libstdc++. And the warning there
>> is probably hidden by the "system header" warning removal machinery.
>
> But then probably, variadic templates are implemented as a GCC extension
> to C++98 and they work fine with -std=c++98 despite what the warning
> says. Or don't they?

Yes, but like any extension, it's nice to be able to disable them as
errors, so as to be able to use GCC for checking code portability.

-- 
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/
Ira Rosen appointed Auto-Vectorizer Maintainer
I am pleased to announce that the GCC Steering Committee has appointed
Ira Rosen as an Auto-Vectorizer maintainer.

Please join me in congratulating Ira on her new role. Ira, please update
your listing in the MAINTAINERS file.

Happy hacking!
David
Re: No address_cost calls when inlining ?
Ahhh ok, so basically I've hit the same wall as for the constant folding
and constant propagation!

Oh well, I will see how important it is for me to try to fix it in this
case.

Thanks for the answer,
Jc

On Tue, Mar 10, 2009 at 4:18 AM, Paolo Bonzini wrote:
>> I want the version of foo because the store with an address as
>> destination is costly on my architecture, which is why I defined
>> TARGET_ADDRESS_COST and added a cost when I get this scenario.
>> However, in the compilation of this code, it seems that, when the
>> function is inlined, the address_cost function does not seem to be
>> called anymore. Any ideas why ?
>
> This is (a variant of) PR33699.
>
> Paolo
pr39339 - invalid testcase or SRA bug?
Hi,

Since r144598, pr39339.c has been failing on picochip. On investigation,
it looks to me that the testcase is illegal.

Relevant source code:

struct C
{
  unsigned int c;
  struct D
  {
    unsigned int columns : 4;
    unsigned int fore : 9;
    unsigned int back : 9;
    unsigned int fragment : 1;
    unsigned int standout : 1;
    unsigned int underline : 1;
    unsigned int strikethrough : 1;
    unsigned int reverse : 1;
    unsigned int blink : 1;
    unsigned int half : 1;
    unsigned int bold : 1;
    unsigned int invisible : 1;
    unsigned int pad : 1;
  } attr;
};

struct A
{
  struct C *data;
  unsigned int len;
};

struct B
{
  struct A *cells;
  unsigned char soft_wrapped : 1;
};

struct E
{
  long row, col;
  struct C defaults;
};

__attribute__ ((noinline))
void foo (struct E *screen, unsigned int c, int columns, struct B *row)
{
  struct D attr;
  long col;
  int i;
  col = screen->col;
  attr = screen->defaults.attr;
  attr.columns = columns;
  row->cells->data[col].c = c;
  row->cells->data[col].attr = attr;
  col++;
  attr.fragment = 1;
  for (i = 1; i < columns; i++)
    {
      row->cells->data[col].c = c;
      row->cells->data[col].attr = attr;
      col++;
    }
}

int
main (void)
{
  struct E e = {.row = 5,.col = 0,.defaults =
    {6, {-1, -1, -1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0}} };
  struct C c[4];
  struct A a = { c, 4 };
  struct B b = { &a, 1 };
  struct D d;
  __builtin_memset (&c, 0, sizeof c);
  foo (&e, 65, 2, &b);
  d = e.defaults.attr;
  d.columns = 2;
  if (__builtin_memcmp (&d, &c[0].attr, sizeof d))
    __builtin_abort ();
  d.fragment = 1;
  if (__builtin_memcmp (&d, &c[1].attr, sizeof d))
    __builtin_abort ();
  return 0;
}

In picochip, PCC_BITFIELD_TYPE_MATTERS is set and int is 16 bits, so
structure D becomes 6 bytes, with 3-bit padding between fore and back.
At SRA the code becomes:

;; Function foo (foo)

foo (struct E * screen, unsigned int c, int columns, struct B * row)
{
  unsigned int attr$B32F16;
  attr$B26F6;
  attr$back;
  attr$fore;
  attr$fragment;
  int i;
  long int col;
  struct C * D.1267;
  unsigned int D.1266;
  unsigned int D.1265;
  struct C * D.1264;
  struct A * D.1263;
  D.1262;
  unsigned char D.1261;

:
  col_4 = screen_3(D)->col;
  attr$B32F16_36 = BIT_FIELD_REF defaults.attr, 16, 32>;
  attr$B26F6_37 = BIT_FIELD_REF defaults.attr, 6, 26>;
  attr$back_38 = screen_3(D)->defaults.attr.back;
  attr$fore_39 = screen_3(D)->defaults.attr.fore;
  attr$fragment_40 = screen_3(D)->defaults.attr.fragment;
  D.1261_6 = (unsigned char) columns_5(D);
  D.1262_7 = () D.1261_6;
  D.1263_9 = row_8(D)->cells;
  D.1264_10 = D.1263_9->data;
  D.1265_11 = (unsigned int) col_4;
  D.1266_12 = D.1265_11 * 8;
  D.1267_13 = D.1264_10 + D.1266_12;
  D.1267_13->c = c_14(D);
  BIT_FIELD_REF attr, 16, 32> = attr$B32F16_36;
  BIT_FIELD_REF attr, 6, 26> = attr$B26F6_37;
  D.1267_13->attr.back = attr$back_38;
  D.1267_13->attr.fore = attr$fore_39;
  D.1267_13->attr.fragment = attr$fragment_40;
  D.1267_13->attr.columns = D.1262_7;
  col_20 = col_4 + 1;
  if (columns_5(D) > 1)
    goto ;
  else
    goto ;

:
  # col_29 = PHI
  # i_30 = PHI
  D.1265_24 = (unsigned int) col_29;
  D.1266_25 = D.1265_24 * 8;
  D.1267_26 = D.1264_10 + D.1266_25;
  D.1267_26->c = c_14(D);
  BIT_FIELD_REF attr, 16, 32> = attr$B32F16_36;
  BIT_FIELD_REF attr, 6, 26> = attr$B26F6_37;
  D.1267_26->attr.back = attr$back_38;
  D.1267_26->attr.fore = attr$fore_39;
  D.1267_26->attr.fragment = 1;
  D.1267_26->attr.columns = D.1262_7;
  col_32 = col_29 + 1;
  i_33 = i_30 + 1;
  if (columns_5(D) > i_33)
    goto ;
  else
    goto ;

:
  return;

}

;; Function main (main)

main ()
{
  struct D d;
  struct B b;
  struct A a;
  struct C c[4];
  struct E e;
  int D.1279;
  int D.1276;

:
  e.row = 5;
  e.col = 0;
  e.defaults.c = 6;
  e.defaults.attr.columns = 15;
  e.defaults.attr.fore = 511;
  e.defaults.attr.back = 511;
  e.defaults.attr.fragment = 1;
  e.defaults.attr.standout = 0;
  e.defaults.attr.underline = 1;
  e.defaults.attr.strikethrough = 0;
  e.defaults.attr.reverse = 1;
  e.defaults.attr.blink = 0;
  e.defaults.attr.half = 1;
  e.defaults.attr.bold = 0;
  e.defaults.attr.invisible = 1;
  e.defaults.attr.pad = 0;
  a.data = &c;
  a.len = 4;
  b.cells = &a;
  b.soft_wrapped = 1;
  __builtin_memset (&c, 0, 32);
  foo (&e, 65, 2, &b);
  d = e.defaults.attr;
  d.columns = 2;
  D.1276_1 = __builtin_memcmp (&d, &c[0].attr, 6);
  if (D.1276_1 != 0)
    goto ;
  else
    goto ;

:
  __builtin_abort ();

:
  d.fragment = 1;
  D.1279_2 = __builtin_memcmp (&d, &c[1].attr, 6);
  if (D.1279_2 != 0)
    goto ;
  else
    goto ;

:
  __builtin_abort ();

:
  return 0;
}

Note that padding bits (13,16) are not copied over in bb_2 in function
foo. main then does a memcmp, which fails because the padding bits are
different.

From the C99 standard (p. 328, footnote 265): "The contents of 'holes'
used as padding for purposes of alignment within structure objects are
indeterminate."
Re: pr39339 - invalid testcase or SRA bug?
On Tue, Mar 10, 2009 at 2:44 PM, Hariharan Sandanagobalane wrote: > Hi, > Since r144598, pr39339.c has been failing on picochip. On investigation, it > looks to me that the testcase is illegal. > > Relevant source code: > struct C > { > unsigned int c; > struct D > { > unsigned int columns : 4; > unsigned int fore : 9; > unsigned int back : 9; > unsigned int fragment : 1; > unsigned int standout : 1; > unsigned int underline : 1; > unsigned int strikethrough : 1; > unsigned int reverse : 1; > unsigned int blink : 1; > unsigned int half : 1; > unsigned int bold : 1; > unsigned int invisible : 1; > unsigned int pad : 1; > } attr; > }; > > struct A > { > struct C *data; > unsigned int len; > }; > > struct B > { > struct A *cells; > unsigned char soft_wrapped : 1; > }; > > struct E > { > long row, col; > struct C defaults; > }; > > __attribute__ ((noinline)) > void foo (struct E *screen, unsigned int c, int columns, struct B *row) > { > struct D attr; > long col; > int i; > col = screen->col; > attr = screen->defaults.attr; > attr.columns = columns; > row->cells->data[col].c = c; > row->cells->data[col].attr = attr; > col++; > attr.fragment = 1; > for (i = 1; i < columns; i++) > { > row->cells->data[col].c = c; > row->cells->data[col].attr = attr; > col++; > } > } > > int > main (void) > { > struct E e = {.row = 5,.col = 0,.defaults = > {6, {-1, -1, -1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0}} }; > struct C c[4]; > struct A a = { c, 4 }; > struct B b = { &a, 1 }; > struct D d; > __builtin_memset (&c, 0, sizeof c); > foo (&e, 65, 2, &b); > d = e.defaults.attr; > d.columns = 2; > if (__builtin_memcmp (&d, &c[0].attr, sizeof d)) > __builtin_abort (); > d.fragment = 1; > if (__builtin_memcmp (&d, &c[1].attr, sizeof d)) > __builtin_abort (); > return 0; > } > > > In picochip, PCC_BITFIELD_TYPE_MATTERS is set and int is 16-bits, so the > structure D becomes 6 bytes, with 3-bit padding between fore and back. 
> > At SRA the code becomes > > ;; Function foo (foo) > > foo (struct E * screen, unsigned int c, int columns, struct B * row) > { > unsigned int attr$B32F16; > attr$B26F6; > attr$back; > attr$fore; > attr$fragment; > int i; > long int col; > struct C * D.1267; > unsigned int D.1266; > unsigned int D.1265; > struct C * D.1264; > struct A * D.1263; > D.1262; > unsigned char D.1261; > > : > col_4 = screen_3(D)->col; > attr$B32F16_36 = BIT_FIELD_REF defaults.attr, 16, 32>; > attr$B26F6_37 = BIT_FIELD_REF defaults.attr, 6, 26>; > attr$back_38 = screen_3(D)->defaults.attr.back; > attr$fore_39 = screen_3(D)->defaults.attr.fore; > attr$fragment_40 = screen_3(D)->defaults.attr.fragment; > D.1261_6 = (unsigned char) columns_5(D); > D.1262_7 = () D.1261_6; > D.1263_9 = row_8(D)->cells; > D.1264_10 = D.1263_9->data; > D.1265_11 = (unsigned int) col_4; > D.1266_12 = D.1265_11 * 8; > D.1267_13 = D.1264_10 + D.1266_12; > D.1267_13->c = c_14(D); > BIT_FIELD_REF attr, 16, 32> = attr$B32F16_36; > BIT_FIELD_REF attr, 6, 26> = attr$B26F6_37; > D.1267_13->attr.back = attr$back_38; > D.1267_13->attr.fore = attr$fore_39; > D.1267_13->attr.fragment = attr$fragment_40; > D.1267_13->attr.columns = D.1262_7; > col_20 = col_4 + 1; > if (columns_5(D) > 1) > goto ; > else > goto ; > > : > # col_29 = PHI > # i_30 = PHI > D.1265_24 = (unsigned int) col_29; > D.1266_25 = D.1265_24 * 8; > D.1267_26 = D.1264_10 + D.1266_25; > D.1267_26->c = c_14(D); > BIT_FIELD_REF attr, 16, 32> = attr$B32F16_36; > BIT_FIELD_REF attr, 6, 26> = attr$B26F6_37; > D.1267_26->attr.back = attr$back_38; > D.1267_26->attr.fore = attr$fore_39; > D.1267_26->attr.fragment = 1; > D.1267_26->attr.columns = D.1262_7; > col_32 = col_29 + 1; > i_33 = i_30 + 1; > if (columns_5(D) > i_33) > goto ; > else > goto ; > > : > return; > > } > > > > ;; Function main (main) > > main () > { > struct D d; > struct B b; > struct A a; > struct C c[4]; > struct E e; > int D.1279; > int D.1276; > > : > e.row = 5; > e.col = 0; > e.defaults.c = 6; 
> e.defaults.attr.columns = 15; > e.defaults.attr.fore = 511; > e.defaults.attr.back = 511; > e.defaults.attr.fragment = 1; > e.defaults.attr.standout = 0; > e.defaults.attr.underline = 1; > e.defaults.attr.strikethrough = 0; > e.defaults.attr.reverse = 1; > e.defaults.attr.blink = 0; > e.defaults.attr.half = 1; > e.defaults.attr.bold = 0; > e.defaults.attr.invisible = 1; > e.defaults.attr.pad = 0; > a.data = &c; > a.len = 4; > b.cells = &a; > b.soft_wrapped = 1; > __builtin_memset (&c, 0, 32); > foo (&e, 65, 2, &b); > d = e.defaults.attr; > d.columns = 2; > D.1276_1 = __builtin_memcmp (&d, &c[0].attr, 6); > if (D.1276_1 != 0) > goto ; > else > goto ; > > : > __builtin_abort (); > > : > d.fragment = 1; > D.1279_2 = __builtin_memcmp (&d, &c[1].attr, 6); > if (D.1279_2 != 0) > goto ; > else > goto ; > > : > __builtin_abort (); > > : > re
Re: variadic templates supported in non-c++0x mode
On Tue, Mar 10, 2009 at 6:58 AM, Sylvain Pion wrote:
> Manuel López-Ibáñez wrote:
>> 2009/3/10 Sylvain Pion:
>>> The problem I fear is that variadic templates are already conveniently
>>> used as an implementation detail in libstdc++. And the warning there
>>> is probably hidden by the "system header" warning removal machinery.
>>
>> But then probably, variadic templates are implemented as a GCC
>> extension to C++98 and they work fine with -std=c++98 despite what the
>> warning says. Or don't they?
>
> Yes, but like any extension, it's nice to be able to disable them
> as errors, so as to be able to use GCC for checking code portability.

libstdc++ ought to be able to use GNU extensions that make the
implementation easier. Consequently, I do not see the complete removal of
variadic templates under -std=c++98 as an option. However, people can
propose a patch to make extensions unavailable outside system headers.

-- Gaby
Re: variadic templates supported in non-c++0x mode
2009/3/10 Sylvain Pion:
>> But then probably, variadic templates are implemented as a GCC
>> extension to C++98 and they work fine with -std=c++98 despite what the
>> warning says. Or don't they?
>
> Yes, but like any extension, it's nice to be able to disable them
> as errors, so as to be able to use GCC for checking code portability.

So use -pedantic-errors, as it says in the manual. You should really use
-pedantic-errors if you do not want extensions.

-pedantic: Issue all the warnings demanded by strict ISO C and ISO C++;
reject all programs that use forbidden extensions, and some other
programs that do not follow ISO C and ISO C++.

Cheers, Manuel.
Re: pr39339 - invalid testcase or SRA bug?
On Tue, Mar 10, 2009 at 01:44:11PM +, Hariharan Sandanagobalane wrote:
> Since r144598, pr39339.c has been failing on picochip. On investigation,
> it looks to me that the testcase is illegal.
>
> Relevant source code:
> struct C
> {
>   unsigned int c;
>   struct D
>   {
>     unsigned int columns : 4;
>     unsigned int fore : 9;
>     unsigned int back : 9;

As the testcase fails with buggy (pre r144598) gcc and succeeds after,
even with:

    unsigned int fore : 12;
    unsigned int back : 6;

instead of :9, :9, I think we could change it (does it succeed on picochip
then)? Or move it to gcc.dg/torture/ and run it only on int32plus targets.
Or add

    if (sizeof (int) != 4 || sizeof (struct D) != 4)
      return 0;

to the beginning of main.

	Jakub
Re: variadic templates supported in non-c++0x mode
Manuel López-Ibáñez wrote:
> 2009/3/10 Sylvain Pion:
>>> But then probably, variadic templates are implemented as a GCC
>>> extension to C++98 and they work fine with -std=c++98 despite what the
>>> warning says. Or don't they?
>>
>> Yes, but like any extension, it's nice to be able to disable them
>> as errors, so as to be able to use GCC for checking code portability.
>
> So use -pedantic-errors, as it says in the manual. You should really
> use -pedantic-errors if you do not want extensions.

Sure. It just forces you to have additional compiler-specific flags in
your configuration system.

-- 
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/
Re: variadic templates supported in non-c++0x mode
Gabriel Dos Reis wrote:
> On Tue, Mar 10, 2009 at 6:58 AM, Sylvain Pion wrote:
>> Manuel López-Ibáñez wrote:
>>> 2009/3/10 Sylvain Pion:
>>>> The problem I fear is that variadic templates are already conveniently
>>>> used as an implementation detail in libstdc++. And the warning there
>>>> is probably hidden by the "system header" warning removal machinery.
>>>
>>> But then probably, variadic templates are implemented as a GCC
>>> extension to C++98 and they work fine with -std=c++98 despite what the
>>> warning says. Or don't they?
>>
>> Yes, but like any extension, it's nice to be able to disable them
>> as errors, so as to be able to use GCC for checking code portability.
>
> libstdc++ ought to be able to use GNU extensions that make the
> implementation easier. Consequently, I do not see the complete removal
> of variadic templates under -std=c++98 as an option. However, people
> can propose a patch to make extensions unavailable outside system
> headers.

Agreed.

-- 
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/
Re: pr39339 - invalid testcase or SRA bug?
Yes, if I change the structure to bring the three 1-bit members forward,
to avoid padding, the testcase does pass.

Thanks to both of you for your help.

Cheers
Hari

Jakub Jelinek wrote:
> On Tue, Mar 10, 2009 at 01:44:11PM +, Hariharan Sandanagobalane wrote:
>> Since r144598, pr39339.c has been failing on picochip. On investigation,
>> it looks to me that the testcase is illegal.
>>
>> Relevant source code:
>> struct C
>> {
>>   unsigned int c;
>>   struct D
>>   {
>>     unsigned int columns : 4;
>>     unsigned int fore : 9;
>>     unsigned int back : 9;
>
> As the testcase fails with buggy (pre r144598) gcc and succeeds after,
> even with:
>
>     unsigned int fore : 12;
>     unsigned int back : 6;
>
> instead of :9, :9, I think we could change it (does it succeed on
> picochip then)? Or move it to gcc.dg/torture/ and run it only on
> int32plus targets. Or add
>
>     if (sizeof (int) != 4 || sizeof (struct D) != 4)
>       return 0;
>
> to the beginning of main.
>
> 	Jakub
Automatic Parallelization & Graphite - future plans
Hello,

Described here is the future plan for automatic parallelization in GCC.

The current autopar pass is based on the GOMP infrastructure; it
distributes iterations of loops to several threads (the number is
instructed by the user) if it was determined that they are independent.
The only dependency allowed to exist is reduction, which is handled as a
special case. This pass was initially contributed to GCC 4.3 by Zdenek
Dvorak and Sebastian Pop.

With the integration of Graphite (http://gcc.gnu.org/wiki/Graphite) into
GCC 4.4, a strong loop nest analysis and transformation engine was
introduced, and the notion of using the polyhedral model to expose loop
parallelism in GCC becomes feasible and relevant. Our prospective goals
are to incrementally integrate autopar and Graphite. As in autopar, we'll
initially focus on synchronization-free parallelization.

The first step, as we see it, will teach Graphite that parallel code
needs to be produced. This means that Graphite will recognize simple
parallel loops (using SCoP detection and data dependency analysis), and
pass on that information. The information that needs to be conveyed
expresses that a loop is parallelizable, and may also include annotations
of more detailed information, e.g. the shared/private variables.

There are two possible models for the code generation:
1. Graphite will annotate parallel loops and pass that information all
   the way through CLOOG to the current autopar code generator to produce
   the parallel, GOMP-based code.
2. Graphite will annotate the parallel loops and CLOOG itself will be
   responsible for generating the parallel code.

A point to notice here is that scalars/reductions are currently not
handled in Graphite. In the first model, where Graphite calls autopar's
code generation, scalars can be handled. After Graphite finishes its
analysis, it calls autopar's reduction analysis, and only then is the
code generation called (if the scalar analysis determines that the loop
is still parallelizable, of course).

Once the first step is accomplished, the following steps will focus on
teaching Graphite to find loop transformations (such as skewing,
interchange etc.) that expose coarse-grain synchronization-free
parallelism. This will be heavily based on the polyhedral data dependence
and transformation infrastructures. We have not determined which
algorithms/techniques we're going to use for this part.

Having synchronization-free parallelization integrated in Graphite will
set the ground for handling parallelism requiring a small amount of
synchronization.

This is a rough view for our planned work on autopar in GCC.
Please feel free to ask/comment.

Thanks,
Razya
Implicit conversion from 32bits integer to 16bits integer (GCC 4.3.3)
Hello,

I would like to have a warning when there is an implicit conversion in
source code:

    uint32_t foo = 2000;
    uint16_t bar = foo;             // --> warning: implicit conversion from uint32_t to uint16_t
    uint16_t bar2 = (uint16_t) foo; // --> no warning

I know that -Wtraditional-conversion in GCC 4.3.x is able to show this
message, but it also displays a warning for implicit conversions at
function calls:

    extern void foobar(uint16_t x);

    int main()
    {
      uint16_t a = 32;
      foobar(a); // --> warning: implicit conversion in parameter 1 of foobar
    }

Is it possible to have the warning only for implicit conversions on
variable assignments, and not for prototyped function calls?

Best Regards
Frederic
Re: Automatic Parallelization & Graphite - future plans
Hi Razya,

great to hear these Graphite plans. Some short comments.

On Tue, 2009-03-10 at 16:13 +0200, Razya Ladelsky wrote:
> [...]
>
> The first step, as we see it, will teach Graphite that parallel code
> needs to be produced. This means that Graphite will recognize simple
> parallel loops (using SCoP detection and data dependency analysis),
> and pass on that information. The information that needs to be
> conveyed expresses that a loop is parallelizable, and may also include
> annotations of more detailed information, e.g. the shared/private
> variables.
>
> There are two possible models for the code generation:
> 1. Graphite will annotate parallel loops and pass that information all
> the way through CLOOG to the current autopar code generator to produce
> the parallel, GOMP based code.

It might be possible to recognize parallel loops in graphite, but you
should keep in mind that in the graphite polyhedral representation loops
do not yet exist. So you would have to foresee which loops CLOOG will
produce. This might be possible, depending on how strict the scheduling
we give to CLOOG is. Another problem is that cloog might split some
loops automatically (if possible) to reduce the control flow.

> 2. Graphite will annotate the parallel loops and CLOOG itself will be
> responsible of generating the parallel code.

The same as above. It will be hard to mark loops, as loops do not yet
exist.

> A point to notice here is that scalars/reductions are currently not
> handled in Graphite.

We are working heavily on this. Expect it to be ready at least at the
end of March, hopefully the end of this week.

> In the first model, where Graphite calls autopar's code generation,
> scalars can be handled.

3. Wait for cloog to generate the new loops.

As we have the polyhedral information (poly_bb_p) still available during
code generation, we can try to update the dependency information using
the restrictions cloog added, and use the polyhedral dependency analysis
to check if there are any dependencies in the CLOOG-generated loops. So
we can add a pass in between CLOOG and clast-to-gimple that marks
parallel loops.

Advantages:
- Can be 100% exact; no forecasts, as we are working on actually
  generated loops.
- Nice splitting of what is done where:
  1. Graphite is in charge of optimizations (generating parallelism).
  2. Code generation just detects parallel loops and generates code for
     them.

> After Graphite finishes its analysis, it calls autopar's reduction
> analysis, and only then the code generation is called (if the scalar
> analysis determines that the loop is still parallelizable, of course).
>
> Once the first step is accomplished, the following steps will focus on
> teaching Graphite to find loop transformations (such as skewing,
> interchange etc.) that expose coarse grain synchronization free
> parallelism. This will be heavily based on the polyhedral data
> dependence and transformation infrastructures. We have not determined
> which algorithms/techniques we're going to use for this part.
>
> Having synchronization free parallelization integrated in Graphite,
> will set the ground for handling parallelism requiring a small amount
> of synchronization.

Yes, great. This will allow us to experiment with advanced auto
parallelization. I am really looking forward to seeing the first patches!

> This is a rough view for our planned work on autopar in GCC.
> Please feel free to ask/comment.
>
> Thanks,
> Razya
Re: cmath call builtin sqrtf but many platforms seem miss that(was Re: lrint lrintf problems )
Hello Richard,

On 09.03.09, you wrote:

>>> I believe one should convince the middle end to emit libcall
>>> for __builtin_xxx when the target has no builtint support.
>>
>> It of course does.

Where in the GCC source is this redefinition done? In my c++config.h file I see:

/* Define if the compiler/host combination has __builtin_sqrtf. */
/* #undef _GLIBCXX_HAVE___BUILTIN_SQRTF */

I have now found a solution that works without changing cmath: when I add the following line in math.h, it works. The function could also be a static inline in the math.h file.

#define __builtin_sqrtf sqrtf

> On Mon, Mar 9, 2009 at 3:59 PM, Gabriel Dos Reis wrote:
>> On Mon, Mar 9, 2009 at 7:11 AM, Bernd Roesch wrote:
>>> Hello Gabriel
>> [...]
>>> You see, the _ is not there. Normally, functions that are not found
>>> have a _ before them.
>>>
>>> To get everything to work, it seems I need to add the same function in
>>> math.h and in the linker lib, or change the cmath file and remove all
>>> __builtin_ calls the architecture does not have.
>>
>> I believe one should convince the middle end to emit libcall
>> for __builtin_xxx when the target has no builtint support.
>
> It of course does.
>
> Richard.

Regards
Re: variadic templates supported in non-c++0x mode
On Tue, Mar 10, 2009 at 7:57 AM, Manuel López-Ibáñez wrote:
> 2009/3/10 Sylvain Pion :
>>> But then probably, variadic templates are implemented as a GCC
>>> extension to C++98 and they work fine with -std=c++98 despite what the
>>> warning says. Or don't they?
>>
>> Yes, but like any extension, it's nice to be able to disable them
>> as errors, so as to be able to use GCC for checking code portability.
>
> So use -pedantic-errors as it says in the manual. You should really
> use -pedantic-errors if you do not want extensions.

Except that -pedantic-errors does not turn only the variadic template warnings into errors.
Re: cmath call builtin sqrtf but many platforms seem miss that(was Re: lrint lrintf problems )
On Tue, Mar 10, 2009 at 7:58 AM, Bernd Roesch wrote: > Hello Richard > > On 09.03.09, you wrote: > >>> I believe one should convince the middle end to emit libcall >>> for __builtin_xxx when the target has no builtint support. >> >> It of course does. > > On what codeplace is the redefine do in GCC source ? > > I see in my c++config.h file > > this stand here > > /* Define if the compiler/host combination has __builtin_sqrtf. */ > /* #undef _GLIBCXX_HAVE___BUILTIN_SQRTF */ So, the real problem is that somehow configure thinks your target has support for __builtin_sqrtf. -- Gaby
Re: Setting -frounding-math by default
Joseph S. Myers wrote:
> On Mon, 9 Mar 2009, Sylvain Pion wrote:
>
>> Later, 1) started to be taken care of, and it was unfortunately
>> added under the control of the same -frounding-math option.
>> Which now makes it harder to come back, since we want different
>> defaults for these two aspects.
>>
>> I have already mentioned in a bugzilla PR that it could be nice
>> to have 2 options, but IIRC, I did not get any reply to this.
>
> Patches to split the option into two *clearly-defined* options are more
> likely to be accepted than changing the defaults, given that the fast-math
> and related flags have been split more than once before.

My goal is to have interval arithmetic work with the default flags, without more workarounds in the code (and as efficiently as possible). So, I'm not going to work on anything if it means only splitting it into separate flags, if we don't agree a priori on changing the default for at least one of those sub-flags after that. That would be the opposite of progress for my usage, and so I would not volunteer.

Currently, typical interval arithmetic code has to work around the fact that there is no good way to stop constant propagation reliably (so it's using some volatile or asm, or a big hammer like -frounding-math, all these solutions having a performance cost). It would be nice to improve this, but a global flag, be it dedicated, strikes me as a clearly suboptimal solution here anyway (and, as has been mentioned, it causes problems with code which really needs cprop in other places). For IA, having a __builtin_stop_constant_propagation(expression) would be OK: it would reliably stop constant propagation on an expression, for example. I don't think there's anything like that already in GCC. That would be a real improvement, and would even improve the speed of interval arithmetic implementations as a bonus (because it wouldn't do any harm for the non-constant cases).
I'm not sure what you think about such a built-in: it's more general than FP expressions, but may only be useful there in practice (?). Maybe it might even be able to replace most concrete needs for this aspect of -frounding-math. Moreover, I don't know how hard it would be to implement. What do you think?

(Of course this would still not be perfect to me: the ideal compiler interface for IA, for me, is something along the lines of what is described in N2811, which attaches a compile-time rounding mode to the operations. But at least this first step would mean progress, and might be otherwise useful, I don't know.)

A quick audit says that the rest of -frounding-math seems indeed to be about expression transformations which are valid when rounding is to the nearest. Maybe this part can indeed be seen as a subset of -fassociative-math, as Paolo mentioned, which may or may not deserve a dedicated flag. (Certainly I would push for these transformations not to be performed by default, just like -fassociative-math.)

FYI, "grep flag_rounding_math" gives the following uses of -frounding-math, with my interpretations:

gcc/config/i386/i386.md

    used to allow the selection of an SSE insn for a round operation
    when the conditions are sufficiently relaxed (as far as I understood)

gcc/config/arm/arm.c

    triggers an ABI-related thing in the ASM files
    (I have no clue how -frounding-math could possibly
    affect the ABI, but I know nothing about ARM)

gcc/flags.h

    affects the definition of HONOR_SIGN_DEPENDENT_ROUNDING,
    which is used in many places in fold-const and simplify-rtx
    for expression transformations

gcc/builtins.def

    affects the definition of ATTR_MATHFN_FPROUNDING,
    which triggers const/pure attributes on "cmath" built-ins, it seems

gcc/builtins.c
gcc/fold-const.c

    used for preventing constant folding in some cases

gcc/opts.c

    fast-math => no-rounding-math

gcc/convert.c

    guard for the (float)-x => -(float)x transformation

gcc/simplify-rtx.c

    guard for constant folding

I have not checked all this as carefully as could be, but enough to convince me that a good split of -frounding-math might indeed be between (1) stopping constant propagation (where the question is whether a builtin_stop_cprop(e) covers all needs), and (2) "associative-math"-like transformations, which could be disabled by default.

--
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/
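The workaround described above (a volatile to stop constant propagation of an operand evaluated under a directed rounding mode) can be sketched as follows. This is an illustration, not the Boost.Interval implementation; note also that GCC does not implement #pragma STDC FENV_ACCESS, so the volatile only addresses constant folding, not code motion across the fesetround calls:

```cpp
#include <cfenv>

// Upper bound of a + b for interval arithmetic: the addition must be
// performed at run time under FE_UPWARD.  Reading 'a' through a volatile
// defeats constant propagation, at the cost of a store/load; this is
// exactly the kind of hack a builtin barrier would replace.
double add_upper_bound(double a, double b)
{
    std::fesetround(FE_UPWARD);
    volatile double va = a;      // blocks constant propagation of a
    double r = va + b;           // rounded toward +infinity
    std::fesetround(FE_TONEAREST);
    return r;
}
```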
Re: Setting -frounding-math by default
On Tue, Mar 10, 2009 at 4:45 PM, Sylvain Pion wrote:
> Joseph S. Myers wrote:
>>
>> On Mon, 9 Mar 2009, Sylvain Pion wrote:
>>
>>> Later, 1) started to be taken care of, and it was unfortunately
>>> added under the control of the same -frounding-math option.
>>> Which now, makes it harder to come back, since we want different
>>> defaults for these two aspects.
>>>
>>> I have already mentioned in a bugzilla PR that it could be nice
>>> to have 2 options, but IIRC, I did not get any reply to this.
>>
>> Patches to split the option into two *clearly-defined* options are more
>> likely to be accepted than changing the defaults, given that the fast-math
>> and related flags have been split more than once before.
>
> My goal is to have interval arithmetic work with the default flags,
> without more workarounds in the code (and as efficiently as possible).
> So, I'm not going to work on anything if it means only splitting it
> in separate flags, if we don't agree a priori on changing the default
> for at least one of those sub flags after that.
> That would be the opposite of progress for my usage, and so I would
> not volunteer.
>
> Currently, typical interval arithmetic code has to work around the fact
> that there is no good way to stop constant propagation reliably (so it's
> using some volatile or asm, or a big hammer like rounding-math, all these
> solutions having a performance cost).
> It would be nice to improve this, but a global flag, be it dedicated,
> strikes me as a clearly suboptimal solution here anyway (and, as has been
> mentioned, it causes problems with code which really needs cprop in other
> places).
> For IA, having a __builtin_stop_constant_propagation(expression) would be
> OK, which would reliably stop constant propagation on an expression, for
> example. I don't think there's anything like that already in GCC.
> That would be a real improvement, and would even improve the speed > of interval arithmetic implementations as a bonus (because it wouldn't > do any harm for the non-constant cases). > I'm not sure what you think about such a built-in : it's more general > than FP expressions, but may be only useful there in practice (?). > Maybe it might even be able to replace most concrete needs for this > aspect of -frounding-math. > Moreover, I don't know how hard it would be to implement. What do you > think? > > (Of course this would still not be perfect to me : the ideal compiler > interface for IA for me is something along the lines of what is described > in N2811 which attaches a compile-time rounding mode to the operations. > But at least this first step would mean progress, and might be otherwise > useful, I don't know.) > > A quick audit says that the rest of -frounding-math seems indeed like it's > about > expression transformations which are valid when rounding is to the nearest. > Maybe this part can indeed be seen as a subset of -fassociative-math > as Paolo mentioned, which may or may not reserve a dedicated flag. > (certainly I would push for these transformations to not be performed > by default, just like -fassociative-math) > > > > FYI, "grep flag_rounding_math" gives the following uses of -frounding-math, > with my interpretations : > > gcc/config/i386/i386.md > > used to allow the selection of an SSE insn for a round operation > when the conditions are sufficiently relaxed (as far as I understood) > > gcc/config/arm/arm.c > > triggers an ABI related thing in the ASM files. > (I have no clue how -frounding-math could possibly > affect the ABI, but I know nothing about ARM) > > gcc/flags.h > > affects definition of HONOR_SIGN_DEPENDENT_ROUNDING > which is used in many places in fold-const and simplify-rtx > for expression transformations. 
>
> gcc/builtins.def
>
>     affects definition of ATTR_MATHFN_FPROUNDING
>     which triggers const/pure attributes on "cmath" built-ins it seems.
>
> gcc/builtins.c
> gcc/fold-const.c
>
>     used for preventing constant folding in some cases
>
> gcc/opts.c
>
>     fast-math => no-rounding-math
>
> gcc/convert.c
>
>     guard (float)-x => -(float)x transformation
>
> gcc/simplify-rtx.c
>
>     guard constant folding
>
> I have not checked all this as carefully as can be, but enough
> to convince me that a good split of -frounding-math might indeed
> be between (1) stopping constant propagation (where the question
> is whether a builtin_stop_cprop(e) covers all needs), and
> (2) "associative-math"-like transformations which could be disabled
> by default.

The middle-end knows about an explicit association barrier (only used from the Fortran FE so far), a PAREN_EXPR. Would exposing that to C/C++ be of any help? For example it would, even with -ffast-math, avoid constant folding for (x + FLT_EPS) - FLT_EPS (with FLT_EPS such that proper rounding to the nearest integer value is performed).

Richard.

>
> --
> Sylvain Pion
> INRIA Sophia-Antipolis
> Geometrica Project-Team
> CGAL, http:/
Re: Setting -frounding-math by default
On Tue, Mar 10, 2009 at 9:45 AM, Sylvain Pion wrote:
> Joseph S. Myers wrote:
>>
>> On Mon, 9 Mar 2009, Sylvain Pion wrote:
>>
>>> Later, 1) started to be taken care of, and it was unfortunately
>>> added under the control of the same -frounding-math option.
>>> Which now, makes it harder to come back, since we want different
>>> defaults for these two aspects.
>>>
>>> I have already mentioned in a bugzilla PR that it could be nice
>>> to have 2 options, but IIRC, I did not get any reply to this.
>>
>> Patches to split the option into two *clearly-defined* options are more
>> likely to be accepted than changing the defaults, given that the fast-math
>> and related flags have been split more than once before.
>
> My goal is to have interval arithmetic work with the default flags,
> without more workarounds in the code (and as efficiently as possible).
> So, I'm not going to work on anything if it means only splitting it
> in separate flags, if we don't agree a priori on changing the default
> for at least one of those sub flags after that.
> That would be the opposite of progress for my usage, and so I would
> not volunteer.
>
> Currently, typical interval arithmetic code has to work around the fact
> that there is no good way to stop constant propagation reliably (so it's
> using some volatile or asm, or a big hammer like rounding-math, all these
> solutions having a performance cost).

It is not clear that constant propagation is the evil that needs to be stopped at all cost. Remember, there is a lot under the heading 'constant propagation'.

> It would be nice to improve this, but a global flag, be it dedicated,
> strikes me as a clearly suboptimal solution here anyway (and, as has been
> mentioned, it causes problems with code which really needs cprop in other
> places).
> For IA, having a __builtin_stop_constant_propagation(expression) would be
> OK,

I'm not too sure you really want this, or that anybody serious about scientific computations and performance really wants that.

And how would you reconcile that with constexpr?

-- Gaby
Re: Setting -frounding-math by default
Gabriel Dos Reis wrote:
> On Tue, Mar 10, 2009 at 9:45 AM, Sylvain Pion wrote:
>> Joseph S. Myers wrote:
>>> On Mon, 9 Mar 2009, Sylvain Pion wrote:
>>>
>>>> Later, 1) started to be taken care of, and it was unfortunately
>>>> added under the control of the same -frounding-math option.
>>>> Which now, makes it harder to come back, since we want different
>>>> defaults for these two aspects.
>>>>
>>>> I have already mentioned in a bugzilla PR that it could be nice
>>>> to have 2 options, but IIRC, I did not get any reply to this.
>>>
>>> Patches to split the option into two *clearly-defined* options are more
>>> likely to be accepted than changing the defaults, given that the
>>> fast-math and related flags have been split more than once before.
>>
>> My goal is to have interval arithmetic work with the default flags,
>> without more workarounds in the code (and as efficiently as possible).
>> So, I'm not going to work on anything if it means only splitting it
>> in separate flags, if we don't agree a priori on changing the default
>> for at least one of those sub flags after that.
>> That would be the opposite of progress for my usage, and so I would
>> not volunteer.
>>
>> Currently, typical interval arithmetic code has to work around the fact
>> that there is no good way to stop constant propagation reliably (so it's
>> using some volatile or asm, or a big hammer like rounding-math, all these
>> solutions having a performance cost).
>
> It is not clear that constant propagation is the evil that needs to be
> stopped at all cost. Remember, there is a lot under the heading 'constant
> propagation'.
>
>> It would be nice to improve this, but a global flag, be it dedicated,
>> strikes me as a clearly suboptimal solution here anyway (and, as has been
>> mentioned, it causes problems with code which really needs cprop in other
>> places).
>> For IA, having a __builtin_stop_constant_propagation(expression) would be
>> OK,
>
> I'm not too sure you really want this, or anybody serious about
> scientific computations and performance really wants that.
>
> And how would you reconcile that with constexpr?

I agree. If you are looking for the Right [tm] solution, please take a look at N2811.

--
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/
Re: cmath call builtin sqrtf but many platforms seem miss that(was Re: lrint lrintf problems )
Bernd Roesch writes: > Hello Richard > > On 09.03.09, you wrote: > >>> I believe one should convince the middle end to emit libcall >>> for __builtin_xxx when the target has no builtint support. >> >> It of course does. > > On what codeplace is the redefine do in GCC source ? This is in optabs.c, as set up by gen_libfunc and friends. A call to __builtin_xxx, where xxx is a library function, is normally replaced by a call to xxx. Where xxx is not a library function, gcc normally provides the function in libgcc. Ian
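To see the fallback Ian describes from the user's side: the builtin below either expands to the target's square-root instruction or, failing that, becomes an ordinary libcall to sqrtf from libm. A minimal sketch, nothing target-specific:

```cpp
#include <cassert>

// __builtin_sqrtf either expands to a hardware sqrt instruction or is
// emitted as a plain call to sqrtf, via the libfunc machinery that
// gen_libfunc and friends set up in optabs.c.
float root(float x)
{
    return __builtin_sqrtf(x);
}
```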
Re: Setting -frounding-math by default
Richard Guenther wrote:
> The middle-end knows about an explicit association barrier (only
> used from the Fortran FE so far), a PAREN_EXPR. Would exposing
> that to C/C++ be of any help? For example it would, even with
> -ffast-math, avoid constant folding for (x + FLT_EPS) - FLT_EPS
> (with FLT_EPS such that proper rounding to the nearest integer
> value is performed).

Off the top of my head, I don't see anything both really useful and not-surprising in C/C++ here, but I may well miss something.

There are some C++0x features, axiom and constexpr, which are related to this area. It's not clear yet to me how all this will interact, but I have good hope to see some connections there (like modeling some associativity rules with axioms).

--
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL, http://cgal.org/
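The folding hazard Richard mentions can be made concrete. The classic trick rounds a float to the nearest integer by adding and subtracting a large constant; it is only correct if the compiler does not fold or reassociate (x + C) - C, which is what a PAREN_EXPR-style barrier would guarantee. In this sketch, a volatile stands in for the barrier; the constant is the usual choice for IEEE single precision:

```cpp
#include <cassert>

// For |x| < 2^22, adding 1.5 * 2^23 pushes the fractional bits out of
// the significand, so the addition itself performs round-to-nearest;
// subtracting the constant recovers the rounded value.  Under
// -ffast-math, (x + C) - C would fold to plain x without a barrier;
// the volatile plays the role of the association barrier discussed.
float round_to_nearest(float x)
{
    const float C = 12582912.0f;   // 1.5 * 2^23
    volatile float t = x + C;      // rounding happens here; blocks folding
    return t - C;
}
```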
How to use a scratch register in "jump" pattern ?
Hi,

I need to use a scratch register in the "jump" pattern and I can't figure out how to do it properly.

My problem is that the microcontroller I'm porting gcc onto does not permit "far" jumps; those must be done using indirect addressing. So I wrote this:

-8<--8<-
(define_attr "length" "" (const_int 2))

(define_insn "*jump"
  [(set (pc) (label_ref (match_operand 0 "" "")))
   (clobber (match_scratch:QI 1 "=r"))]
  ""
{
  if (get_attr_length (insn) == 1)
    return "rjmp %0";
  else
    return "ldih %1,hi(%l0)\n\tldil %1,lo(%l0)\n\tijmp (%1)";
}
  [(set (attr "length")
        (if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -2048))
                           (le (minus (match_dup 0) (pc)) (const_int 2047)))
                      (const_int 1)
                      (const_int 2)))]
)

(define_expand "jump"
  [(set (pc) (label_ref (match_operand 0 "" "")))]
  ""
  ""
)
-8<--8<-

But it doesn't work:

...
(jump_insn 44 266 45 6 /tmp/src/gcc-4.3.1/libgcc/../gcc/libgcov.c:137 (set (pc)
        (label_ref 119)) -1 (nil))
/tmp/src/gcc-4.3.1/libgcc/../gcc/libgcov.c:577: internal compiler error: in extract_insn, at recog.c:1990
Please submit a full bug report, with preprocessed source if appropriate.

Any idea?

Side question regarding the "length" attribute. It seems to work OK, but if I change the default value (first line in my example) to 'const_int 1', I later get 'operand out of range' from the assembler because the 'rjmp' instruction was used for displacements bigger than 2048. How can this happen, since my '(set (attr "length")' code explicitly sets the correct value each time?

Thanks,
--
Stelian Pop
Re: Setting -frounding-math by default
Sylvain Pion wrote: > Joseph S. Myers wrote: >> On Mon, 9 Mar 2009, Sylvain Pion wrote: >> >>> Later, 1) started to be taken care of, and it was unfortunately >>> added under the control of the same -frounding-math option. >>> Which now, makes it harder to come back, since we want different >>> defaults for these two aspects. >>> >>> I have already mentioned in a bugzilla PR that it could be nice >>> to have 2 options, but IIRC, I did not get any reply to this. >> >> Patches to split the option into two *clearly-defined* options are >> more likely to be accepted than changing the defaults, given that the >> fast-math and related flags have been split more than once before. > > My goal is to have interval arithmetic work with the default flags, > without more workarounds in the code (and as efficiently as possible). > So, I'm not going to work on anything if it means only splitting it > in separate flags, if we don't agree a priori on changing the default > for at least one of those sub flags after that. We know that's what you want. What we don't know (well, what I don't know) is *why*. If you want to do something as specialized as interval arithmetic, what's the big deal with having to pass special flags to the compiler? Andrew.
Re: How to use a scratch register in "jump" pattern ?
Stelian Pop wrote:

> I need to use a scratch register in the "jump" pattern and I can't
> figure out how to do it properly.
>
> My problem is that the microcontroller I'm porting gcc onto does not
> permit "far" jumps; those must be done using indirect addressing.

> (define_insn "*jump"

> Any idea?

You should be able to define/use the "indirect_jump" pattern instead? Take a look at the last paragraph of the docs for the "call" pattern as well. I don't think it'll work to have a jump insn that is sometimes direct, sometimes indirect.

Looking at how the rs6000 backend handles direct and indirect calls and jumps might give you some inspiration too.

cheers,
DaveK
Re: variadic templates supported in non-c++0x mode
-pedantic-errors will make it an error. I don't feel strongly about whether these should be pedwarn or something stronger, but I note that libstdc++ wants to use variadic templates in the default mode, so we can't just disable them entirely. Jason
Re: Setting -frounding-math by default
On Tue, Mar 10, 2009 at 10:31 AM, Sylvain Pion wrote: [...] > If you are looking for the Right [tm] solution, please take a look at N2811. I'm familiar with that paper -- I should have made that disclosure in my previous message. -- Gaby
Re: Setting -frounding-math by default
On Tue, Mar 10, 2009 at 10:41 AM, Sylvain Pion wrote: > Richard Guenther wrote: >> >> The middle-end knows about an explicit association barrier (only >> used from the Fortran FE sofar), a PAREN_EXPR. Would exposing >> that to C/C++ be of any help? For example it would, even with >> -ffast-math, avoid constant folding for (x + FLT_EPS) - FLT_EPS >> (which FLT_EPS such that proper rounding to the nearest integer >> value is performed). > > Off the top of my head, I don't see anything both really useful > and not-surprising in C/C++ here, but I may well miss something. > > There are some C++0x features, axiom and constexpr, which are > related to this area. It's not clear yet to me how all this > will interact, but I have good hope to see some connections > there (like modeling some associativity rules with axioms). Note that axioms do not change the behaviour of a program -- assuming the program contains `no bugs'. -- Gaby
Re: How to use a scratch register in "jump" pattern ?
On Tue, Mar 10, 2009 at 05:20:28PM +, Dave Korn wrote:
> Stelian Pop wrote:
>
>> I need to use a scratch register in the "jump" pattern and I can't
>> figure out how to do it properly.
>>
>> My problem is that the microcontroller I'm porting gcc onto does not
>> permit "far" jumps; those must be done using indirect addressing.
>
>> (define_insn "*jump"
>
>> Any idea?
>
> You should be able to define/use the "indirect_jump" pattern instead?
> Take a look at the last paragraph of the docs for the "call" pattern
> as well.

I already do have an "indirect_jump" pattern in my md file:

(define_insn "indirect_jump"
  [(set (pc) (match_operand:QI 0 "register_operand" "r"))]
  ""
  "ijmp (%0)"
  [(set_attr "cc" "none")]
)

However, I didn't find a way to tell gcc that it cannot use the "direct" jump in some cases, and to force it to use the "indirect_jump" instead.

> I don't think it'll work to have a jump insn that is sometimes direct,
> sometimes indirect.
> Looking at how the rs6000 backend handles direct and indirect calls and
> jumps might give you some inspiration too.

Well, I already did look at all the backends, but didn't find the answer (although the answer probably is hidden somewhere inside those files...). As for the rs6000 backend, I see a simple direct call pattern, and I am not sure what you want me to look at. In rs6000.md:

(define_insn "jump"
  [(set (pc) (label_ref (match_operand 0 "" "")))]
  ""
  "b %l0"
  [(set_attr "type" "branch")])

Thanks, Stelian.
--
Stelian Pop
Re: Setting -frounding-math by default
Andrew Haley wrote:
> We know that's what you want.  What we don't know (well, what I don't
> know) is *why*.  If you want to do something as specialized as interval
> arithmetic, what's the big deal with having to pass special flags to
> the compiler?

I contest the "as specialized as" comment. I know that I may look like Don Quixote here, but you may imagine that many people look like Panurge's sheep to me on this point ;-)

Interval arithmetic is not supposed to be an obscure feature: it is a way to compute with real numbers which offers many advantages compared to the usual floating-point (FP) model (which is quite well supported by hardware and compilers). It directly competes with it on this ground (which means the potential market is huge), and is easily based on it for the implementation.

Some of the reasons why it is not used as much as it could be are that it has poor support from compilers, and could have better support from the hardware as well (with proper hardware support, interval operations would roughly be just as fast as floating-point, and adequate hardware support is not hard fundamentally). All this induces a tradition of training and education which puts FP first, and traps IA in a vicious circle: no hardware/software improvement => unfair comparison => no teaching => no demand => no improvement...

It would also benefit from standardization, but this is being taken care of (see the ongoing IEEE-1788, and the std::interval proposal for C++/TR2).

As I said in another mail, if you have code which uses interval arithmetic deep down in an application, but it happens to come from, say, Boost.Interval, which is a library providing inline functions for efficiency reasons (not precompiled), then you need to pass these flags when compiling the whole application (the translation units that include those headers) as well. And that's a pain, especially with all those template/modern libraries which have everything in headers.
That's a concrete problem I want to solve that affects my pet library (see my signature) and its users.

Now, what's so good about intervals that nobody sees? I think that nobody sees the cost of floating-point, which is that it is a complicated model of approximation of the reals, which forces tons of people to learn details about it, while all they want is to compute with real numbers. You need to teach all beginners its pitfalls; this is one of the first things you need to do.

Doing the same thing with intervals would be much easier to teach, and would give strong guarantees on results which everybody mastering real numbers would understand easily. Concrete example: you have heard the motto "never compare FP for equality", which some people try to have beginners learn, while experts know the reality behind it, which is: it depends what you do (meaning: learn more). With intervals, you simply get a clean semantic: true, false, or "I don't know", which tells you that you have to do something special. No surprise. Everything is rock solid, and you don't need endless discussions around "should I use -ffast-math in my code, or which subset of its sub-flags is best...". Admittedly, filling in the "I don't know" can require some work, but it is clear where work is needed, and you may decide to ignore it (for -ffast-math users :-) ).

This was an argument about the cost of teaching to the "masses" (aka "beginners", who in fact already master real-number maths, and who naively expect computers to be a tool that helps them instead of a tool that hurts their brain). For advanced scientific computing, if you had intervals as fast as floating-point, then a lot of complicated work on, say, static code analysis for roundoff error propagation (the parts of it which try to emulate IA with FP by computing bounds at compile-time) would become pointless, as the hardware would take care of it.
Also, even beyond that, my guess is that formal proofs of algorithms and programs dealing with real numbers would be much simplified if they based their models on intervals rather than floating-point. Try to evaluate the global educational cost of producing experts in this area?

My point about improving compiler support first is that I see it as a first step to help reach a critical mass of applications and users, in order to economically justify hardware support. (At which point compiler support will be trivial, but unfortunately we are not there yet. It's like we had to have FP emulators in the past.) I may be wrong about that.

I don't mean IA is perfect or magic, solving everything without thinking: certainly convergence and stability issues are similar to those with FP. But, IMO, it's an improvement over FP which should be considered on the same ground as the improvement that FP was over integers 2-3 decades ago.

Now, you know the *why*. I'm not sure whether I convinced you, but I'd be glad to have some help with the *how* ;-)

--
Sylvain Pion
INRIA Sophia-Antipolis
Geometrica Project-Team
CGAL,
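The tri-valued comparison semantics described above can be sketched in a few lines. This is a toy model for illustration, not the Boost.Interval or proposed std::interval API:

```cpp
#include <cassert>

// A toy interval, and a comparison that yields true, false, or
// "I don't know" when the intervals overlap.
enum tribool { certainly_false, certainly_true, indeterminate };

struct interval { double lo, hi; };   // invariant: lo <= hi

tribool less(interval a, interval b)
{
    if (a.hi < b.lo)  return certainly_true;    // every a < every b
    if (b.hi <= a.lo) return certainly_false;   // no a below any b
    return indeterminate;                       // overlap: must refine
}
```

The "indeterminate" result is exactly the "do something special" signal mentioned above: the caller can refine the intervals (e.g. recompute with higher precision) or accept the uncertainty.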
Re: Setting -frounding-math by default
Gabriel Dos Reis wrote:
> On Tue, Mar 10, 2009 at 10:41 AM, Sylvain Pion wrote:
>> Richard Guenther wrote:
>>> The middle-end knows about an explicit association barrier (only
>>> used from the Fortran FE sofar), a PAREN_EXPR. Would exposing
>>> that to C/C++ be of any help? For example it would, even with
>>> -ffast-math, avoid constant folding for (x + FLT_EPS) - FLT_EPS
>>> (which FLT_EPS such that proper rounding to the nearest integer
>>> value is performed).
>>
>> Off the top of my head, I don't see anything both really useful
>> and not-surprising in C/C++ here, but I may well miss something.
>>
>> There are some C++0x features, axiom and constexpr, which are
>> related to this area. It's not clear yet to me how all this
>> will interact, but I have good hope to see some connections
>> there (like modeling some associativity rules with axioms).
>
> Note that axioms do not change the behaviour of a program -- assuming
> the program contains `no bugs'.

I don't mean to change the behavior of a program without changing its code. But you can change its code so that it describes whether a particular function can support associativity transformations. (Then it's up to the compiler to realize what the axiom is about, but that's another story: you can imagine a simple, non-general implementation recognizing standard axioms as "built-ins", and triggering the corresponding -fassociative-math on the functions which are constrained by this concept+axiom.)

One way I could imagine to have axioms and associativity rules on FP interact in a program would be to write a wrapper type, say std::associative, which would just wrap and forward the operations to double, but which would, in addition, have a concept_map for a concept whose axioms would specify the associativity rules. Then, if you call a function which you know is OK if the compiler does the associativity transformations, you can just call it with std::associative arguments. Other ways might be possible. E.g.
avoiding the wrapping if the associativity concept/axiom is somehow local to the function and does not leak elsewhere if you provide an Associativity concept_map for double (or are "scoped concept maps" about this issue ? I should learn about this.). -- Sylvain Pion INRIA Sophia-Antipolis Geometrica Project-Team CGAL, http://cgal.org/
Re: Setting -frounding-math by default
On Tue, Mar 10, 2009 at 12:27 PM, Sylvain Pion wrote: > Other ways might be possible. E.g. avoiding the wrapping if > the associativity concept/axiom is somehow local to the function > and does not leak elsewhere if you provide an Associativity > concept_map for double (or are "scoped concept maps" about > this issue ? I should learn about this.). A general idea behind axioms is that they hold for the values and archetypes in a generic function -- even if globally they may not be universally true. That is the local aspect of axioms. Of course, you can make (global) concept_maps too. Anyway, I think you would NOT want the compiler to disable useful optimizations globally, but only where some transformations may not be appropriate. -- Gaby
The Linux binutils 2.19.51.0.3 is released
This is the beta release of binutils 2.19.51.0.3 for Linux, which is
based on binutils 2009 0310 in CVS on sourceware.org plus various
changes. It is purely for Linux.

All relevant patches in patches have been applied to the source tree.
You can take a look at patches/README to see what has been applied and
in what order.

Starting from the 2.18.50.0.4 release, the x86 assembler no longer
accepts

	fnstsw %eax

fnstsw stores 16 bits into %ax and leaves the upper 16 bits of %eax
unchanged. Please use

	fnstsw %ax

Starting from the 2.17.50.0.4 release, the default output section LMA
(load memory address) has changed for allocatable sections from being
equal to VMA (virtual memory address), to keeping the difference
between LMA and VMA the same as the previous output section in the same
region. For

	.data.init_task : { *(.data.init_task) }

the LMA of the .data.init_task section is equal to its VMA with the old
linker. With the new linker, it depends on the previous output section.
You can use

	.data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that the LMA of the .data.init_task section is always equal
to its VMA. The linker script in the older 2.6 x86-64 kernel depends on
the old behavior. You can add AT (ADDR(section)) to force the LMA of
the .data.init_task section to equal its VMA. It will work with both
old and new linkers. The x86-64 kernel linker script in kernel 2.6.13
and above is OK.

The new x86_64 assembler no longer accepts

	monitor %eax,%ecx,%edx

You should use

	monitor %rax,%ecx,%edx

or

	monitor

which works with both old and new x86_64 assemblers. They should
generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,

	movl (%eax),%ds
	movl %ds,(%eax)

To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,

	mov (%eax),%ds
	mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support

	movw (%eax),%ds
	movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are
available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler now defaults to tuning for Itanium 2 processors. To
build a kernel for Itanium 1 processors, you will need to add

ifeq ($(CONFIG_ITANIUM),y)
CFLAGS += -Wa,-mtune=itanium1
AFLAGS += -Wa,-mtune=itanium1
endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.19.51.0.3 to
hjl.to...@gmail.com and http://www.sourceware.org/bugzilla/

Changes from binutils 2.19.51.0.2:

1. Update from binutils 2009 0310.
2. Fix strip on common symbols in relocatable file. PR 9933.
3. Fix --enable-targets=all build.
4. Fix ia64 build with -Wformat-security. PR 9874.
5. Add REGION_ALIAS support in linker script.
6. Add thin archive support to readelf.
7. Improve DWARF support in objdump.
8. Improve alpha support.
9. Improve arm support.
10. Improve hppa support.
11. Improve m68k support.
12. Improve mips support.
13. Improve ppc support.
14. Improve xtensa support.
15. Add score 7 support.

Changes from binutils 2.19.51.0.1:

1. Update from binutils 2009 0204.
2. Support AVX Programming Reference (January, 2009)
3. Improve .s suffix support in x86 disassembler.
4. Add --prefix/--prefix-strip for objdump -S. PR 9784.
5. Change "ld --as-needed" to resolve undefined references in DSO.
6. Add -Ttext-segment to ld to set address of text segment.
7. Fix "ld -r --gc-sections --entry" crash with COMDAT group. PR 9727.
8. Improve linker compatibility for g++ 3.4 `.gnu.linkonce.r.*.
9. Add VMS/ia64 support.
10. Improve arm support.
11. Improve cris support.
12. Improve m68k support.
13. Improve mips support.
14. Improve spu support.

Changes from binutils 2.19.50.0.1:

1. Update from binutils 2009 0106.
2. Support AVX Programming Reference (December, 2008)
3. Encode AVX insns with 2byte VEX prefix if possible.
4. Add .s suffix support to swap register operands to x86 assembler.
5. Properly select NOP insns for code alignment in x86 assembler.
6. Fix 2 symbol visibility linker bugs. PRs 9676/9679.
7. Fix an ia64 linker relaxation bug. PR 7036.
8. Fix a symbol versioning bug. PR 7047.
9. Fix uninitialized data in linker. PR 7028.
10. Avoid a linker crash on bad input. PR 7023.
11. Fix a linker memory leak. PR 7012.
12. Fix strip/objcopy crash on PT_GNU_RELRO. PR 7011.
13. Improve MacOS support.
14. Fix a COFF linker bug. PR 6945.
15. Add LM32 support.
16. Fix various arm bugs.
17. Fix various avr bugs.
18. Fix various CR16 bugs.
19. Fix various cris bugs.
20. Fix various m32c bugs.
21. Fix various m68k bugs.
22. Fix various mips bugs.
23. Fix va
[RFC] Unused variable in profile.c
I accidentally found that the local variable num_never_executed in the
function compute_branch_probabilities is initialized to zero and never
gets modified after that.

I suppose the statement in line 603:

	num_branches++, num_never_executed;

was intended to be:

	num_branches++, num_never_executed++;

Right ?

Thanks,
Edmar
Re: [RFC] Unused variable in profile.c
On Tue, Mar 10, 2009 at 12:58 PM, Edmar Wienskoski wrote:
> I accidentally found that the local variable num_never_executed
> in function compute_branch_probabilities, is initialized to zero
> and never gets modified after that.
>
> I suppose the statement in line 603:
> num_branches++, num_never_executed;
> was intended to be:
> num_branches++, num_never_executed++;
>
> Right ?

Looks that way. It is only used for dump files, so it will not change
code generation, and it looks like an obvious change.

Thanks,
Andrew Pinski
[c++0x] DR 387 implementation is incomplete.
Hello,

It seems that the implementation of DR 387 in trunk's <complex> is
incomplete: both versions of operator+,-(const _Tp&) still assume
real() is an lvalue (see below). These are the only offenders, AFAICS.

Is somebody on this, or should I file a PR?

Regards, Jan.

The program:

#include <complex>
typedef std::complex<double> C;

C f1(C& c) { return c+1.0; }
C f2(C& c) { return c-1.0; }
C f3(C& c) { return 1.0+c; }
C f4(C& c) { return 1.0-c; }

$ g++ -std=c++0x -c t.cpp
In file included from t.cpp:1:
/home/jan/local/gcc-head/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0/complex: In function ‘std::complex<_Tp> std::operator+(const std::complex<_Tp>&, const _Tp&) [with _Tp = double]’:
t.cpp:3: instantiated from here
/home/jan/local/gcc-head/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0/complex:336: error: lvalue required as left operand of assignment
/home/jan/local/gcc-head/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0/complex: In function ‘std::complex<_Tp> std::operator-(const std::complex<_Tp>&, const _Tp&) [with _Tp = double]’:
t.cpp:4: instantiated from here
/home/jan/local/gcc-head/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0/complex:366: error: lvalue required as left operand of assignment
/home/jan/local/gcc-head/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0/complex: In function ‘std::complex<_Tp> std::operator+(const _Tp&, const std::complex<_Tp>&) [with _Tp = double]’:
t.cpp:5: instantiated from here
/home/jan/local/gcc-head/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0/complex:345: error: lvalue required as left operand of assignment
/home/jan/local/gcc-head/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0/complex: In function ‘std::complex<_Tp> std::operator-(const _Tp&, const std::complex<_Tp>&) [with _Tp = double]’:
t.cpp:6: instantiated from here
/home/jan/local/gcc-head/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0/complex:375: error: lvalue required as left operand of assignment
Re: variadic templates supported in non-c++0x mode
2009/3/10 Sylvain Pion:
> Manuel López-Ibáñez wrote:
>> 2009/3/10 Sylvain Pion:
>>> Yes, but like any extension, it's nice to be able to disable them
>>> as errors, so as to be able to use GCC for checking code
>>> portability.
>>
>> So use -pedantic-errors as it says in the manual. You should really
>> use -pedantic-errors if you do not want extensions.
>
> Sure. It just forces you to have additional compiler-specific flags
> in your configuration system.

I don't think that's unreasonable if you want to use GCC for checking
portability. The primary use case is as the GNU compiler, not a
portability checker. If you're arguing for GCC to disable all
extensions by default, that's another topic, and a contentious one :-)

Jonathan
Re: How to use a scratch register in "jump" pattern ?
Stelian Pop schrieb:
> Hi,
>
> I need to use a scratch register in the "jump" pattern and I can't
> figure out how to do it properly.
>
> My problem is that the microcontroller I'm porting gcc onto does not
> permit "far" jumps, those must be done using an indirect addressing.
> So I wrote this:
>
> -8<--8<-
> (define_attr "length" "" (const_int 2))
>
> (define_insn "*jump"
>   [(set (pc) (label_ref (match_operand 0 "" "")))
>    (clobber (match_scratch:QI 1 "=r"))]
>   ""
> {
>   if (get_attr_length (insn) == 1)
>     return "rjmp %0";
>   else
>     return "ldih %1,hi(%l0)\n\tldil %1,lo(%l0)\n\tijmp (%1)";
> }
>   [(set (attr "length")
>         (if_then_else
>           (and (ge (minus (match_dup 0) (pc)) (const_int -2048))
>                (le (minus (match_dup 0) (pc)) (const_int 2047)))
>           (const_int 1)
>           (const_int 2)))]
> )
>
> (define_expand "jump"
>   [(set (pc) (label_ref (match_operand 0 "" "")))]
>   ""
>   ""
> )
> -8<--8<-
>
> But it doesn't work:
>
> ...
> (jump_insn 44 266 45 6 /tmp/src/gcc-4.3.1/libgcc/../gcc/libgcov.c:137
>     (set (pc) (label_ref 119)) -1 (nil))
> /tmp/src/gcc-4.3.1/libgcc/../gcc/libgcov.c:577: internal compiler
> error: in extract_insn, at recog.c:1990
> Please submit a full bug report, with preprocessed source if
> appropriate.
>
> Any idea ?

Note that no one is generating an insn that looks like "*jump". Maybe
insn combine and peep2 would try to build such a pattern, but that is
not what helps you out here. Write the expander as a parallel of a
(set (pc) ...) and a (clobber (match_scratch ...)) so that an
appropriate insn is expanded. Also note that "*jump" is an (implicit)
parallel. As constraint for the "*jump" insn use an "X" in the case you
do not need the clobber reg and something like "=&r" in the case you
really need it.

> Side question regarding the "length" attribute. It seems to work ok,
> but if I change the default value (first line in my example) to be
> 'const_int 1', I later get 'operand out of range' from the assembler
> because the 'rjmp' instruction was used for displacements bigger than
> 2048. How can this happen, since my '(set (attr "length")' code
> explicitly sets the correct value each time ?

You set the length explicitly, so the default does not matter.

Georg-Johann
Re: How to use a scratch register in "jump" pattern ?
On Tue, Mar 10, 2009 at 10:18:10PM +0100, Georg-Johann Lay wrote:
> Note that no one is generating an insn that looks like "*jump". Maybe
> insn combine and peep2 would try to build such a pattern, but that is
> not what helps you out here. Write the expander as parallel of a (set
> (pc) ...) and a (clobber (match_scratch ...)) so that an appropriate
> insn is expanded.

Like this ?

(define_expand "jump"
  [(parallel [(set (pc) (label_ref (match_operand 0 "" "")))
              (clobber (match_scratch:QI 1 "=&r"))])]
  ""
  ""
)

> Also note that "*jump" is an (implicit) parallel. As constraint for
> the "*jump" insn use an "X" in the case you do not need the clobber
> reg and sth. like "=&r" in the case you really need it.

Ok, but the decision on whether I need the clobber reg or not is based
on the 'length' attribute. So I could write the following, but where
can I put the calculation and the test of the 'length' attribute?

(define_insn "*jump_internal"
  [(set (pc) (label_ref (match_operand 0 "" "")))
   (clobber (match_scratch:QI 1 "X,=&r"))]
  ""
  "@
   rjmp %0
   ldih %1,hi(%l0)\n\tldil %1,lo(%l0)\n\tijmp (%1)"
)

>> Side question regarding the "length" attribute. It seems to work ok,
>> but if I change the default value (first line in my example) to be
>> 'const_int 1', I later get 'operand out of range' from the assembler
>> because the 'rjmp' instruction was used for displacements bigger
>> than 2048. How can this happen, since my '(set (attr "length")' code
>> explicitly sets the correct value each time ?
>
> You set the length explicitly, so the default does not matter.

Yes, this was exactly my point. It shouldn't matter, but it does,
because it does different things when I change the default value.

Stelian.

--
Stelian Pop
softfloat symbol visibility in libgcc.a/libgcc_s.so (fp-bit/dp-bit)
if working with a softfloat toolchain, we end up with copies of
softfloat symbols everywhere (from fp-bit.c and dp-bit.c). should these
files really end up with symbols with hidden visibility ? seems like a
waste to force copying of these symbols into binaries when libgcc_s.so
itself already has a copy of them, and when dealing with softfloat
toolchains, most things need this library loaded anyways.

this behavior can be seen with bfin-linux-uclibc and sh4-linux-gnu
toolchains.

$ bfin-linux-uclibc-readelf -s libgcc.a | grep unpack
    14:   152 FUNC  GLOBAL HIDDEN    1 ___unpack_f
    14:   198 FUNC  GLOBAL HIDDEN    1 ___unpack_d

$ bfin-linux-uclibc-readelf -s libgcc_s.so | grep unpack
   244: 6a44   198 FUNC  LOCAL  DEFAULT  10 ___unpack_d
   250: 5ef4   152 FUNC  LOCAL  DEFAULT  10 ___unpack_f

then looking at a typical Linux build, we see these symbols being
copied into the C library and many basic libraries (ncurses/ts/etc...)
as well as programs (shell/etc...). even though they're also linked
against libgcc_s.so, the libgcc.a archive provided the symbols since
they weren't exported from libgcc_s.so.

perhaps we need to extend the libgcc.map function to allow people to
insert $(FPBIT_FUNCS) and such into the map so libgcc_s.so exports
these suckers ?
-mike
Re: [c++0x] DR 387 implementation is incomplete.
Hi, > Is somebody on this, or should I file a PR? > I'll fix it, thanks for your report. Paolo. PS: in the future remember to also CC libstdc++, to be sure.
Re: softfloat symbol visibility in libgcc.a/libgcc_s.so (fp-bit/dp-bit)
On Tue, 10 Mar 2009, Mike Frysinger wrote:
> perhaps we need to extend the libgcc.map function to allow people to
> insert $(FPBIT_FUNCS) and such into the map so libgcc_s.so exports
> these suckers ?

Exporting functions that are internal to fp-bit rather than part of the
documented libgcc interface has the disadvantage that you then need to
keep them around when targets change to other software floating-point
implementations (such as soft-fp, which is significantly faster than
fp-bit, or assembly implementations for particular targets).

Perhaps you might like to convert your target to soft-fp to avoid the
problem. fp-bit has slightly smaller code size and supports 16-bit
targets which soft-fp may not, but for any 32-bit or 64-bit target a
few kB more in libgcc shouldn't be significant and the speed
improvements are substantial. (If you want exception and rounding mode
support for software floating-point, soft-fp can do that as well, but
you can also configure it with that support disabled.)

--
Joseph S. Myers
jos...@codesourcery.com
Re: softfloat symbol visibility in libgcc.a/libgcc_s.so (fp-bit/dp-bit)
On Tuesday 10 March 2009 21:44:23 Joseph S. Myers wrote:
> On Tue, 10 Mar 2009, Mike Frysinger wrote:
> > perhaps we need to extend the libgcc.map function to allow people
> > to insert $(FPBIT_FUNCS) and such into the map so libgcc_s.so
> > exports these suckers ?
>
> Exporting functions that are internal to fp-bit rather than part of
> the documented libgcc interface has the disadvantage that you then
> need to keep them around when targets change to other software
> floating-point implementations (such as soft-fp, which is
> significantly faster than fp-bit, or assembly implementations for
> particular targets).
>
> Perhaps you might like to convert your target to soft-fp to avoid the
> problem. fp-bit has slightly smaller code size and supports 16-bit
> targets which soft-fp may not, but for any 32-bit or 64-bit target a
> few kB more in libgcc shouldn't be significant and the speed
> improvements are substantial. (If you want exception and rounding
> mode support for software floating-point, soft-fp can do that as
> well, but you can also configure it with that support disabled.)

you're saying the ABI of soft-fp is stable and is part of the exported
libgcc_s.so interface ?
-mike
Re: softfloat symbol visibility in libgcc.a/libgcc_s.so (fp-bit/dp-bit)
On Tue, 10 Mar 2009, Mike Frysinger wrote: > On Tuesday 10 March 2009 21:44:23 Joseph S. Myers wrote: > > On Tue, 10 Mar 2009, Mike Frysinger wrote: > > > perhaps we need to extend the libgcc.map function to allow people to > > > insert $(FPBIT_FUNCS) and such into the map so libgcc_s.so exports these > > > suckers ? > > > > Exporting functions that are internal to fp-bit rather than part of the > > documented libgcc interface has the disadvantage that you then need to > > keep them around when targets change to other software floating-point > > implementations (such as soft-fp, which is significantly faster than > > fp-bit, or assembly implementations for particular targets). > > > > Perhaps you might like to convert your target to soft-fp to avoid the > > problem. fp-bit has slightly smaller code size and supports 16-bit > > targets which soft-fp may not, but for any 32-bit or 64-bit target a few > > kB more in libgcc shouldn't be significant and the speed improvements are > > substantial. (If you want exception and rounding mode support for > > software floating-point, soft-fp can do that as well, but you can also > > configure it with that support disabled.) > > you're saying the ABI of soft-fp is stable and is part of the export > libgcc_s.so interface ? Yes. Both fp-bit and soft-fp provide functions documented under "Soft float library routines" in libgcc.texi. These are part of the public exported libgcc_s.so interface. Some such functions are also provided by libgcc2.c. fp-bit does not provide those; soft-fp can but doesn't have to, depending on the setting of softfp_exclude_libgcc2. (You want softfp_exclude_libgcc2 := n, unless you have both hard-float and soft-float multilibs, when setting it to y avoids hard-float multilibs using soft-fp implementations of these functions. 
The soft-fp implementations are expected to be better for soft-float
(the operations can be carried out directly rather than wrapping other
operations), but worse for hard-float (wrapping a related hard-float
operation is better than implementing it all in soft-float).)

soft-fp does not have any functions other than those forming part of
the libgcc interface; everything is inlined in those functions. fp-bit
has a number of internal helper functions built into separate objects.
It's these helper functions you were seeing exported only from the
static libgcc, as hidden functions. Because they are not part of the
libgcc interface they should not be exported from the shared libgcc.

I don't know why shared libraries linked with the shared libgcc should
get hidden copies of these from the static libgcc (especially since I
didn't think we linked shared libraries with static libgcc at all at
present - this is a bug, they do sometimes need to be able to get
static-only symbols from libgcc). C executables link with static libgcc
by default rather than shared libgcc.

--
Joseph S. Myers
jos...@codesourcery.com
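For anyone attempting the suggested conversion, the wiring happens in the target's makefile fragment. The sketch below is hypothetical (the `<target>` placeholder and mode lists are illustrative; check config/t-softfp and an existing soft-fp user in the GCC tree for the real conventions before copying anything):

```make
# Hypothetical t-<target>-softfp fragment: mode lists tell t-softfp
# which conversions and operations to build; the machine header
# supplies the target-specific primitives.
softfp_float_modes := sf df
softfp_int_modes := si di
softfp_extensions := sfdf
softfp_truncations := dfsf
softfp_machine_header := <target>/sfp-machine.h
# Per Joseph's note above: "n" unless you have both hard- and
# soft-float multilibs.
softfp_exclude_libgcc2 := n
```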
Re: bitfields: types vs modes?
On Tue, 10 Mar 2009, DJ Delorie wrote:
> One of our customers has a chip with memory-mapped peripheral
> registers that need to be accessed in a specific mode. The registers
> represent bitfields within the hardware, so a volatile struct is an
> obvious choice to represent them in C.

Thank you for those words. ;) (There was a "minor argument" on IRC some
time ago on a related subject.)

> However, gcc has a very simplistic heuristic for deciding what mode
> to use to access bitfields in structures - it uses either the biggest
> or smallest mode practical. This offers the programmer no way to tell
> gcc what mode the accesses need to be in, aside from manually
> reading/writing memory with integer types and casting.

And not even *then* are you guaranteed the intended semantics as per
the documentation (though at least each target should be able to
promise that, IMNSHO). IMHO GCC should have the means to do what you
suggest for *all* targets; memory-mapped I/O that needs to be accessed
in a specific mode is a general-enough situation.

> Options? My thought, after some internal discussion, is that (if the
> target chooses) we allow gcc to honor the type of a volatile bitfield
> as the mode as long as it can do so without otherwise violating the
> structure's layout. Some new hook will be needed for the backend, and
> perhaps a -W option for when the type cannot be honored.
>
> I.e. if the programmer is careful enough to properly lay out the
> struct, the programmer should get what the programmer asks for.

Can you provide example code? I'm confused enough to believe that you
*should* get this effect with PCC_BITFIELD_TYPE_MATTERS (modulo current
bugs).

> Comments?

The concept is certainly agreeable. I'd recommend -fno-tree-sra when
you inspect current behavior; there be bugs there. Sorry, no PR yet.

brgds, H-P
Re: bitfields: types vs modes?
> Can you provide example code? I'm confused enough to believe
> that you *should* get this effect with PCC_BITFIELD_TYPE_MATTERS
> (modulo current bugs).

Imagine a device with four 8-bit registers followed by a 32-bit
register with an 8-bit field:

byte  status   (read-only, clears after reading)
byte  control  (read/write)
byte  tx buf   (write only)
byte  rx buf   (read only)
long  uart configuration
      divisor:8 bits:3 stop:1 start:1 reserved:3
      clock source:8 pin_selection:8

so, you'd do this:

typedef struct {
  char status:8;
  char control:8;
  char tx:8;
  char rx:8;
  long divisor:8;
  long bits:3;
  long stop:1;
  long start:1;
  long reserved:3;
  long clock_source:8;
  long pin_selection:8;
} UartDev;

extern volatile UartDev uart1;

If you use SImode to access any of the first four registers, you may
end up clearing a status bit you haven't read yet. If you use QImode to
write the divisor, you may end up clearing the other bits if gcc
doesn't drive the right values onto the bus.

With our current code, the mode for access for the above would either
always be QI or always be SI; there's no way to have QI for the first
four and SI for the rest. The culprit is get_best_mode(), which doesn't
know the C type of the field.

PCC_BITFIELD_TYPE_MATTERS seems to only control layout, not access.
Even if it did solve the type-v-mode problem, turning it on would break
ABI compatibility.