Re: Two suggestions for gcc C compiler to extend C language (by WD Smith)
Quoting Warren D Smith:
> Also, I'm somewhat amazed how it is argued to me that a 9-bit machine,
> the PDP-10, is covered by C fine, but yet, C insists on having everything
> a multiple of 8 bits with padding bits disallowed, and that too is fine,
> and both these facts refute me.

Wrong. The C language does not insist on multiples of 8. As far as the C standard is concerned it would be perfectly fine to have 9-bit chars, 18-bit short ints, 27-bit ints, 36-bit long, and 72-bit long long. I do not believe GCC has support for any targets like that, but it is allowed by C.

And, except for unsigned char, any and all of the integer types can have padding bits or disallowed bit patterns. Again, perhaps not in any of the targets that GCC supports, but it is allowed by the C standard.
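To make the point concrete, portable code queries these properties from <limits.h> rather than assuming 8-bit bytes. A small sketch of my own (not from the thread):

  #include <limits.h>
  #include <stdio.h>

  int main(void)
  {
      /* ISO C only requires CHAR_BIT >= 8; a 9-bit-char target would report 9 here. */
      printf("bits per char: %d\n", CHAR_BIT);

      /* Storage bits of unsigned int; because padding bits are allowed,
         the number of value bits may be smaller than this. */
      printf("storage bits in unsigned int: %d\n",
             (int)(sizeof(unsigned int) * CHAR_BIT));

      printf("UINT_MAX = %u\n", UINT_MAX);
      return 0;
  }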
Re: Two suggestions for gcc C compiler to extend C language (by WD Smith)
On 26/07/16 21:06, Warren D Smith wrote:
> OK, you just said you've used packed nybble arrays a couple of times.

Yes, a couple of times in 20+ years. And I work with the kind of programming where something like nibble arrays could conceivably be useful. For most C programmers, "int" is the only integer type they ever need or use, and they will probably not even have heard of a "nibble". It is absolutely a non-issue.

> Multiplying you by 100,000 that proves if you put it in GCC,
> you'd save 200,000 programmer hours, which is equivalent to saving
> over 2 lives.

That claim is absurd in every way imaginable.

> You just said you've written your own double-length multiply.
> Same proof.

Again, you are completely wrong. Double-length multiplies are very rarely necessary, so there is no point in having extra compiler features in order to make them simpler. In the rare cases where they /are/ needed, it is not hard to implement them yourself. And gcc has already made the process easier and more efficient over the years, through real and beneficial advances that have the side-effect of making double-length multiplies easier and more efficient: general optimisation improvements make the code more efficient, the C99 "long long" type removed almost all use-cases for manual extended integer arithmetic, and the __builtin_XXX_overflow functions are available.

It is hard to imagine how you could be more wrong in what you write. I don't know whether you are just ignorant and angry at something, or trolling.

> Thank you for proving my point.
>
> How many deaths does it take before it is worth putting into GCC?

You are now claiming that adding a double-length "mul" builtin to gcc would save lives? Seriously?

> And it isn't like I'm suggesting something hard, or which would be
> unattractive to users.

Making a double-length multiply built-in would not be very hard - but it is far easier to write it in C source code than to add it to the compiler. And yes, it would be unattractive to users - for almost everybody, it would be useless, while occupying an identifier "mul" that could conflict with their own code. And of course there is the key point that the gcc developers are not lazy, despite your claims - they are extremely dedicated and hard-working, often in their free time. There are large numbers of /useful/ features (and occasional bug fixes!) on their "todo" lists. Any time spent on meaningless ideas that can already be handled perfectly well is time they cannot spend on something people actually want.

> And thanks for the tip on how to do add-with-carry.
> That's nice. Now I have to ask, now you've helpfully demonstrated
> how nice it can be, why not put that niceness inside GCC?

gcc /has/ that "niceness" - the code I wrote works fine with gcc. (And gcc has the __builtin_XXX_overflow functions - I suspect you did not bother looking at the documentation about the features gcc supports before ranting about what's missing.)

> I mean, if GCC already is going to
> provide div(a,b) -- it was not me who asked for that, it was GCC that
> decided to provide it --

"div" was specified in the earliest versions of the C standards, before gcc was conceived.

> which I could have just got in plain C using q=a/b; r=a%b; and depended on
> the optimizer, zero thought required -- then how can you justify GCC
> *not* providing addc(a,b) when it is trickier for the programmer, so
> you are clearly providing something more helpful since it was more
> tricky?

"div" is documented in the standards - the compiler or library has no option but to support it, even if people can now get at least as good code (in most cases) by writing the operations individually. It is true that it was the gcc folks who made the decision to have a builtin version as an alternative to a library call. They have tried to do that for many simple or commonly used standard library functions, because it is useful for a lot of code.

> Why am I bothering? You prove my point then act as though you proved
> the opposite.
>
> Concrete examples? Hell, I suggested stdint.h years and years before
> it came along, and I was told I was an idiot. I suggested making a
> lot of library functions be builtins, told I was an idiot, and now lo
> and behold, years and years later, gcc makes many library functions
> builtins. I complained the stdio library was a disaster waiting to
> happen with buffer overflows, told I was an idiot, and lo and behold,
> years and years later people keep trying to work around that, with at
> least two people having written nonstandard replacement libraries to
> try for safety, and huge billions of dollars estimated to be lost due
> to this bad design.

Judging by your writing here, you were told that you were an idiot because you are an idiot. There may be the occasional good idea hidden behind the piles of nonsense, insults, rants, accusations and complaints, but they would be hard to see.
Re: [gimplefe] hacking pass manager
On Tue, Jul 26, 2016 at 11:38 PM, Prathamesh Kulkarni wrote:
> On 27 July 2016 at 00:20, Prasad Ghangal wrote:
>> On 20 July 2016 at 18:28, Richard Biener wrote:
>>> On Wed, Jul 20, 2016 at 1:46 PM, Prathamesh Kulkarni wrote:
>>>> On 20 July 2016 at 11:34, Richard Biener wrote:
>>>>> On Tue, Jul 19, 2016 at 10:09 PM, Prasad Ghangal wrote:
>>>>>> On 19 July 2016 at 11:04, Richard Biener wrote:
>>>>>>> On July 18, 2016 11:05:58 PM GMT+02:00, David Malcolm wrote:
>>>>>>>> On Tue, 2016-07-19 at 00:52 +0530, Prasad Ghangal wrote:
>>>>>>>>> On 19 July 2016 at 00:25, Richard Biener wrote:
>>>>>>>>>> On July 18, 2016 8:28:15 PM GMT+02:00, Prasad Ghangal <prasad.ghan...@gmail.com> wrote:
>>>>>>>>>>> On 15 July 2016 at 16:13, Richard Biener <richard.guent...@gmail.com> wrote:
>>>>>>>>>>>> On Sun, Jul 10, 2016 at 6:13 PM, Prasad Ghangal wrote:
>>>>>>>>>>>>> On 8 July 2016 at 13:13, Richard Biener <richard.guent...@gmail.com> wrote:
>>>>>>>>>>>>>> On Thu, Jul 7, 2016 at 9:45 PM, Prasad Ghangal wrote:
>>>>>>>>>>>>>>> On 6 July 2016 at 14:24, Richard Biener wrote:
>>>>>>>>>>>>>>>> On Wed, Jul 6, 2016 at 9:51 AM, Prasad Ghangal wrote:
>>>>>>>>>>>>>>>>> On 30 June 2016 at 17:10, Richard Biener wrote:
>>>>>>>>>>>>>>>>>> On Wed, Jun 29, 2016 at 9:13 PM, Prasad Ghangal wrote:
>>>>>>>>>>>>>>>>>>> On 29 June 2016 at 22:15, Richard Biener wrote:
>>>>>>>>>>>>>>>>>>>> On June 29, 2016 6:20:29 PM GMT+02:00, Prathamesh Kulkarni wrote:
>>>>>>>>>>>>>>>>>>>>> On 18 June 2016 at 12:02, Prasad Ghangal wrote:
>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I tried hacking the pass manager to execute only given passes.
>>>>>>>>>>>>>>>>>>>>>> For this I am adding a new member opt_pass *custom_pass_list to
>>>>>>>>>>>>>>>>>>>>>> the function structure to store the passes that need to be
>>>>>>>>>>>>>>>>>>>>>> executed, and providing custom_pass_list to the
>>>>>>>>>>>>>>>>>>>>>> execute_pass_list() function instead of all passes.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> For a test case like
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>   int a;
>>>>>>>>>>>>>>>>>>>>>>   void __GIMPLE (execute ("tree-ccp1", "tree-fre1")) foo()
>>>>>>>>>>>>>>>>>>>>>>   {
>>>>>>>>>>>>>>>>>>>>>>   bb_1:
>>>>>>>>>>>>>>>>>>>>>>     a = 1 + a;
>>>>>>>>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> it will execute only the given passes, i.e. the ccp1 and fre1
>>>>>>>>>>>>>>>>>>>>>> passes, on the function, and for a test case like
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>   int a;
>>>>>>>>>>>>>>>>>>>>>>   void __GIMPLE (startwith ("tree-ccp1")) foo()
>>>>>>>>>>>>>>>>>>>>>>   {
>>>>>>>>>>>>>>>>>>>>>>   bb_1:
>>>>>>>>>>>>>>>>>>>>>>     a = 1 + a;
>>>>>>>>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> it will act as an entry point to the pipeline and will execute
>>>>>>>>>>>>>>>>>>>>>> the passes starting from the given pass.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Bike-shedding:
>>>>>>>>>>>>>>>>>>>>> Would it make sense to have syntax for defining pass ranges to
>>>>>>>>>>>>>>>>>>>>> execute?  For instance:
>>>>>>>>>>>>>>>>>>>>>   void __GIMPLE(execute (pass_start : pass_end))
>>>>>>>>>>>>>>>>>>>>> which would execute all the passes within the range [pass_start,
>>>>>>>>>>>>>>>>>>>>> pass_end], which would be convenient if the range is large.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> But it would rely on a particular pass pipeline, f.e. pass-start
>>>>>>>>>>>>>>>>>>>> appearing before pass-end.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Currently control
Re: Thread-safety of a profiled binary (and GCOV runtime library)
On 07/26/2016 01:15 AM, Andi Kleen wrote:
> You definitely need a new flag: atomic or per-thread instrumentation
> will almost certainly have significant overhead (either at run time
> or in memory). Just making an existing facility a lot slower
> without a way around it is not a good idea.

Hi,

agree with that!

> BTW iirc there were patches from google on this a few years back.
> May be still in some branch.

That's great, I'm CCing Google people who worked on PGO, hope they will
help me to find patches/discussion about the problem.

Martin

> -Andi
GCC 6.2?
Hi all. Don't want to be a noodge but is there any info on a timeline for the 6.2 release? I'm planning a major tools upgrade (from GCC 4.9.2) and I've been kind of putting it off until 6.2 is out so I can jump to that... but the natives are getting restless as they want some C++ features that aren't available in 4.9.2. Usually it seems like the "first patch release" for a new major release is out right around now (3 months after the initial release). Just wondering if there is any info on this or if things are going to be very different for 6.2. Cheers!
Re: GCC 6.2?
On Wed, Jul 27, 2016 at 4:03 PM, Paul Smith wrote:
> Hi all. Don't want to be a noodge but is there any info on a timeline
> for the 6.2 release?
>
> I'm planning a major tools upgrade (from GCC 4.9.2) and I've been kind
> of putting it off until 6.2 is out so I can jump to that... but the
> natives are getting restless as they want some C++ features that aren't
> available in 4.9.2.
>
> Usually it seems like the "first patch release" for a new major release
> is out right around now (3 months after the initial release). Just
> wondering if there is any info on this or if things are going to be
> very different for 6.2.

I'm doing 4.9.4 now and 6.2 only afterwards, so you can expect 6.2 earliest in about three weeks.

Richard.

> Cheers!
[RFD] Extremely large alignment of read-only strings
Hi,

(tl;dr: skip to “questions from me to the GCC developers”)

I’ve recently spotted the following in some code I inherited:

  struct foo {
      char name[4];
      uint8_t len;
      uint8_t type;
  };
  static const struct foo fooinfo[] = { /* list of 43 members */ };

I’ve seen the obvious two-byte padding (on i386, with -Os) and thought to restructure this into:

  static const char fooname[][4] = { … };
  static const uint8_t foolen[] = { … };
  static const uint8_t footype[] = { … };

Colour me surprised when this made the code over fifty bytes longer. After some debugging, I found that the assembly code generated had “.align 32” in front of each of the new structs. After some (well, lots) more debugging, I eventually discovered -fdump-translation-unit (which, in the version I was using, also worked for C, not just C++), which showed me that the alignment was 256 even (only later reduced to 32 as that’s the maximum alignment for i386).

Lots of digging later, I found this gem in gcc/config/i386/i386.c:

  int
  ix86_data_alignment (tree type, int align)
  {
    if (AGGREGATE_TYPE_P (type)
        && TYPE_SIZE (type)
        && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
        && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 256
            || TREE_INT_CST_HIGH (TYPE_SIZE (type)))
        && align < 256)
      return 256;

Voilà, there we have my culprit – commenting this out resulted in a 12-byte yield (not 42*2 byte, as the code generated for e.g. “strncmp(foo, opname[i], (size_t)oplen[i])” is a bit less optimal than for “strncmp(foo, opinfo[i].name, (size_t)opinfo[i].len)”, but that’s okay)… and a 206-byte reduction for the rest of the codebase.

Seeing that ix86_data_alignment() also contains amd64-specific alignment, and that MMX stuff generally needs more alignment, I first did this:

        && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 256
  -         || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 256)
  +         || TREE_INT_CST_HIGH (TYPE_SIZE (type)))
  +      && (TARGET_MMX || !optimize_size)
  +      && align < 256)
      return 256;

The idea here being that both TARGET_SSE and TARGET_64BIT enable TARGET_MMX, and to do this only for -Os.

Then I went into the svn history for this function and discovered that its predecessor in gcc/config/i386/i386.h (the DATA_ALIGNMENT(TYPE, ALIGN) macro) was added in around 2000, before MMX was even a thing, to “improve floating point performance”, but that architectures apparently can do without. Now I’m trying roughly this:

  […]
   ix86_constant_alignment (tree exp, int align)
   {
  +  if (optimize_size && !TARGET_MMX)
  +    return align;
  […]
   ix86_data_alignment (tree type, int align)
   {
  +  if (optimize_size && !TARGET_MMX)
  +    return align;
  […]
   ix86_local_alignment (tree type, int align)
   {
  +  if (optimize_size && !TARGET_MMX)
  +    return align;
  […]

This opens up some questions from me to the GCC developers though:

– Is this safe to do? (My baseline here is 3.4.6, so if someone still remembers, please do answer, but the scope of this eMail in total goes beyond that.)
– Is this something that GCC trunk could benefit from?
– I’ve also been wondering whether this applies to regular strings (not arrays that technically are strings too) as well…
– Is the exclusion of MMX and 64BIT required? (Since this code has been there “ever” since even before MMX support landed in GCC, I fear that some of the “required alignment” are done inside this function instead of in other places.)
– Even better: is this something we could do for *all* platforms in general?
Something like this, in gcc/varasm.c:

   #ifdef DATA_ALIGNMENT
  +  if (!optimize_size)
  -  align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
  +    align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
   #endif

My aim here is to tighten the density (reduce the size of the individual sections in the .o file and, ideally, the file size of the final executable) of the generated code for -Os while not breaking anything, and leaving the case of not-Os completely alone.

Of course I’ll do a full rebuild of MirBSD (which uses -Os in almost all code, only some legacy crap from the 1970s like AT&T nroff uses -O1 or even -O0 as the code doesn’t conform to ISO C) to see if things break, but I’m also interested in the bigger picture, besides I have invested into embedded systems (FreeWRT/OpenADK, but also dietlibc, klibc, etc.) which love small code.

Thanks in advance,
//mirabilos

PS: Please do Cc me, I’m not subscribed.
PPS: I’ve exchanged assignment papers with the FSF about GCC, so feel free to just commit anything, if it makes sense.
-- 
FWIW, I'm quite impressed with mksh interactively. I thought it was much
*much* more bare bones. But it turns out it beats the living hell out of
ksh93 in that respect. I'd even consider it for my daily use if I hadn't
wasted half my life on my zsh setup. :-)
	-- Frank Terbeck in #!/bin/mksh
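For anyone who wants to poke at this, here is a minimal structural sketch of the two layouts discussed above, with hypothetical data that is far smaller than the original 43-entry table (note the over-alignment from ix86_data_alignment only triggers once an array reaches 256 bits, i.e. 32 bytes, so a real reproduction needs correspondingly more entries). Compiling with -Os -S for i386 and looking for .align directives is the quickest way to see the effect:

  #include <stdint.h>
  #include <string.h>

  /* Array-of-structs layout: 4 + 1 + 1 bytes plus 2 bytes of padding per entry. */
  struct foo { char name[4]; uint8_t len; uint8_t type; };

  static const struct foo fooinfo[] = {
      { "ab",  2, 1 },
      { "cde", 3, 2 },
      { "f",   1, 3 },
  };

  /* Struct-of-arrays layout: no padding, but (per the report above) each
     array can receive its own large .align directive once DATA_ALIGNMENT
     kicks in. */
  static const char    fooname[][4] = { "ab", "cde", "f" };
  static const uint8_t foolen[]     = { 2, 3, 1 };
  static const uint8_t footype[]    = { 1, 2, 3 };

  int lookup_aos(const char *s)
  {
      for (unsigned i = 0; i < sizeof(fooinfo) / sizeof(fooinfo[0]); i++)
          if (strncmp(s, fooinfo[i].name, (size_t)fooinfo[i].len) == 0)
              return fooinfo[i].type;
      return -1;
  }

  int lookup_soa(const char *s)
  {
      for (unsigned i = 0; i < sizeof(foolen); i++)
          if (strncmp(s, fooname[i], (size_t)foolen[i]) == 0)
              return footype[i];
      return -1;
  }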
Re: [RFD] Extremely large alignment of read-only strings
(apologies for the double post, GMane had a hiccup. The latter, which this is a reply to, has one discussion item more, so please ignore the other)
Re: Thread-safety of a profiled binary (and GCOV runtime library)
Resend in plain text mode.

On Wed, Jul 27, 2016 at 9:07 AM, Xinliang David Li wrote:
> Our experience is that non-atomic counter updates (the current
> implementation) rarely result in a corrupted profile (in a heavily
> threaded environment) -- it usually results in some profile insanity
> which can be corrected with -fprofile-correction -- otherwise we would
> have been forced to go the TLS route.
>
> The profile corruption usually happens when the server program crashes
> during dumping (server programs usually do not exit, so forcing them to
> exit to dump the profile can cause problems). Our solution is to
> introduce a profile runtime API to be invoked by the user.
>
> We added the atomic support mostly for Linux kernel FDO. The support is
> in the google/gcc_49 branch. The option description is:
>
> ; fprofile-generate-atomic=0: disable atomic update.
> ; fprofile-generate-atomic=1: atomically update edge profile counters.
> ; fprofile-generate-atomic=2: atomically update value profile counters.
> ; fprofile-generate-atomic=3: atomically update edge and value profile counters.
> ; other values will be ignored (fall back to the default of 0).
> fprofile-generate-atomic=
> Common Joined UInteger Report Var(flag_profile_gen_atomic) Init(0) Optimization
> fprofile-generate-atomic=[0..3]  Atomically increments for profile counters.
>
> thanks,
>
> David
>
> On Wed, Jul 27, 2016 at 5:05 AM, Martin Liška wrote:
>> On 07/26/2016 01:15 AM, Andi Kleen wrote:
>>> You definitely need a new flag: atomic or per-thread instrumentation
>>> will almost certainly have significant overhead (either at run time
>>> or in memory). Just making an existing facility a lot slower
>>> without a way around it is not a good idea.
>>
>> Hi,
>>
>> agree with that!
>>
>>> BTW iirc there were patches from google on this a few years back.
>>> May be still in some branch.
>>
>> That's great, I'm CCing Google people who worked on PGO, hope they will
>> help me to find patches/discussion about the problem.
>>
>> Martin
>>
>>> -Andi
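To make concrete what "atomically update edge profile counters" means at the instrumentation level, here is a rough sketch of my own (not the actual gcov instrumentation code) contrasting the plain counter bump with an atomic one built on GCC's __atomic builtins:

  #include <stdint.h>

  /* One edge counter, as gcov keeps them (64-bit). */
  static uint64_t edge_counter;

  /* Default instrumentation: a plain load/add/store. Two threads running
     this concurrently can lose increments, which is where the "profile
     insanity" that -fprofile-correction patches up comes from. */
  static inline void count_edge_plain(void)
  {
      edge_counter++;
  }

  /* Atomic variant: no lost updates, but noticeably more expensive, which
     is why it should sit behind its own flag. */
  static inline void count_edge_atomic(void)
  {
      __atomic_fetch_add(&edge_counter, 1, __ATOMIC_RELAXED);
  }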
Re: Need help with PR71976 combine.c::get_last_value returns a wrong result
On Tue, Jul 26, 2016 at 03:38:18PM +0200, Georg-Johann Lay wrote:
>>> @@ -13206,6 +13206,13 @@ get_last_value (const_rtx x)
>>>        && DF_INSN_LUID (rsp->last_set) >= subst_low_luid)
>>>      return 0;
>>>
>>> +  /* If the lookup is for a hard register make sure that value contains at least
>>> +     as many bits as x does.  */
>>> +
>>> +  if (regno < FIRST_PSEUDO_REGISTER
>>> +      && GET_MODE_PRECISION (rsp->last_set_mode) < GET_MODE_PRECISION (GET_MODE (x)))
>>> +    return 0;
>>> +
>>>    /* If the value has all its registers valid, return it.  */
>>>    if (get_last_value_validate (&value, rsp->last_set, rsp->last_set_label, 0))
>>>      return value;
>>
>> That might be a bit harsh.
>
> In what respect?  It disables all kinds of optimisations that work now.
>
>> First things first: is the information recorded correct?
>
> I think yes, but care must be taken what may be concluded from that
> information.  From a set of 8 bits you cannot draw conclusions about all
> 64 bits; this should be obvious enough :-)

:-)

> In the above case rsp[regno] holds only information for 1 sub-byte.  In
> order to get the complete DImode story we would have to get the info for
> all sub-parts and then put them together...

Yes, and the rsp stuff does not do that.

I am testing the following patch.

Thanks,
Segher


diff --git a/gcc/combine.c b/gcc/combine.c
index 77e0d2b..dec6226 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -9977,6 +9977,9 @@ reg_num_sign_bit_copies_for_combine (const_rtx x, machine_mode mode,
       return NULL;
     }
 
+  if (GET_MODE_PRECISION (rsp->last_set_mode) != GET_MODE_PRECISION (mode))
+    return NULL;
+
   tem = get_last_value (x);
   if (tem != 0)
     return tem;
-- 
1.9.3
Remove deprecated std::has_trivial_xxx traits
I propose that we remove the following non-standard traits in GCC 7:

  /// has_trivial_default_constructor (temporary legacy)
  template<typename _Tp>
    struct has_trivial_default_constructor
    : public integral_constant<bool, __has_trivial_constructor(_Tp)>
    { } _GLIBCXX_DEPRECATED;

  /// has_trivial_copy_constructor (temporary legacy)
  template<typename _Tp>
    struct has_trivial_copy_constructor
    : public integral_constant<bool, __has_trivial_copy(_Tp)>
    { } _GLIBCXX_DEPRECATED;

  /// has_trivial_copy_assign (temporary legacy)
  template<typename _Tp>
    struct has_trivial_copy_assign
    : public integral_constant<bool, __has_trivial_assign(_Tp)>
    { } _GLIBCXX_DEPRECATED;

They were in C++0x drafts but were removed before the final C++11 standard, so they have never been part of ISO C++. They have been deprecated since GCC 5.1: https://gcc.gnu.org/gcc-5/changes.html

People who need that functionality can still use the built-ins directly; we don't need to define non-standard traits.

Alternatively, we could move them to __gnu_cxx so they don't pollute the namespace by default, or only define them when __STRICT_ANSI__ is not defined. I prefer simply removing them.
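For code that still uses these traits, a rough migration sketch of my own (the struct and asserts are purely illustrative): the standard C++11 is_trivially_* traits are the usual replacements, and the raw built-ins remain available, though the two do not test exactly the same thing (the standard traits also require the operation to exist).

  #include <type_traits>

  struct Pod { int x; };

  // Raw GCC built-in that the legacy trait wrapped.
  static_assert(__has_trivial_constructor(Pod),
                "built-in check, as mentioned in the proposal");

  // Standard C++11 replacements for the three deprecated traits.
  static_assert(std::is_trivially_default_constructible<Pod>::value,
                "replacement for has_trivial_default_constructor");
  static_assert(std::is_trivially_copy_constructible<Pod>::value,
                "replacement for has_trivial_copy_constructor");
  static_assert(std::is_trivially_copy_assignable<Pod>::value,
                "replacement for has_trivial_copy_assign");

  int main() { return 0; }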
Re: Remove deprecated std::has_trivial_xxx traits
2016-07-27 20:25 GMT+02:00 Jonathan Wakely:
> I propose that we remove the following non-standard traits in GCC 7:
>
>   /// has_trivial_default_constructor (temporary legacy)
>   template<typename _Tp>
>     struct has_trivial_default_constructor
>     : public integral_constant<bool, __has_trivial_constructor(_Tp)>
>     { } _GLIBCXX_DEPRECATED;
>
>   /// has_trivial_copy_constructor (temporary legacy)
>   template<typename _Tp>
>     struct has_trivial_copy_constructor
>     : public integral_constant<bool, __has_trivial_copy(_Tp)>
>     { } _GLIBCXX_DEPRECATED;
>
>   /// has_trivial_copy_assign (temporary legacy)
>   template<typename _Tp>
>     struct has_trivial_copy_assign
>     : public integral_constant<bool, __has_trivial_assign(_Tp)>
>     { } _GLIBCXX_DEPRECATED;
>
> They were in C++0x drafts but were removed before the final C++11
> standard, so they have never been part of ISO C++. They have been
> deprecated since GCC 5.1: https://gcc.gnu.org/gcc-5/changes.html
>
> People who need that functionality can still use the built-ins
> directly; we don't need to define non-standard traits.
>
> Alternatively, we could move them to __gnu_cxx so they don't pollute
> the namespace by default, or only define them when __STRICT_ANSI__ is
> not defined. I prefer simply removing them.

+1 for removing them.

- Daniel
Re: Need help with PR71976 combine.c::get_last_value returns a wrong result
Segher Boessenkool wrote:
> On Tue, Jul 26, 2016 at 03:38:18PM +0200, Georg-Johann Lay wrote:
>>>> @@ -13206,6 +13206,13 @@ get_last_value (const_rtx x)
>>>>        && DF_INSN_LUID (rsp->last_set) >= subst_low_luid)
>>>>      return 0;
>>>>
>>>> +  /* If the lookup is for a hard register make sure that value contains at least
>>>> +     as many bits as x does.  */
>>>> +
>>>> +  if (regno < FIRST_PSEUDO_REGISTER
>>>> +      && GET_MODE_PRECISION (rsp->last_set_mode) < GET_MODE_PRECISION (GET_MODE (x)))
>>>> +    return 0;
>>>> +
>>>>    /* If the value has all its registers valid, return it.  */
>>>>    if (get_last_value_validate (&value, rsp->last_set, rsp->last_set_label, 0))
>>>>      return value;
>>>
>>> That might be a bit harsh.
>>
>> In what respect?  It disables all kinds of optimisations that work now.
>>
>>> First things first: is the information recorded correct?
>>
>> I think yes, but care must be taken what may be concluded from that
>> information.  From a set of 8 bits you cannot draw conclusions about all
>> 64 bits; this should be obvious enough :-)
>
> :-)
>
>> In the above case rsp[regno] holds only information for 1 sub-byte.  In
>> order to get the complete DImode story we would have to get the info for
>> all sub-parts and then put them together...
>
> Yes, and the rsp stuff does not do that.
>
> I am testing the following patch.
>
> Thanks,
> Segher
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 77e0d2b..dec6226 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -9977,6 +9977,9 @@ reg_num_sign_bit_copies_for_combine (const_rtx x, machine_mode mode,
>        return NULL;
>      }
>  
> +  if (GET_MODE_PRECISION (rsp->last_set_mode) != GET_MODE_PRECISION (mode))
> +    return NULL;
> +
>    tem = get_last_value (x);
>    if (tem != 0)
>      return tem;

But the problem is inside get_last_value.  You'd have to add such a check
at /all/ call sites of that function.  Using a value for a hard reg that's
been set in a smaller mode than the mode of get_last_value is just wrong.

For example, combine tracks bits which are known to be zero, and if
get_last_value is used in a similar situation, then the conclusion about
zero bits is also wrong.

Would be interesting to patch get_last_value so that it issues some
diagnostic if

  if (regno < FIRST_PSEUDO_REGISTER
      && (GET_MODE_PRECISION (rsp->last_set_mode)
          < GET_MODE_PRECISION (GET_MODE (x))))

and then run the testsuite against that compiler.  I would expect that
the condition doesn't even trigger on x86.

Johann
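A plain-C illustration of Johann's point, entirely my own and nothing to do with combine's internals: agreement in a narrow view of a value says nothing about the wider value.

  #include <assert.h>
  #include <stdint.h>

  int main(void)
  {
      uint64_t a = 0x01;                    /* low byte is 1, upper bits zero */
      uint64_t b = 0xdeadbeef00000001ULL;   /* low byte is also 1 */

      /* The 8-bit views agree ... */
      assert((uint8_t)a == (uint8_t)b);

      /* ... but concluding anything about the full 64-bit values from that
         would be wrong -- which is the bug when a hard register's recorded
         value stems from a narrower mode than the one being asked about. */
      assert(a != b);
      return 0;
  }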
Re: Need help with PR71976 combine.c::get_last_value returns a wrong result
On Wed, Jul 27, 2016 at 09:14:27PM +0200, Georg-Johann Lay wrote:
>> diff --git a/gcc/combine.c b/gcc/combine.c
>> index 77e0d2b..dec6226 100644
>> --- a/gcc/combine.c
>> +++ b/gcc/combine.c
>> @@ -9977,6 +9977,9 @@ reg_num_sign_bit_copies_for_combine (const_rtx x, machine_mode mode,
>>        return NULL;
>>      }
>>
>> +  if (GET_MODE_PRECISION (rsp->last_set_mode) != GET_MODE_PRECISION (mode))
>> +    return NULL;
>> +
>>    tem = get_last_value (x);
>>    if (tem != 0)
>>      return tem;
>
> But the problem is inside get_last_value.  You'd have to add such a
> check at /all/ call sites of that function.  Using a value for a hard
> reg that's been set in a smaller mode than the mode of get_last_value
> is just wrong.

Things like nonzero_bits explicitly deal with it.  For MODE_INT values
at least; the code looks a bit shaky.

> For example combine tracks bits which are known to be zero, and if
> get_last_value is used in a similar situation, then the conclusion about
> zero-bits is also wrong.

How so?  I don't see it.

> Would be interesting to patch get_last_value so that it issues some
> diagnostic if
>
>   if (regno < FIRST_PSEUDO_REGISTER
>       && (GET_MODE_PRECISION (rsp->last_set_mode)
>           < GET_MODE_PRECISION (GET_MODE (x))))
>
> and then run the testsuite against that compiler.  I would expect that
> the condition doesn't even trigger on x86.

Why restrict this to hard regs at all?  Yes I know it isn't supposed to
happen for pseudos, but that also means you do not need to check?

Segher
gcc-4.9-20160727 is now available
Snapshot gcc-4.9-20160727 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160727/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch revision 238801

You'll find:

 gcc-4.9-20160727.tar.bz2             Complete GCC

  MD5=91bcf7e0231ab8a729c8581b6258fdd0
  SHA1=4a86f0fd083df1fbc353b524d31844245aa31404

Diffs from 4.9-20160720 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.