Finding canonical names of systems
How are you supposed to find the canonical name of a system (of known type) in CPU-Vendor-OS form in the general case? If you have access to a system of that particular type, you can run config.guess to find out, but you might not have, and that approach won't work for many systems anyway. The canonical name needs to be known e.g. when cross-compiling and building cross-compilers.

The only way I could find to get a list of canonical CPU, vendor and OS strings was to dig through /usr/share/gnuconfig/config.sub on my GNU/Linux systems, which needless to say is about as bad as it gets from a documentation perspective. Is there any other way to get a list mapping CPUs, vendors and OSes to their canonical strings? If there isn't, I think it's making things much more complicated than they should be.

/Ulf Magnusson
Re: Finding canonical names of systems
On 11/28/06, Mike Stump <[EMAIL PROTECTED]> wrote:
> [ first, this is the wrong list to ask such questions, gcc-help is the
> right one ]
>
> On Nov 27, 2006, at 7:25 PM, Ulf Magnusson wrote:
> > How are you supposed to find the canonical name of a system (of
> > known type) in CPU-Vendor-OS form in the general case?
>
> In the general case, you ask someone that has such a machine to run
> config.guess, or failing that, you ask someone, or failing that, you
> just invent the obvious string and use it.

That still feels like a very roundabout way to do it, and it won't work for systems that don't have Unices for them (like some small embedded systems). Having to ask someone just to find this information also feels a bit silly, when it is as crucial as it is e.g. in building cross-compilers. The correct strings to use aren't always immediately obvious either, if you feel like guessing. I think the lack of documentation might scare people away.

> Most portable software doesn't much care just what configuration you
> specify, some very non-portable software will fail to function unless
> you provide exactly the right string. gcc is of the latter type, if
> you're interested in building it.

Yes, the reason I ended up here is that I wanted to build GCC as a cross-compiler, and was disappointed that no one had even made a comprehensible list of how to represent different CPUs, vendors and OSes. I see this as a bug in the documentation.

> > If you have access to a system of that particular type, you can run
> > config.guess to find out, but you might not have, and that approach
> > won't work for many systems anyway.
>
> That approach always works on all host systems. :-) If it didn't,
> that'd be a bug and someone would fix it.

As previously mentioned, that won't work on systems that can't even run sh, like tiny embedded devices, and it's still a somewhat roundabout and silly method.

> > The canonical name needs to be known e.g. when cross-compiling and
> > building cross-compilers.
> Ah, for crosses, you have to know what you want, and what you want is
> what you specify. If your question is, what do you want, well, you
> want what you want. Either it works, or you've not ported the
> compiler. For example, you can configure --target=arm, if you want an
> arm, or --target=m68k if you want an m68k, or sparc, if you want
> sparc, or ppc if you want ppc, or powerpc if you want powerpc, or
> x86_64, if you want x86_64, or arm-elf, if you want arm-elf, or
> sparc-aout if you want that. The list _is_ endless. If you're
> interested in a specific target, tell us which one and we'll answer
> the question specifically.

Yes, and if you want Foo. The problem is knowing what the string representation is. What I'd like to see is a list like the following:

..
Materola 2560xx series -> ma256k
Pute1000 64 bit -> p1k64
..
Macrosoft Inc. -> mcinc
Amtel -> amt
..
FroBar OS Version X.Y -> froos-x.y
Lunix x.y -> lunix-x.y
..

> If you want pre-formed ideas for targets that might be useful, you can
> check out:
>
> http://gcc.gnu.org/install/specific.html
> http://gcc.gnu.org/buildstat.html
> http://gcc.gnu.org/ml/gcc-testresults/
>
> I was thinking there was one other that tried to be exhaustive, but
> maybe we removed the complete list years ago. Aside from that, yes,
> reading through the config type files is yet another way.

Those are helpful links, but I still think there should be an easy-to-find comprehensive list documenting the strings to use in the canonical triplet for particular CPUs, vendors and OSes. "How do I represent system foo?" seems like a reasonable question to ask, and "dig through a shell script" is not very satisfactory. If that is the only way to find a truly comprehensive list, it should at least be mentioned in the documentation.

/Ulf Magnusson
Re: Finding canonical names of systems
On 11/29/06, Michael Eager <[EMAIL PROTECTED]> wrote:
> Ulf Magnusson wrote:
> > How are you supposed to find the canonical name of a system (of known
> > type) in CPU-Vendor-OS form in the general case? If you have access to
> > a system of that particular type, you can run config.guess to find
> > out, but you might not have, and that approach won't work for many
> > systems anyway. The canonical name needs to be known e.g. when
> > cross-compiling and building cross-compilers.
> >
> > The only way I could find to get a list of canonical CPU, Vendor and
> > OS strings was to dig through /usr/share/gnuconfig/config.sub on my
> > GNU/Linux systems, which needless to say is about as bad as it gets
> > from a documentation perspective. Is there any other way to get a list
> > mapping CPUs, vendors and OSes to their canonical strings? If there
> > isn't, I think it's making things much more complicated than they
> > should be.
>
> Strictly speaking, there isn't anything that is a canonical name for a
> particular configuration, if you mean a single correct name. GCC uses
> the vendor name to simplify figuring out the desired target
> architecture. For most cross-compilations, the vendor name is ignored.
> There may be a multitude of vendors, for example. (How many MIPS
> vendors can you name?)

It would be helpful if the documentation said this. I believe I'm not the only one who immediately went hunting in the docs for some kind of list of different CPUs, vendors and OSes, or at least some guide on how to find a string for your configuration.

> Take a look at configure and config.gcc. Find the architecture you are
> interested in. Look at the names defined for the architecture and pick
> the best one. For example, in configure you will find
>
>   powerpc-*-eabi)
>     noconfigdirs="$noconfigdirs ${libgcj}"
>     ;;
>
> This says that powerpc-*-eabi is a valid configuration. This is
> further refined in config.gcc, where you will find that powerpc-*-eabi
> is a bit different from powerpc-*-eabisim.
This should also be in the doc, if that is the way you have to do it. Right now, I believe it pretty much glosses over the issue of how to find a suitable string for the configuration you want to represent.

> If you are looking for a comprehensive list of all possible
> configurations, rather than just trying to find the correct one for
> your particular application, you will find that there are an infinite
> number of configurations.

I understand this now, but the docs could be more helpful in explaining how to find the string to use for your configuration. While searching for an answer, I noticed that lots of people seem to have had problems with cross-compilation that to me look more like problems in the documentation, which I find a bit sad.

/Ulf Magnusson
Re: Finding canonical names of systems
On 11/29/06, Michael Eager <[EMAIL PROTECTED]> wrote:
> Ulf Magnusson wrote:
> > While searching for an answer, I noticed that lots of people seem
> > to have had problems with cross-compilation that to me look more
> > like problems in the documentation, which I find a bit sad.
>
> Rather than repeatedly complain, the most constructive contribution
> would be to contribute to the project. You can feel sad all you want,
> but being patronizing is not going to get much sympathy.

I'm sorry if I came off as patronizing; that's not the way it was meant to sound. It's just that I've seen a lot of open source software that has this problem, and I don't like it because I think it hinders the spread of open source software.

I'd be happy to contribute some documentation on this. I just hope I have a firm enough grip on the issue. Where should I send drafts for review? Is there some other resource I should be aware of besides http://gcc.gnu.org/contribute.html?

/Ulf Magnusson
Suboptimal __restrict optimization?
Hi,

Given the code

class C {
    void f(int *p);
    int q;
};

void C::f(int * __restrict p) __restrict {
    q += 10;
    *p = 7;
    q += 10;
}

g++ 4.5.2 with -O3 generates the following for C::f() (prologue and epilogue omitted):

mov  0x8(%ebp),%eax   // eax = this (= &q)
mov  0xc(%ebp),%ecx   // ecx = p
mov  (%eax),%edx      // edx = q
movl $0x7,(%ecx)      // *p = 7
add  $0x14,%edx       // q += 20
mov  %edx,(%eax)      // save q

If C::f() is rearranged as

void C::f(int * __restrict p) __restrict {
    *p = 7;
    q += 10;
    q += 10;
}

the following is generated instead:

mov  0x8(%ebp),%eax   // eax = this (= &q)
mov  0xc(%ebp),%edx   // edx = p
movl $0x7,(%edx)      // *p = 7
addl $0x14,(%eax)     // q += 20

Is there some reason why GCC couldn't generate this code for the first version of C::f()? Is this a failure of optimization, or am I missing something in how __restrict works?

/Ulf
Re: Suboptimal __restrict optimization?
On Mon, Oct 3, 2011 at 10:22 PM, Ian Lance Taylor wrote:
> Ulf Magnusson writes:
>
>> Is there some reason why GCC couldn't generate this code for the first
>> version of C::f()? Is this a failure of optimization, or am I missing
>> something in how __restrict works?
>
> It's a failure of optimization.
>
> Ian

Is this something that has been improved in 4.6.x? (Sorry for the initial non-reply-all.)
Option to make unsigned->signed conversion always well-defined?
Hi,

I've been experimenting with different methods for emulating the signed overflow of an 8-bit CPU. The method I've found that seems to generate the most efficient code on both ARM and x86 is

bool overflow(unsigned int a, unsigned int b) {
    const unsigned int sum = (int8_t)a + (int8_t)b;
    return (int8_t)sum != sum;
}

(The real function would probably be 'inline', of course. Regs are stored in overlong variables, hence 'unsigned int'.)

Looking at the spec, it unfortunately seems the behavior of this function is undefined, as it relies on signed int addition wrapping, and on (int8_t)sum truncating bits. Is there some way to make this guaranteed safe with GCC without resorting to inline asm? Locally enabling -fwrapv takes care of the addition, but that still leaves the conversion.

/Ulf
Re: Option to make unsigned->signed conversion always well-defined?
On Wed, Oct 5, 2011 at 10:11 PM, Ulf Magnusson wrote:
> Hi,
>
> I've been experimenting with different methods for emulating the
> signed overflow of an 8-bit CPU. The method I've found that seems to
> generate the most efficient code on both ARM and x86 is
>
> bool overflow(unsigned int a, unsigned int b) {
>     const unsigned int sum = (int8_t)a + (int8_t)b;
>     return (int8_t)sum != sum;
> }
>
> (The real function would probably be 'inline', of course. Regs are
> stored in overlong variables, hence 'unsigned int'.)
>
> Looking at the spec, it unfortunately seems the behavior of this
> function is undefined, as it relies on signed int addition wrapping,
> and on (int8_t)sum truncating bits. Is there some way to make this
> guaranteed safe with GCC without resorting to inline asm? Locally
> enabling -fwrapv takes care of the addition, but that still leaves
> the conversion.
>
> /Ulf

Is *((int8_t*)&sum) safe (assuming little endian)? Unfortunately that seems to generate worse code. On x86 it generates the following (GCC 4.5.2):

0050 <_Z9overflow4jj>:
  50: 83 ec 10              sub    $0x10,%esp
  53: 0f be 54 24 18        movsbl 0x18(%esp),%edx
  58: 0f be 44 24 14        movsbl 0x14(%esp),%eax
  5d: 8d 04 02              lea    (%edx,%eax,1),%eax
  60: 0f be d0              movsbl %al,%edx
  63: 39 d0                 cmp    %edx,%eax
  65: 0f 95 c0              setne  %al
  68: 83 c4 10              add    $0x10,%esp
  6b: c3                    ret

With the straight (int8_t) cast you get

  50: 0f be 54 24 08        movsbl 0x8(%esp),%edx
  55: 0f be 44 24 04        movsbl 0x4(%esp),%eax
  5a: 8d 04 02              lea    (%edx,%eax,1),%eax
  5d: 0f be d0              movsbl %al,%edx
  60: 39 c2                 cmp    %eax,%edx
  62: 0f 95 c0              setne  %al
  65: c3                    ret

What's with the extra add/sub of ESP?

/Ulf
Re: Option to make unsigned->signed conversion always well-defined?
On Thu, Oct 6, 2011 at 12:55 AM, Pedro Pedruzzi wrote:
> Em 05-10-2011 17:11, Ulf Magnusson escreveu:
>> Hi,
>>
>> I've been experimenting with different methods for emulating the
>> signed overflow of an 8-bit CPU.
>
> You would like to check whether an 8-bit signed addition will overflow
> or not, given the two operands. Is that correct?
>
> As you used the word `emulating', I am assuming that your function
> will not run by the mentioned CPU.

No, it'll most likely only run on systems with a wider bitness.

> Does this 8-bit CPU use two's complement representation?

Yes, and the criterion for signed overflow is "both numbers have the same sign, but the sign of the sum is different". Should have made that more clear.

>> The method I've found that seems to
>> generate the most efficient code on both ARM and x86 is
>>
>> bool overflow(unsigned int a, unsigned int b) {
>>     const unsigned int sum = (int8_t)a + (int8_t)b;
>>     return (int8_t)sum != sum;
>> }
>>
>> (The real function would probably be 'inline', of course. Regs are
>> stored in overlong variables, hence 'unsigned int'.)
>>
>> Looking at the spec, it unfortunately seems the behavior of this
>> function is undefined, as it relies on signed int addition wrapping,
>> and on (int8_t)sum truncating bits. Is there some way to make this
>> guaranteed safe with GCC without resorting to inline asm? Locally
>> enabling -fwrapv takes care of the addition, but that still leaves
>> the conversion.
>
> I believe the cast from unsigned int to int8_t is
> implementation-defined for values that can't be represented in int8_t
> (e.g. 0xff). A kind of `undefined behavior' as well.
>
> I tried:
>
> bool overflow(unsigned int a, unsigned int b) {
>     const unsigned int sum = a + b;
>     return ((a & 0x80) == (b & 0x80)) && ((a & 0x80) != (sum & 0x80));
> }
>
> But it is not as efficient as yours.
> --
> Pedro Pedruzzi

Yeah, I tried similar bit-trickery along the lines of

bool overflow(unsigned int a, unsigned int b) {
    const uint8_t ab = (uint8_t)a;
    const uint8_t bb = (uint8_t)b;
    const uint8_t sum = ab + bb;
    return (ab ^ bb) & ~(ab ^ sum) & 0x80;
}

but it doesn't seem to generate very efficient code.

/Ulf
Re: Option to make unsigned->signed conversion always well-defined?
On Thu, Oct 6, 2011 at 10:25 AM, Ulf Magnusson wrote:
> On Thu, Oct 6, 2011 at 12:55 AM, Pedro Pedruzzi wrote:
>> Em 05-10-2011 17:11, Ulf Magnusson escreveu:
>>> Hi,
>>>
>>> I've been experimenting with different methods for emulating the
>>> signed overflow of an 8-bit CPU.
>>
>> You would like to check whether an 8-bit signed addition will
>> overflow or not, given the two operands. Is that correct?
>>
>> As you used the word `emulating', I am assuming that your function
>> will not run by the mentioned CPU.
>
> No, it'll most likely only run on systems with a wider bitness.
>
>> Does this 8-bit CPU use two's complement representation?
>
> Yes, and the criterion for signed overflow is "both numbers have the
> same sign, but the sign of the sum is different". Should have made
> that more clear.
>
>>> The method I've found that seems to
>>> generate the most efficient code on both ARM and x86 is
>>>
>>> bool overflow(unsigned int a, unsigned int b) {
>>>     const unsigned int sum = (int8_t)a + (int8_t)b;
>>>     return (int8_t)sum != sum;
>>> }
>>>
>>> (The real function would probably be 'inline', of course. Regs are
>>> stored in overlong variables, hence 'unsigned int'.)
>>>
>>> Looking at the spec, it unfortunately seems the behavior of this
>>> function is undefined, as it relies on signed int addition wrapping,
>>> and on (int8_t)sum truncating bits. Is there some way to make this
>>> guaranteed safe with GCC without resorting to inline asm? Locally
>>> enabling -fwrapv takes care of the addition, but that still leaves
>>> the conversion.
>>
>> I believe the cast from unsigned int to int8_t is
>> implementation-defined for values that can't be represented in int8_t
>> (e.g. 0xff). A kind of `undefined behavior' as well.
>>
>> I tried:
>>
>> bool overflow(unsigned int a, unsigned int b) {
>>     const unsigned int sum = a + b;
>>     return ((a & 0x80) == (b & 0x80)) && ((a & 0x80) != (sum & 0x80));
>> }
>>
>> But it is not as efficient as yours.
>>
>> --
>> Pedro Pedruzzi
>
> Yeah, I tried similar bit-trickery along the lines of
>
> bool overflow(unsigned int a, unsigned int b) {
>     const uint8_t ab = (uint8_t)a;
>     const uint8_t bb = (uint8_t)b;
>     const uint8_t sum = ab + bb;
>     return (ab ^ bb) & ~(ab ^ sum) & 0x80;
> }
>
> but it doesn't seem to generate very efficient code.
>
> /Ulf

Might as well do

bool overflowbit(unsigned int a, unsigned int b) {
    const unsigned int sum = a + b;
    return (a ^ b) & ~(a ^ sum) & 0x80;
}

But still not very good output compared to other approaches, as expected.

/Ulf
Re: Option to make unsigned->signed conversion always well-defined?
On Thu, Oct 6, 2011 at 11:04 AM, Miles Bader wrote:
> Ulf Magnusson writes:
>> Might as well do
>>
>> bool overflowbit(unsigned int a, unsigned int b) {
>>     const unsigned int sum = a + b;
>>     return (a ^ b) & ~(a ^ sum) & 0x80;
>> }
>>
>> But still not very good output compared to other approaches as expected.
>
> How about:
>
> bool overflowbit2(unsigned int a, unsigned int b)
> {
>     const unsigned int sum = a + b;
>     return ~(a ^ b) & sum & 0x80;
> }
>
> ?
>
> I think it has the same results as your function...
> [I just made a table of all 8 possibilities, and checked!]
>
> -miles
>
> --
> Circus, n. A place where horses, ponies and elephants are permitted
> to see men, women and children acting the fool.

Oops, that should have been

return ~(a ^ b) & (a ^ sum) & 0x80

~(a ^ b) gives 1 in the sign bit position if the signs are the same, and (a ^ sum) gives 1 if the sum's sign is different. A clearer way of writing it (that also generates suboptimal code) is

bool overflow(unsigned int a, unsigned int b) {
    const unsigned asign = a & 0x80;
    const unsigned bsign = b & 0x80;
    const unsigned sumsign = (a + b) & 0x80;
    return (asign == bsign) && (asign != sumsign);
}

Seems bit-fiddling isn't the way to go. Maybe I should take this to gcc-help as it isn't really development-related.

/Ulf
Re: Option to make unsigned->signed conversion always well-defined?
(I'll cross-post this to gcc and keep it on gcc-help after that.)

On Thu, Oct 6, 2011 at 4:46 PM, Andrew Haley wrote:
>
> inline int8_t as_signed_8 (unsigned int a) {
>     a &= 0xff;
>     return a & 0x80 ? (int)a - 0x100 : a;
> }
>
> int overflow(unsigned int a, unsigned int b) {
>     int sum = as_signed_8(a) + as_signed_8(b);
>     return as_signed_8(sum) != sum;
> }
>
> Andrew.

That's a really neat trick, and it seems to generate identical code. Thanks! It'd be interesting to know if this version produces equally efficient code with MSVC.

To summarize what we have so far, here are four different methods along with the code generated for x86 and ARM (GCC 4.5.2):

#include <stdint.h>

inline int8_t as_signed_8(unsigned int a) {
    a &= 0xff;
    return a & 0x80 ? (int)a - 0x100 : a;
}

bool overflow_range(unsigned int a, unsigned int b) {
    const int sum = as_signed_8(a) + as_signed_8(b);
    return sum < -128 || sum > 127;
}

bool overflow_bit(unsigned int a, unsigned int b) {
    const unsigned int sum = a + b;
    return ~(a ^ b) & (a ^ sum) & 0x80;
}

bool overflow_unsafe(unsigned int a, unsigned int b) {
    const unsigned int sum = (int8_t)a + (int8_t)b;
    return (int8_t)sum != sum;
}

bool overflow_safe(unsigned int a, unsigned int b) {
    const int sum = as_signed_8(a) + as_signed_8(b);
    return as_signed_8(sum) != sum;
}

Output for x86 with -O3 -fomit-frame-pointer:

<_Z14overflow_rangejj>:
   0: 0f be 54 24 04          movsbl 0x4(%esp),%edx
   5: 0f be 44 24 08          movsbl 0x8(%esp),%eax
   a: 8d 84 02 80 00 00 00    lea    0x80(%edx,%eax,1),%eax
  11: 3d ff 00 00 00          cmp    $0xff,%eax
  16: 0f 97 c0                seta   %al
  19: c3                      ret
  1a: 8d b6 00 00 00 00       lea    0x0(%esi),%esi

0020 <_Z12overflow_bitjj>:
  20: 8b 54 24 08             mov    0x8(%esp),%edx
  24: 8b 4c 24 04             mov    0x4(%esp),%ecx
  28: 89 d0                   mov    %edx,%eax
  2a: 31 c8                   xor    %ecx,%eax
  2c: 01 ca                   add    %ecx,%edx
  2e: 31 ca                   xor    %ecx,%edx
  30: f7 d0                   not    %eax
  32: 21 d0                   and    %edx,%eax
  34: a8 80                   test   $0x80,%al
  36: 0f 95 c0                setne  %al
  39: c3                      ret
  3a: 8d b6 00 00 00 00       lea    0x0(%esi),%esi

0040 <_Z15overflow_unsafejj>:
  40: 0f be 54 24 08          movsbl 0x8(%esp),%edx
  45: 0f be 44 24 04          movsbl 0x4(%esp),%eax
  4a: 8d 04 02                lea    (%edx,%eax,1),%eax
  4d: 0f be d0                movsbl %al,%edx
  50: 39 c2                   cmp    %eax,%edx
  52: 0f 95 c0                setne  %al
  55: c3                      ret
  56: 8d 76 00                lea    0x0(%esi),%esi
  59: 8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi

0060 <_Z13overflow_safejj>:
  60: 0f be 54 24 08          movsbl 0x8(%esp),%edx
  65: 0f be 44 24 04          movsbl 0x4(%esp),%eax
  6a: 8d 04 02                lea    (%edx,%eax,1),%eax
  6d: 0f be d0                movsbl %al,%edx
  70: 39 c2                   cmp    %eax,%edx
  72: 0f 95 c0                setne  %al
  75: c3                      ret

Output for ARM with -O3 -fomit-frame-pointer -mthumb -march=armv7:

<_Z14overflow_rangejj>:
   0: b249        sxtb  r1, r1
   2: b240        sxtb  r0, r0
   4: 1808        adds  r0, r1, r0
   6: 3080        adds  r0, #128  ; 0x80
   8: 28ff        cmp   r0, #255  ; 0xff
   a: bf94        ite   ls
   c: 2000        movls r0, #0
   e: 2001        movhi r0, #1
  10: 4770        bx    lr
  12: bf00        nop
  14: f3af 8000   nop.w
  18: f3af 8000   nop.w
  1c: f3af 8000   nop.w

0020 <_Z12overflow_bitjj>:
  20: 180b        adds  r3, r1, r0
  22: 4041        eors  r1, r0
  24: ea83 0200   eor.w r2, r3, r0
  28: ea22 0001   bic.w r0, r2, r1
  2c: f3c0 10c0   ubfx  r0, r0, #7, #1
  30: 4770        bx    lr
  32: bf00        nop
  34: f3af 8000   nop.w
  38: f3af 8000   nop.w
  3c: f3af 8000   nop.w

0040 <_Z15overflow_unsafejj>:
  40: b242        sxtb  r2, r0
  42: b249        sxtb  r1, r1
  44: 1888        adds  r0, r1, r2
  46: b243        sxtb  r3, r0
  48: 1a18        subs  r0, r3, r0
  4a: bf18        it    ne
  4c: 2001        movne r0, #1
  4e: 4770        bx    lr

0050 <_Z13overflow_safejj>:
  50: b242        sxtb  r2, r0
  52: b249        sxtb  r1, r1
  54: 1888        adds  r0, r1, r2
  56: b243        sxtb  r3, r0
  58: 1a18        subs  r0, r3, r0
  5a: bf18        it    ne
  5c: 2001        movne r0, #1
  5e: 4770        bx    lr

Not sure which version would be fastest on ARM (
Re: Option to make unsigned->signed conversion always well-defined?
On Thu, Oct 6, 2011 at 11:31 PM, Florian Weimer wrote:
> * Ulf Magnusson:
>
>> I've been experimenting with different methods for emulating the
>> signed overflow of an 8-bit CPU. The method I've found that seems to
>> generate the most efficient code on both ARM and x86 is
>>
>> bool overflow(unsigned int a, unsigned int b) {
>>     const unsigned int sum = (int8_t)a + (int8_t)b;
>>     return (int8_t)sum != sum;
>> }
>
> There's a GCC extension which is relevant here:
>
> | For conversion to a type of width N, the value is reduced modulo 2^N
> | to be within range of the type; no signal is raised.
>
> <http://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation>
>
> Using that, you can replace the final "& 0x80" with a signed
> comparison to zero, which should give you the best possible code
> (for the generic RISC). You only need to hunt down a copy of Hacker's
> Delight or find the right bit twiddling by other means. 8-)

Are you thinking of something like this?

bool overflow_bit2(unsigned int a, unsigned int b) {
    const unsigned int ashift = a << 24;
    const unsigned int bshift = b << 24;
    const unsigned int sum = a + b;
    return (int)(~(a ^ b) & (a ^ sum)) < 0;
}

That version generates

  80: 180b        adds  r3, r1, r0
  82: 4041        eors  r1, r0
  84: ea83 0200   eor.w r2, r3, r0
  88: ea22 0001   bic.w r0, r2, r1
  8c: 0fc0        lsrs  r0, r0, #31
  8e: 4770        bx    lr

whereas the unshifted version generates

  40: 180b        adds  r3, r1, r0
  42: 4041        eors  r1, r0
  44: ea83 0200   eor.w r2, r3, r0
  48: ea22 0001   bic.w r0, r2, r1
  4c: f3c0 10c0   ubfx  r0, r0, #7, #1
  50: 4770        bx    lr

So maybe a bit better. (I'm no ARM pro, but the compiler does seem to take advantage of the fact that it's testing the real sign bit at least.) Btw, & 0x8000 generates the same code.

/Ulf
Re: Option to make unsigned->signed conversion always well-defined?
On Fri, Oct 7, 2011 at 7:35 PM, Florian Weimer wrote:
> * Ulf Magnusson:
>
>> Are you thinking of something like this?
>>
>> bool overflow_bit2(unsigned int a, unsigned int b) {
>>     const unsigned int ashift = a << 24;
>>     const unsigned int bshift = b << 24;
>>     const unsigned int sum = a + b;
>>     return (int)(~(a ^ b) & (a ^ sum)) < 0;
>> }
>
> Yes, but rather like:
>
> bool overflow_bit2(unsigned char a, unsigned char b) {
>     const unsigned char sum = a + b;
>     return ((signed char)(~(a ^ b) & (a ^ sum))) < 0;
> }
>
> It still results in abysmal code, given that this should result in two
> or three instructions on most architectures.
>
> Are machine code insertions an option?

Tried that version, but it seems to generate worse (or bigger, anyway; I haven't benchmarked it) code:

  90: eb01 0c00   add.w ip, r1, r0
  94: b2c2        uxtb  r2, r0
  96: ea82 030c   eor.w r3, r2, ip
  9a: ea82 0101   eor.w r1, r2, r1
  9e: ea23 0001   bic.w r0, r3, r1
  a2: f3c0 10c0   ubfx  r0, r0, #7, #1
  a6: 4770        bx    lr
  a8: f3af 8000   nop.w
  ac: f3af 8000   nop.w

Good machine code would be fun to see, though I might need to brush up on my ARM.

/Ulf
Re: [C++] Possible GCC bug
On Wed, Nov 14, 2012 at 6:10 PM, Piotr Wyderski wrote:
> The following snippet:
>
> class A {};
>
> class B : public A {
>     typedef A super;
>
> public:
>     class X {};
> };
>
> class C : public B {
>     typedef B super;
>
>     class X : public super::X {
>         typedef super::X super;
>     };
> };
>
> compiles without a warning on Comeau and MSVC, but GCC (4.6.1 and
> 4.7.1) fails with the following message:
>
> $ gcc -c bug.cpp
> bug.cpp:18:24: error: declaration of ‘typedef class B::X C::X::super’
> [-fpermissive]
> bug.cpp:14:14: error: changes meaning of ‘super’ from ‘typedef class B
> C::super’ [-fpermissive]
>
> Should I file a report?
>
> Best regards, Piotr

Here's a two-line test case:

typedef struct { typedef int type; } s1;
struct S2 { s1::type s1; };

Fails with GCC 4.6.3; succeeds with clang 3.0. Looks like a bug to me.

/Ulf