Re: approaches to carry-flag modelling in RTL
On 28/10/11 17:59, Richard Henderson wrote:
On 10/28/2011 06:49 AM, Peter Bigot wrote:

I'm inclined to follow sparc's lead, but is one or another of the choices more likely to help combine/reload/etc do a better job?

I don't know. In the case of RX, we don't model CC_REG until after reload, so combine really doesn't get a shot at it.

Be careful here. If you explicitly model the carry flag before reload, you need to have an ADD instruction that can avoid any flags modification. Reload needs to generate such instructions in some cases, and needs to be able to insert them between any two arbitrary insns.

I followed rx model but instead RCC before clobber in some cases. To solve this problem I have the following insn_and_split:

(define_insn_and_split "addqi3_noclobber"
  [(set (match_operand:QI 0 "register_operand" "=c")
        (plus:QI (match_operand:QI 1 "register_operand")
                 (match_operand:QI 2 "immediate_operand")))]
  "reload_in_progress"
  "#"
  "reload_completed"
  [(parallel [(set (match_dup 0) (match_dup 1))
              (clobber (reg:CC RCC))])
   (parallel [(set (match_dup 0) (plus:QI (match_dup 0) (match_dup 2)))
              (clobber (reg:CC RCC))])])

Cheers,
-- PMatos
Re: approaches to carry-flag modelling in RTL
On 29/10/11 18:33, Peter Bigot wrote:
On Sat, Oct 29, 2011 at 10:58 AM, Richard Henderson wrote:
On 10/29/2011 05:41 AM, Peter Bigot wrote:

It seems cc0 should probably still be preferred for CISC-style architectures like the MSP430. I'll give that approach a try.

I think that's somewhat unfair. Take a close look at the RX and mn10300 ports -- they're what I would call the most up-to-date of the cisc-y ports.

Using CC_MODE in this case certainly appears much more complicated than using cc0. Do you believe the effort required to get it right would be justified? Where would the win come from?

I can provide you with some insight here. We have a somewhat cisc-y port. I converted from cc0 to REG_CC based on the technique used in RX and MN10300.

Cons:
* Conversion took 2 weeks and increased the size of the backend due to the duplication of and ;
* My somewhat weird backend has strange sizes, so I ended up having to modify core GCC in some parts: to set CC_MODE size and to enable moves clobbering flags;
* Some rules are now much harder to read;
* compare-elim.c doesn't work 100%, since the pass assumes rules have a register as destination. If yours have a memory operand, elimination won't succeed;

Pros:
* 0.7% code size reduction;
* Hoping that future GCC versions will ease the use of a pseudo FLAGS_REG, improve compare-elim and boost code size reduction further.

Some colleagues said the change probably didn't make sense. I am still slightly hopeful in the longer term to show them otherwise.

Cheers,
-- PMatos
Re: Adding official support into the main tree for SPARC Leon
[CCing David Miller, the SPARC binutils maintainer]

> I want to once again ask for write credentials so that
> I can submit patches for the sparc-leon architecture:
> The first patch is for the 'gcc' repository while the
> second patch is for the 'binutils' repository. They are both
> related so I think it makes sense to send them together.
> I don't have write access to binutils either so, I thought you
> might be able to apply them on both.
> Some background: Leon supports the umac/smac instructions.
> The Leon3-Ft and Leon4 also support the SMP compare-and-swap (casa)

OK, so you're proposing a new 'leon' sub-architecture for binutils.

> The appended 2 patches do:
> 1. 0001-sparc-leon-Use-Aleon-assembler-switch-for-mcpu-leon-.patch
>    Append "-Aleon" to the assembler

This looks incomplete. Don't you also want to enable the instructions?

> 2. 0001-sparc-leon-add-leon-architecture-to-GAS.patch
>    Define new "leon" processor type in GAS + enable for "leon"
>    umac/smac and "casa".

The configure.tgt change looks useless to me.
Other nits:

@@ -1668,9 +1671,8 @@ EFPOP2_2 ("efcmpes", 0x055, "e,f"),
 { "cpop2", F3(2, 0x37, 0), F3(~2, ~0x37, ~1), "[1+2],d", F_ALIAS, v6notv9 },
 /* sparclet specific insns */
-
-COMMUTEOP ("umac", 0x3e, sparclet),
-COMMUTEOP ("smac", 0x3f, sparclet),
+COMMUTEOP ("umac", 0x3e, sparclet|MASK_LEON),
+COMMUTEOP ("smac", 0x3f, sparclet|MASK_LEON),
 COMMUTEOP ("umacd", 0x2e, sparclet),
 COMMUTEOP ("smacd", 0x2f, sparclet),
 COMMUTEOP ("umuld", 0x09, sparclet),

sparclet|leon

-{ "casa", F3(3, 0x3c, 0), F3(~3, ~0x3c, ~0), "[1]A,2,d", 0, v9 },
-{ "casa", F3(3, 0x3c, 1), F3(~3, ~0x3c, ~1), "[1]o,2,d", 0, v9 },
+{ "casa", F3(3, 0x3c, 0), F3(~3, ~0x3c, ~0), "[1]A,2,d", 0, v9|MASK_LEON },
+{ "casa", F3(3, 0x3c, 1), F3(~3, ~0x3c, ~1), "[1]o,2,d", 0, v9|MASK_LEON },

v9|leon

+{ "cas", F3(3, 0x3c, 0)|ASI(0x80), F3(~3, ~0x3c, ~0)|ASI(~0x80), "[1],2,d", F_ALIAS, v9|MASK_LEON }, /* casa [rs1]ASI_P,rs2,rd */
+{ "casl", F3(3, 0x3c, 0)|ASI(0x88), F3(~3, ~0x3c, ~0)|ASI(~0x88), "[1],2,d", F_ALIAS, v9|MASK_LEON }, /* casa [rs1]ASI_P_L,rs2,rd */

Likewise.

-- 
Eric Botcazou
Re: approaches to carry-flag modelling in RTL
On 31/10/11 05:36, Hans-Peter Nilsson wrote: BTW, I don't think it helps that someone decided the canonical form of a parallel that includes a CC-setter must have the CC-setting *first* (contrasting with the position of clobbers)... How did you reach this conclusion? -- PMatos
What does multiple DW_OP_piece mean in DWARF?
Hi all,

Could someone tell me whether the following sequence of DWARF information is correct please, and if it is, how it should be interpreted? GCC emits something like the following [1]:

    .byte 0x75      # DW_OP_breg5
    .sleb128 0
    .byte 0x93      # DW_OP_piece
    .uleb128 0x4
    .byte 0x93      # DW_OP_piece
    .uleb128 0x4

Is it valid to emit two DW_OP_pieces with no separating location? My copy of the spec for DWARF (v4 taken from www.dwarfstd.org) seems to suggest that all pieces must have a location preceding them. There is also a comment in dwarf2out.c which says:

    "DW_OP_piece is only added if the location description expression already doesn't end with DW_OP_piece"

so it would seem that two contiguous pieces are wrong, but this occurs so frequently that I wonder if it is correct output after all.

I am working on a symbolic debugger for the Picochip platform, and need to understand why this sequence is emitted, and what the debugger should do with it. I can supply test cases if necessary, but hopefully someone may know if this sequence is intentional or not.

thanks,
dan.

[1] Using `gcc -O1 -dA -S', on versions 4.4.5 and 4.6.2 on x86_64 and Picochip. There are subtle variations, but the same basic pattern keeps reappearing.
Re: What does multiple DW_OP_piece mean in DWARF?
On Mon, Oct 31, 2011 at 12:01:00PM -, Dan Towner wrote:
> Could someone tell me whether the following sequence of DWARF information
> is correct please, and if it is, how it should be interpreted? GCC emits
> something like the following [1]:
>
> .byte 0x75      # DW_OP_breg5
> .sleb128 0
> .byte 0x93      # DW_OP_piece
> .uleb128 0x4
> .byte 0x93      # DW_OP_piece
> .uleb128 0x4
>
> Is it valid to emit two DW_OP_pieces with no separating location? My copy

Yes.

> of the spec for DWARF (v4 taken from www.dwarfstd.org) seems to suggest
> that all pieces must have a location preceeding them. There is also a
> comment in dwarf2out.c which says:
>
>    "DW_OP_piece is only added if the location description expression already
> doesn't end with DW_OP_piece"
>
> so it would seem like two contiguous pieces is wrong, but it seems to
> occur so frequently I wonder if it is a correct output after all.

DW_OP_piece is preceded by a simple location description (DWARF4, 2.6.1.1). And one of the valid simple location descriptions is the empty location description:

  2.6.1.1.4 Empty Location Descriptions
  An empty location description consists of a DWARF expression containing no operations. It represents a piece or all of an object that is present in the source but not in the object code (perhaps due to optimization).

So the above means that the first 4 bytes of the variable live in memory pointed to by register 5 and the second 4 bytes of the variable are optimized out.

Jakub
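For a debugger author, the interpretation above can be sketched as a small piece splitter. This is only an illustrative sketch, not GDB's or any real consumer's code: the function names are made up for the example, only DW_OP_bregN and DW_OP_piece are handled, and an empty location description is represented as None.

```python
# Minimal sketch: split a DWARF location expression into pieces.
# Only the two opcodes from the example are handled; the helper names
# are hypothetical and this is not a full DWARF expression decoder.

DW_OP_BREG0 = 0x70   # DW_OP_breg0 .. DW_OP_breg31 occupy 0x70 .. 0x8f
DW_OP_PIECE = 0x93


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_sleb128(data, pos):
    """Decode a signed LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:          # sign bit of the final byte
                result -= 1 << shift
            return result, pos


def split_pieces(expr):
    """Return a list of (location, size_in_bytes) pairs.

    location is None when the piece is preceded by an empty location
    description, i.e. that part of the object has no location.
    """
    pieces = []
    location = None
    pos = 0
    while pos < len(expr):
        op = expr[pos]
        pos += 1
        if DW_OP_BREG0 <= op <= DW_OP_BREG0 + 31:
            offset, pos = read_sleb128(expr, pos)
            location = ("breg", op - DW_OP_BREG0, offset)
        elif op == DW_OP_PIECE:
            size, pos = read_uleb128(expr, pos)
            pieces.append((location, size))
            location = None          # next piece starts from an empty description
        else:
            raise ValueError("unhandled opcode 0x%02x" % op)
    return pieces


# The byte sequence from the question.
expr = bytes([0x75, 0x00, 0x93, 0x04, 0x93, 0x04])
for location, size in split_pieces(expr):
    print(location, size)
```

For the sequence in question this yields a 4-byte piece at offset 0 from register 5, followed by a 4-byte piece with no location, which a debugger would typically report as optimized out.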
Re: approaches to carry-flag modelling in RTL
Quoting Hans-Peter Nilsson :

> I came to somewhat the same conclusion for CRIS where all insns set
> condition codes except move to memory and a "add reg1,reg2" (no
> immediate operand) and to/from special registers: there'll be one
> clobbering and one CC_REG-setting pattern plus a load of others
> (peephole2's mostly) to get an exact match. What would help is a kind
> of iterator that (also) affects the form of the insn, so you could
> match the clobbering and the cc0-setting insn in the same
> (iterator-using) pattern.

match_parallel is useful for recognizing insns with varying clobbers. Unfortunately, it is absolutely useless when you want a reasonable expander, so then you end up having a separate expander pattern, even if all you want to do is have the structure of the match_parallel without extra clobbers. Repeat for a dozen similar instructions, and it gets really annoying.
Re: Mercurial mirror
Simon Wright writes: > The Mercurial mirror at http://gcc.gnu.org/hg/gcc was last updated 11 months > ago at SVN r166522. > > I think it can only cause confusion to have the mirror live but stale; ought > it to be turned off? Agreed. Besides, nowadays you don't need a mirror anymore. I'm having good success with the hgsubversion extension (http://mercurial.selenic.com/wiki/HgSubversion in combination with Subvertpy), at least when only dealing with a single branch. Cloning the whole repo failed after ca. 102000 (of 18) revs with an assertion failure. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Potentially merging cxx-mem-model with mainline.
> It does not address other missing aspects of the c++ memory model. In
> particular, bitfields are still not compliant with not introducing new
> potential data races.

Can we treat this as a bugfix to be done during stage2? There is already some support in mainline, but it performs poorly on anything but the most simple of testcases. After many iterations with Richi, we couldn't come to an agreement on the remaining fixes. I would like to take another stab at this during stage2, if it is OK with Richi and others.
Re: # of unexpected failures 768 ?
> Then if I use the resultant compiler from a 4.6.1 build I get a massive > increase in failures on both i386 and Sparc : > > http://gcc.gnu.org/ml/gcc-testresults/2011-10/msg03286.html This is unexpected, results are clean here: http://gcc.gnu.org/ml/gcc-testresults/2011-10/msg03536.html > Also, I see bucket loads of these : > > FAIL: g++.dg/pch/wchar-1.C -O2 -g -I. (internal compiler error) PCH seems to be totally broken. AFAIK nothing has changed between 4.6.1 and 4.6.2 in this area. Can you try with 4.6.1? Did you change the machines? -- Eric Botcazou
Re: # of unexpected failures 768 ?
Dennis Clarke writes: > I'm not too sure how many things changed from 4.6.1 to 4.6.2 but I am > seeing a really large increase in the number of "unexpected failures" on > various tests. > > With 4.6.1 and Solaris I was able to get reasonable results : > > http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg00139.html > > Then if I use the resultant compiler from a 4.6.1 build I get a massive > increase in failures on both i386 and Sparc : > > http://gcc.gnu.org/ml/gcc-testresults/2011-10/msg03286.html FAIL: g++.dg/ext/visibility/fvisibility-inlines-hidden-2.C scan-not-hidden All the scan-not-hidden failures are usually an indication that objdump isn't in your PATH. > This seems blatantly wrong. At what point does one throw out the result of > a bootstrap as not-acceptable ? With any non-zero value for "unexpected > failures" ? There's no such number, only comparisons to other testsuite results. In many cases (e.g. in the scan-not-hidden failures above), there's nothing wrong with the compiler, just with the test environment. And in your case, only two problems account for the vast majority of the failures. > Also, I see bucket loads of these : > > FAIL: g++.dg/pch/wchar-1.C -O2 -g -I. (internal compiler error) > > What should I think about an "internal compiler error" ? This seems fundamentally broken on your machine. With the exception of the largefile.c testcases, those pass everywhere else, so you'd have to debug what's going on there. FAIL: gcc.c-torture/compile/limits-exprparen.c -O0 (internal compiler error) [...] FAIL: gcc.c-torture/compile/limits-structnest.c -O2 (test for excess errors) WARNING: program timed out. Those test cases have excessive stack space or runtime requirements and are known to fail on slow machines or those with default resource limits. Those are known testcase bugs, but nobody cared about this so far ;-( Overall, your results don't look bad to me, once you've installed objdump and investigated the PCH failures. 
As an aside, I'd suggest considerably reducing your set of configure options: many of them are the default (like --without-gnu-ld --with-ld=/usr/ccs/bin/ld, --enable-nls, --enable-threads=posix, --enable-shared, --enable-multilib, --host=i386-pc-solaris2.8 --build=i386-pc-solaris2.8) or unnecessary (--enable-stage1-languages=c). I'm uncertain if Solaris 8/x86 still supports bare i386 machines, so it might be better to keep the default of pentiumpro instead.

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Need help resolving PR target/50906
Hello, I'm looking for advice on how to debug and fix PR50906: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50906 The basic summary is that GCC is generating bad unwinder information or is incorrectly saving registers onto the stack on PowerPC e500v2. Any ordinary function call+return works fine, but exception handling fails miserably trying to unwind the stack (as does GDB). I have not yet been able to figure out if it's a libgcc issue or an actual compiler issue. Any advice would be highly appreciated. If someone would like to help, I can walk you through a quick e500v2 qemu test-harness setup on an amd64 system. Cheers, Kyle Moffett -- Curious about my work on the Debian powerpcspe port? I'm keeping a blog here: http://pureperl.blogspot.com/
Re: # of unexpected failures 768 ?
On 31 October 2011 15:33, Rainer Orth wrote:
> As an aside, I'd suggest to considerably reduce your set of configure
> options: many of them are the default (like --without-gnu-ld
> --with-ld=/usr/ccs/bin/ld, --enable-nls, --enable-threads=posix,
> --enable-shared, --enable-multilib, --host=i386-pc-solaris2.8
> --build=i386-pc-solaris2.8) or unnecessary
> (--enable-stage1-languages=c).

Yes, adding completely redundant options looks like cargo cult programming and just confuses anyone using the compiler who tries to work out how it was configured.

> I'm uncertain if Solaris 8/x86 still supports bare i386 machines, so it
> might be better to keep the default of pentiumpro instead.

Solaris 8 won't run on anything less than pentium. I recently convinced someone else to stop building GCC for i386 on Solaris:

http://gcc.gnu.org/ml/gcc-help/2011-10/msg5.html
Re: # of unexpected failures 768 ?
> On 31 October 2011 15:33, Rainer Orth wrote:
>> As an aside, I'd suggest to considerably reduce your set of configure
>> options: many of them are the default (like --without-gnu-ld
>> --with-ld=/usr/ccs/bin/ld, --enable-nls, --enable-threads=posix,
>> --enable-shared, --enable-multilib, --host=i386-pc-solaris2.8
>> --build=i386-pc-solaris2.8) or unnecessary
>> (--enable-stage1-languages=c).
>
> Yes, adding completely redundant options looks like cargo cult
> programming and just confuses anyone using the compiler who tries to
> work out how it was configured.
>
>> I'm uncertain if Solaris 8/x86 still supports bare i386 machines, so it
>> might be better to keep the default of pentiumpro instead.
>
> Solaris 8 won't run on anything less than pentium, I recently
> convinced someone else to stop building GCC for i386 on Solaris:
>
> http://gcc.gnu.org/ml/gcc-help/2011-10/msg5.html

The OS is on Vintage support until March 2012. Also, I never had problems with it before. As for "completely redundant options", I have been building gcc like this for a while. Also never a problem before.

Is this a case of a "magic configure incantation" being required? I certainly hope not.

Dennis

-- 
http://pgp.mit.edu:11371/pks/lookup?op=vindex&search=0x1D936C72FA35B44B
| Dennis Clarke         | Solaris and Linux and Open Source |
| dcla...@blastwave.org | Respect for open standards.       |
Re: # of unexpected failures 768 ?
Dennis Clarke writes:
>>> I'm uncertain if Solaris 8/x86 still supports bare i386 machines, so it
>>> might be better to keep the default of pentiumpro instead.
>>
>> Solaris 8 won't run on anything less than pentium, I recently
>> convinced someone else to stop building GCC for i386 on Solaris:
>>
>> http://gcc.gnu.org/ml/gcc-help/2011-10/msg5.html
>
> The Os is on Vintage support until March 2012. Also, I never had problems

That's not the question (but one reason why Solaris 8 support will be removed after GCC 4.7). As Jonathan documented, you can't run S8 on a bare 80386, so there's no reason to default code generation to that CPU.

> with it before. As for "completely redundant options" I have been building
> gcc like this for a while. also never a problem before.
>
> This is a case of "magic configure incantation" required ? I certainly
> hope not.

Quite the contrary: leave out any configure option unless you absolutely need it because the defaults don't work, document why you need it, and re-check that info for every release. If you think configure should detect the condition on its own, file a bug report for that.

These `I've used them forever' options tend to do more harm than good, and confuse other users who check how your copy of gcc was built. This is especially bad for distributors like yourself, since the number of confused people is far larger than for some company-internal build ;-)

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: # of unexpected failures 768 ?
On 31 October 2011 17:38, Rainer Orth wrote:
> Dennis Clarke writes:
>>>> I'm uncertain if Solaris 8/x86 still supports bare i386 machines, so it
>>>> might be better to keep the default of pentiumpro instead.
>>>
>>> Solaris 8 won't run on anything less than pentium, I recently
>>> convinced someone else to stop building GCC for i386 on Solaris:
>>>
>>> http://gcc.gnu.org/ml/gcc-help/2011-10/msg5.html
>>
>> The Os is on Vintage support until March 2012. Also, I never had problems
>
> That's not the question (but one reason why Solaris 8 support will be
> removed after GCC 4.7). As Jonathan documented, you can't run S8 on a
> bare 80386, so there's no reason the default code generation to that CPU.

Quite. In fact there are *very* good reasons not to configure for 80386: libstdc++'s configure uses the default arch being configured for, and disables a number of features on i386 because it doesn't support the required atomic ops. So by configuring for i386 you will distribute a GCC package that is missing useful features, but supports an ancient architecture that Solaris doesn't even run on.

You should configure for pentium-pc-solaris2.8 or use --with-arch-32=pentium
Re: Potentially merging cxx-mem-model with mainline.
Aldy Hernandez writes:
> Can we treat this as a bugfix to be done during stage2? There is
> already some support in mainline, but it performs lousy on anything but
> the most simple of testccases. After much iterations with Richi, we
> couldn't come to an agreement on the remaining fixes. I would like to
> take another stab at this during stage2, if it is OK with Richi and
> others.

No opinion on your actual question, but note that there is no more stage2. We now go directly from stage1 to stage3. This is just another feature of gcc development seemingly designed to confuse newbies, and evidently even confuses experienced developers.

Ian
Re: # of unexpected failures 768 ?
> These `I've used them for ever' options tend to do more harm than good, > and confuse other users that check how your copy of gcc was built. This > is especially bad for distributors like yourself, since the number of > confused people is far larger than for some company-internal build ;-) > > Rainer * nod * Will redo ... and see what I get. Thanks for the input. Dennis
Potentially merging the transactional-memory branch into mainline.
This is somewhat of a me-too message for the transactional-memory work. We would also like it to be considered for merging with mainline before the end of stage1. We have kept a wiki here:

http://gcc.gnu.org/wiki/TransactionalMemory

What it is
==========

From the wiki...

Transactional memory is intended to make programming with threads simpler. As with databases, a transaction is a unit of work that either completes in its entirety or has no effect at all. Further, transactions are isolated from each other such that each transaction sees a consistent view of memory.

Transactional memory comes in two forms: a Software Transactional Memory (STM) system uses locks or other standard atomic instructions to do its job. A Hardware Transactional Memory (HTM) system uses features of the cpu to implement the requirements of the transaction directly (e.g. Rock processor). Most HTM systems are best effort, which means that the transaction can fail for unrelated reasons. Thus almost all systems that incorporate HTM also have a STM component and are thus termed Hybrid Transactional Memory systems.

The transactional memory system to be implemented in GCC provides single lock atomicity semantics. That is, a program behaves as if a single global lock guards each transaction.

What it involves
================

We have implemented the latest spec from the multi-vendor transactional memory group that includes AMD, Intel, Oracle, and others. The last official spec is what is in the wiki above, yet there are some minor changes to the keywords that are currently being finalized in the final document (but have already been agreed upon), and will be published shortly. It is my understanding (Torvald, correct me if I'm wrong), that the current implementation is what has been agreed to by the committee, and has been given a favorable nod by various members of the C++ standardization committee. Most importantly, the keywords are agreed upon.
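The "single lock atomicity" description above can be illustrated with a small sketch. It is written in Python purely for illustration and has nothing to do with how libitm is implemented: the `atomic` decorator and `_global_txn_lock` are hypothetical names, and a plain lock models only the isolation part of the semantics, not aborting a transaction so that it has no effect.

```python
import threading

# Hypothetical stand-in for the single global lock of the "as if" model.
_global_txn_lock = threading.Lock()


def atomic(func):
    """Run func as if a single global lock guarded every transaction."""
    def wrapper(*args, **kwargs):
        with _global_txn_lock:
            return func(*args, **kwargs)
    return wrapper


balances = {"a": 100, "b": 0}


@atomic
def transfer(src, dst, amount):
    # Both updates become visible together; no other transaction can
    # observe the state between the two statements.
    balances[src] -= amount
    balances[dst] += amount


@atomic
def total():
    return sum(balances.values())


def worker():
    for _ in range(1000):
        transfer("a", "b", 1)
        transfer("b", "a", 1)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(total())
```

Every call to `total()` sees a consistent view, so the sum is always 100 no matter how the worker threads interleave. A real STM or HTM system is free to run transactions concurrently as long as the observable behaviour matches this model.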
There are changes to the C and C++ front-end, and a software library (libitm) to go along with it. The library works on x86-64, x86-32, and Richard's favorite, Alpha :-). Porting to other architectures should be a straightforward affair.

Status
======

The current implementation runs the common TM benchmarks correctly, albeit there is still work to be done to improve performance. There are a handful of failed compiler tests on the included transactional memory testsuite (g*.dg/tm/*), but they are all missed optimizations, which we hope to have fixed after the merge.

What's left
===========

Torvald is working on some recent changes to noexcept, and we should have this working in a few days. I will be removing the cancel-throw construct, which didn't make it into the final spec. I should have that done tomorrow.

The final word
==============

Seeing that a global maintainer has been the lead on this for a while, I suspect there isn't much to review formally. I believe the only bits that Richard isn't directly responsible for are the C++ front-end changes.

So what is the opinion/consensus on merging the branch? It would be nice to get this infrastructure in place as well so we can get people to start using it, and then we can work out any issues that arise.

I have no idea how this happened, but apparently I'm on the hook for merging both the cxx-mem-model and this branch (if/when one/both get approved). If this gets approved, I'd prefer to get the cxx-mem-model branch merged first, and the transactional-memory branch later during the week. I will be partially available during the weekend, and definitely during next week.
Re: approaches to carry-flag modelling in RTL
On Mon, 31 Oct 2011, Paulo J. Matos wrote:
> On 31/10/11 05:36, Hans-Peter Nilsson wrote:
> > BTW, I
> > don't think it helps that someone decided the canonical form of
> > a parallel that includes a CC-setter must have the CC-setting
> > *first* (contrasting with the position of clobbers)...
>
> How did you reach this conclusion?

Not obvious, or maybe I was unclear as to what I alluded to? In the insn bodies below, "sub" is the insn that sets cc0 as a side-effect.

Supposed canonical form:

  (parallel [(set (cc_reg) (compare ...))
             (set (destreg) (sub ...))])

and:

  (parallel [(set (destreg) (sub ...))
             (clobber cc_reg)])

But IMHO it'd be easier (for most values of "easier") to combine both patterns with that non-existing mechanism (and no, I don't count match_parallel) if we instead canonicalized on the CC_REG set being in the same position as the clobber:

  (parallel [(set (destreg) (sub ...))
             (set (cc_reg) (compare ...))])

with:

  (parallel [(set (destreg) (sub ...))
             (clobber cc_reg)])

brgds, H-P
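The ordering argument can be made concrete with a toy model. The sketch below is only an illustration in Python, not GCC code: the tuples are simplified stand-ins for the RTL bodies above, and `shared_prefix` is a made-up helper that counts how many leading elements two parallels have in common, which is roughly what a shared (iterator-using) pattern would need to exploit.

```python
def shared_prefix(a, b):
    """Count the leading elements that two parallel bodies have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


# Simplified stand-ins for the elements of the parallels.
SET_DEST = ("set", "destreg", "sub")
SET_CC = ("set", "cc_reg", "compare")
CLOBBER_CC = ("clobber", "cc_reg")

canonical_setting = [SET_CC, SET_DEST]    # CC set first (current canonical form)
proposed_setting = [SET_DEST, SET_CC]     # CC set in the clobber's position
clobbering = [SET_DEST, CLOBBER_CC]

print(shared_prefix(canonical_setting, clobbering))  # 0: nothing lines up
print(shared_prefix(proposed_setting, clobbering))   # 1: only the last element differs
```

With the CC set placed where the clobber goes, the two forms differ only in their final element, which is the property the proposed canonicalization is after.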
Re: C++11 no longer experimental
I've checked this in with some tweaks. Jason