Re: Threading the compiler
On Fri, 10 Nov 2006, Mike Stump wrote:
>> On Nov 10, 2006, at 12:46 PM, H. J. Lu wrote:
>> Will use C++ help or hurt compiler parallelism? Does it really matter?
>
> I'm not an expert, but, in the simple world I want, I want it to not matter in the least. For the people writing most code in the compiler, I want clear simple rules for them to follow.

Good luck. As I understand it, GCC's old manual memory manager was dumped in favour of garbage collection because it was too error-prone. I suspect there will be similar correctness issues with threads -- they're hard to get right, and GCC is big, complex, and worked on by lots of people.

> For example, google uses mapreduce (http://labs.google.com/papers/mapreduce.html) as a primitive, and there are a few experts that manage that code, and everyone else just mindlessly uses it. The rules are explained to them, and they just follow the rules and it just works. No locking, no atomic, no volatile, no clever lock free code, no algorithmic changes (other than decomposing into isolated composable parts). I'd like something similar for us.

I'd be really impressed if someone could work out such a system that works for a program such as GCC.
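To be concrete, the kind of primitive I mean looks something like this -- a toy sketch using pthreads (nothing here is Google's or GCC's actual code):

  #include <pthread.h>
  #include <stddef.h>

  typedef void (*map_fn)(void *elem);

  struct span { char *base; size_t n, size; map_fn fn; };

  static void *worker(void *arg)
  {
     struct span *s = arg;
     for (size_t i = 0; i < s->n; i++)
        s->fn(s->base + i * s->size);
     return NULL;
  }

  /* Apply fn to every element of an array, in parallel. */
  void parallel_map(void *base, size_t n, size_t size, map_fn fn, int nthreads)
  {
     enum { MAXT = 16 };
     pthread_t tid[MAXT];
     struct span sp[MAXT];
     int t, started = 0;
     if (nthreads > MAXT) nthreads = MAXT;
     size_t chunk = (n + nthreads - 1) / nthreads;
     for (t = 0; t < nthreads; t++) {
        size_t lo = t * chunk;
        if (lo >= n) break;
        sp[t].base = (char *)base + lo * size;
        sp[t].n    = (lo + chunk > n) ? n - lo : chunk;
        sp[t].size = size;
        sp[t].fn   = fn;
        pthread_create(&tid[t], NULL, worker, &sp[t]);
        started++;
     }
     for (t = 0; t < started; t++)
        pthread_join(tid[t], NULL);
  }

All the threading lives behind parallel_map(); callers obey one rule -- fn may only touch its own element -- and never see a lock.

Nick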
Re: Level to do such a modification...
On Wed, 24 Jan 2007, [GB2312] ÎâêØ wrote:
> I am working on gcc 4.0.0. I want to use gcc to intercept each call to read, and taint the data read in. For example, transform
>
>   read(fd, buf, size)
>
> into
>
>   read(fd, buf, size);
>   if (is_socket(fd))
>      taint(buf, size);
>
> So, what is the best suitable level to do this modification in gcc? My own thought is in finish_function, before calling c_genericize, as I discovered that in the C front-end, there's no GENERIC tree... In c_genericize, it directly calls gimplify_function_tree.

Are you sure you want to do this in GCC? You might find it easier to use a dynamic binary instrumentation framework such as Valgrind or Pin to do this kind of thing.
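For what it's worth, on a glibc system you can get this kind of interception without touching the compiler at all, using an LD_PRELOAD shim. A rough, untested sketch, where taint() is a stand-in for your own bookkeeping:

  #define _GNU_SOURCE
  #include <dlfcn.h>
  #include <sys/stat.h>
  #include <unistd.h>

  static int is_socket(int fd)
  {
     struct stat st;
     return fstat(fd, &st) == 0 && S_ISSOCK(st.st_mode);
  }

  static void taint(void *buf, size_t size)
  {
     /* record [buf, buf+size) as tainted */
  }

  ssize_t read(int fd, void *buf, size_t count)
  {
     ssize_t (*real_read)(int, void *, size_t) =
        (ssize_t (*)(int, void *, size_t))dlsym(RTLD_NEXT, "read");
     ssize_t n = real_read(fd, buf, count);
     if (n > 0 && is_socket(fd))
        taint(buf, (size_t)n);
     return n;
  }

Build with "gcc -shared -fPIC shim.c -o shim.so -ldl" and run the target program with LD_PRELOAD=./shim.so. It only catches calls that go through libc, but that may be enough.

Nick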
Re: Level to do such a modification...
On Wed, 24 Jan 2007, [GB2312] ÎâêØ wrote:
> I know valgrind, it is an emulator, but we are restricted not to use an emulator. :-(

Well, for some definition of "emulator".

Nick
Re: We're out of tree codes; now what?
On Mon, 19 Mar 2007, Doug Gregor wrote:
>> It's going to have a big performance impact. To extract a 9-bit value,
>> the compiler will need to do a lot of masking every time it accesses
>> the TREE_CODE.
>
> So, about 16% slower with --enable-checking, 5% slower with --disable-checking. Subcodes might still be the way to go, but I'm feeling less bad about the 9-bit tree code option.

As an interested outsider: GCC's compile-time speed has been gradually decreasing for a while now. It seems to be acknowledged as an undesirable thing, but not much has happened to change it. AIUI, this is largely because it's very difficult. Nonetheless, seeing a 5% slow-down caused by fixing a data structure design bogon is disappointing.
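For those wondering where the masking comes from, the shape of the problem is something like this (the layout below is illustrative, not GCC's actual tree_base): an 8-bit code fills a byte exactly, so reading it is a single byte load, while a 9-bit bitfield straddles a byte boundary and every access needs a wider load plus masking:

  struct node8 { unsigned code : 8; unsigned flags : 24; };
  struct node9 { unsigned code : 9; unsigned flags : 23; };

  /* typically compiles to a single byte load (movzbl) */
  unsigned get_code8(const struct node8 *n) { return n->code; }

  /* typically compiles to a wider load plus an and/shift */
  unsigned get_code9(const struct node9 *n) { return n->code; }

Nick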
Re: We're out of tree codes; now what?
On Mon, 19 Mar 2007, Doug Gregor wrote:
> But what is the solution? We can complain about performance all we want (and we all love to do this), but without a plan to fix it we're just wasting effort. Shall we reject every patch that causes a slowdown? Hold up releases if they are slower than their predecessors? Stop work on extensions, optimizations, and bug fixes until we get our compile-time performance back to some predetermined level?
>
> We have hit a hard limit in the design of GCC. We need to either use more memory, use more compilation time, re-architect non-trivial portions of GCC, remove functionality, or come up with something very, very clever. Pick one, but if you pick the last one, you have to specify what "something very, very clever" is, because we seem to be running short on ideas.

GCC is a very ambitious compiler:
- it supports a lot of platforms
- it supports a lot of languages

However, most users do not use most of those combinations. The problem is that supporting all these combinations hurts the specific combinations. For example, I almost always compile C, and usually on x86 or AMD64. So I don't care if the C++ compiler has improved, or a new language front-end is added, or whatever. (Other users will have different priorities.)

As for what is best to do, I don't know. But I do know that complexity is bad, and that GCC is very complex. You are absolutely right about there being hard limits. There are trade-offs required. Whether the current and ongoing trade-offs are the right ones is an open question.

Nick
GCC priorities [Was Re: We're out of tree codes; now what?]
On Tue, 20 Mar 2007, Nicholas Nethercote wrote:
> GCC is a very ambitious compiler:
> - it supports a lot of platforms
> - it supports a lot of languages
>
> However, most users do not use most of those combinations. The problem is that supporting all these combinations hurts the specific combinations.

Nobody responded to this particular point, which surprised me. I looked up the GCC mission statement (http://gcc.gnu.org/gccmission.html). It has the following "Design and Development Goals":

* New languages
* New optimizations
* New targets
* Improved runtime libraries
* Faster debug cycle
* Various other infrastructure improvements

I think they're terrible:

- "New languages" -- why? Just because you can? In theory, adding a new language can be free, but in practice it never is.

- "New optimizations" -- I assume this means "optimization passes". Users don't care about optimization passes. Users care about performance. Optimizations happen to be the main way to achieve that, but substituting the mechanism for the end goal in a mission statement sends the wrong message.

- "New targets" -- this one is better, but some qualification would help. I understand that the goal is for GCC to be the universally available compiler, but it could be clarified that it's targets that are desired by users.

- They're vague -- "improved runtime libraries"? "various other infrastructure improvements"? These phrases are meaningless. Why not just write "a really awesome compiler"?

- There's no notion that you can't have everything. For something to be a high priority, you have to make something else a lower priority. This list is just "faster! better! more!" In particular, to return to the original point of this thread, the "faster debug cycle" has been suffering horribly (due to compile-time performance regressions).

- They haven't been updated since 1999-04-22.

Here are some suggestions for more suitable priorities. They're all characteristics, rather than features. They're far from perfect, but I think they're better than the goals above. I haven't put them in any particular order.

- Correctness w.r.t. language definitions (ie. accept correct code, reject incorrect code)
- Correctness of the compiler (ie. no compiler crashes)
- Correctness of generated code (ie. compiled code should do the right thing)
- Performance of the compiler (time and memory)
- Performance of generated code (time and memory)
- Performance of building the compiler (time and memory)
- Support of existing language extensions
- Addition of new language extensions
- Support of existing languages: C, C++, Objective C, Fortran, Ada, Java (and any others I've missed)
- Support of new languages
- Support of existing platforms (a.k.a. portability)
- Support of new platforms (a.k.a. portability)
- Design and code complexity
- Maintainability
- Debugging support (eg. generating correct debugging info)
- Profiling support
- Quality of error messages (eg. uninitialized variables)
- Support for other extensions (eg. mudflap)

You can split these up or qualify them more, eg. for "performance of the compiler" you might distinguish between -O0 and -O2. The key idea is that you can't have all of these. For example, supporting many languages and platforms increases complexity, adds more code (which slows down build times), and can hurt performance (as the tree codes example has shown). It also takes resources that then are not used for improving other aspects. Another example: there was a suggested SoC project yesterday for an incremental C++ parser.
That could speed up compile-time performance in some cases, but at the cost of increased design and code complexity. This idea is simple, and I'm sure many of you understand it individually. But it appears to me, as an observer of GCC development, that GCC developers as a group don't understand this.

On Mon, 19 Mar 2007, Doug Gregor wrote:
> We have hit a hard limit in the design of GCC. We need to either use more memory, use more compilation time, re-architect non-trivial portions of GCC, remove functionality, or come up with something very, very clever. Pick one, but if you pick the last one, you have to specify what "something very, very clever" is, because we seem to be running short on ideas.

Doug was talking about the tree code issue, but this paragraph sums up the whole situation perfectly. Sometimes it's a good idea to stand back and look at the bigger picture, rather than just running on the "gotta fix another bug, gotta add another feature" treadmill.

Nick
Re: GCC priorities [Was Re: We're out of tree codes; now what?]
On Thu, 21 Mar 2007, Ian Lance Taylor wrote:
> I think you may misunderstand the mission statement. The mission statement is not a technical roadmap. It's a statement of general goals. If the community has a serious disagreement, the mission statement can sometimes help clarify matters. [...] The problem as I see it is that different gcc developers have different goals. And it is a natural tendency for developers to care more about their specific goals than any developer cares about the general goals. The effect is that specific goals get implemented at the expense of general goals.

Exactly. I'm viewing the mission statement as the moral equivalent of a constitution -- the highest guidelines that you fall back on when everything else fails. Your first paragraph above indicates that you view it similarly. But it's currently so vague that I don't imagine it's much use... it's like a legal constitution that says "be nice to everybody".

Many (most?) open source projects have one or a few benevolent dictators that control things. GCC doesn't (the steering committee is too far removed from day-to-day decisions), and I think its development suffers for it at times. Maybe I'm wrong, but to return to the original topic, I don't expect to see compile-time performance improve significantly in any future release.

Nick
Re: GCC priorities [Was Re: We're out of tree codes; now what?]
On Wed, 21 Mar 2007, Paul Brook wrote:
> The problem is that I don't think writing a detailed "mission statement" is actually going to help anything. It's either going to be gcc contributors writing down what they're doing anyway, or something invented by the SC or FSF. In the latter case nothing's going to change, because neither the SC nor the FSF have any practical means of compelling contributors to work on a particular feature.

It's been said before that Mark (the GCC release manager) has no real power to make anything actually happen. All he can do is delay the release and hope things get better. Then it will continue to be interesting, if painful, to watch.

Nick
Re: We're out of tree codes; now what?
> Several alternatives were tried -- the sub-code approach, the 9-bit approach, the 16-bit approach.

It might be interesting to try using Cachegrind or Callgrind to better understand why the performance changes occurred.

Nick
Re: memory checkers and gcc support
On Mon, 14 Mar 2005, J. Hart wrote:
> Valgrind is an excellent product as far as it goes, but is x86 only, and apparently lacks the sort of stack checking that Purify and Checker have.

Valgrind is currently being officially ported to several other platforms. We hope to have AMD64/Linux and PPC32/Linux ports complete by the middle of this year. It still won't be able to do stack checking, though.

N
Re: gcc cache misses [was: Re: OT: How is memory latency important on AMD64 box while compiling large C/C++ sources]
On Tue, 12 Apr 2005, Karel Gardas wrote:
>> cachegrind can also be used to estimate the number (though, not sure how accurate it is, possibly not very). I use Shark to actually get the real number.
>
> Perhaps it's possible that cachegrind is wrong or cache misses differ from platform to platform, but I would say that I get very good numbers for gcc running on the x86 platform:

In my experience Cachegrind can give pretty good numbers for L1 misses, especially D1, but the L2 misses tend to vary more. I saw this with comparisons against the real numbers reported by the performance counters on an Athlon. However, Cachegrind certainly makes a number of approximations (see section 3.3.7 of http://www.valgrind.org/docs/phd2004.pdf) and so you shouldn't trust it too much. It should give reasonable numbers though.

N
Unnecessary sign- and zero-extensions in GCC?
Hi,

I've been looking at GCC's use of sign-extensions when dealing with integers smaller than a machine word size. It looks like there is room for improvement. Consider this C function:

  short g(short x)
  {
     short i;
     for (i = 0; i < 10; i++) {
        x += i;
     }
     return x;
  }

On x86, using GCC 4.0.0 20050130 with -O2, I get this code:

  g:
          pushl   %ebp
          xorl    %edx, %edx
          movl    %esp, %ebp
          movswl  8(%ebp), %ecx
          .p2align 4,,15
  .L2:
          leal    (%ecx,%edx), %eax
          movswl  %ax, %ecx       # 1
          leal    1(%edx), %eax
          movzwl  %ax, %eax       # 2
          cmpw    $10, %ax
          movswl  %ax, %edx       # 3
          jne     .L2
          popl    %ebp
          movl    %ecx, %eax
          ret
          .size   g, .-g

The three extensions (#1, #2, #3) here are unnecessarily conservative. This would be better:

  g:
          pushl   %ebp
          xorl    %edx, %edx
          movl    %esp, %ebp
          movswl  8(%ebp), %ecx
          .p2align 4,,15
  .L2:
          leal    (%ecx,%edx), %ecx       # x += i
          leal    1(%edx), %edx           # i++
          cmpw    $10, %dx                # i < 10 ?
          jne     .L2
          popl    %ebp
          movswl  %cx, %eax
          ret

GCC's approach seems to be *eager*, in that sign-extensions are done immediately after sub-word operations. This ensures that the high bits of a register holding a sub-word value are always valid. An alternative is to allow the high bits of registers holding sub-word values to be "junk", and do sign-extensions *lazily*, only before operations in which any "junk" high bits could adversely affect the result. For example, if you do a right shift on a value with "junk" high bits you have to sign/zero-extend it first, because high bits in the operands can affect low bits in the result. The same is true of division. In contrast, an addition of two 16-bit values with "junk" high bits is ok if the result is also a 16-bit value. The same is true of subtraction, multiplication and logical ops. The reason is that for these operations, the low 16 bits of the result do not depend on the high 16 bits of the operands.

Although you can construct examples where the eager approach gives better code, in general I think the lazy approach results in better code, such as in the above example. Is there a particular reason why GCC uses the eager approach? Maybe it has to do with the form of GCC's intermediate representation? Or are there some subtleties to do with the lazy approach that I have overlooked? Or maybe I've misunderstood GCC's approach. Any comments are appreciated.
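In fact the safe/unsafe split is easy to check directly. Here's a small C program, simulating 16-bit values kept in 32-bit registers, that demonstrates it (the "unsafe" inequalities hold for these particular junk values):

  #include <assert.h>
  #include <stdint.h>

  int main(void)
  {
     uint32_t a = 0xDEAD1234u, b = 0xBEEF5678u;   /* junk in the high halves */
     uint32_t ca = a & 0xFFFFu, cb = b & 0xFFFFu; /* cleaned copies */

     /* Safe: the low 16 bits don't depend on the junk. */
     assert(((a + b) & 0xFFFFu) == ((ca + cb) & 0xFFFFu));
     assert(((a - b) & 0xFFFFu) == ((ca - cb) & 0xFFFFu));
     assert(((a * b) & 0xFFFFu) == ((ca * cb) & 0xFFFFu));
     assert(((a & b) & 0xFFFFu) == ((ca & cb) & 0xFFFFu));

     /* Unsafe: junk high bits leak into the low 16 bits. */
     assert(((a >> 4) & 0xFFFFu) != ((ca >> 4) & 0xFFFFu));
     assert(((a / b)  & 0xFFFFu) != ((ca / cb) & 0xFFFFu));
     return 0;
  }

Thanks for your help.

Nick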
Re: Unnecessary sign- and zero-extensions in GCC?
On Mon, 18 Apr 2005, Steven Bosscher wrote:
>> I've been looking at GCC's use of sign-extensions when dealing with integers smaller than a machine word size. It looks like there is room for improvement.
>
> Is your problem the same as the one described on one of the Wiki pages, "http://gcc.gnu.org/wiki/Exploiting Dual Mode Operation"?

I think so, yes.

Nick
Re: Do C++ signed types have modulo semantics?
On Tue, 28 Jun 2005, Joe Buck wrote:
> There is no such assumption. Rather, we make no assumption about what happens on overflow. Then, for the case where overflow does not occur, we get fast code. For many cases where overflow occurs with a 32-bit int, our optimized program behaves the same as if we had a wider int. In fact, the program will work as if we had 33-bit ints. Far from producing a useless result, the optimized program has consistent behavior over a broader range. To see this, consider what the program does with a=MAX_INT, b=MAX_INT-1. My optimized version always calls blah(b+1), which is what a 33-bit int machine would do. It does not trap.

This point about 33-bit machines is interesting because it raises an optimisation scenario that hasn't been mentioned so far. Consider doing 32-bit integer arithmetic on 64-bit machines which only support 64-bit arithmetic instructions. On such machines you have to use sign-extensions or zero-extensions after 64-bit operations to ensure wrap-around semantics (unless you can prove that the operation will not overflow the bottom 32 bits, or that the value will not be used in a way that exposes the fact you're using 64-bit arithmetic).

But -- if I have understood correctly -- if the 32-bit values are signed integers, a C compiler for such a machine could legitimately omit the sign-extension. Whereas for unsigned 32-bit values the C standard implies that you must zero-extend afterwards. I hadn't realised that.
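In code, the difference looks something like this -- a sketch assuming a 64-bit machine with only 64-bit add instructions, where 32-bit operands are kept in 64-bit variables:

  #include <stdint.h>

  /* Unsigned: the standard requires wrap-around, so the compiler must
     zero-extend (ie. mask to the low 32 bits) after the 64-bit add. */
  uint64_t add_u32(uint64_t a, uint64_t b)
  {
     return (uint32_t)(a + b);
  }

  /* Signed: overflow is undefined, so the compiler may legitimately
     return the full 64-bit sum and omit the sign-extension. */
  int64_t add_s32(int64_t a, int64_t b)
  {
     return a + b;
  }

This has been an enlightening thread :)

Nick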
Re: Do C++ signed types have modulo semantics?
On Wed, 29 Jun 2005, Daniel Berlin wrote:
> So i would advise anyone arguing against turning on -fwrapv simply because it doesn't seem to hurt us at O2. And i'll again point out that the exact opposite is the default in every other compiler i'm aware of.

Sorry, I couldn't parse those sentences... are you saying the -fwrapv behaviour (ie. wrap-on-signed-integer-overflow) is the default or not the default in these other compilers?

> XLC at O2 has qstrict_induction on by default (the equivalent), and warns the user when it sees a loop where it's making the assumption[1]

Which assumption?

> The XLC people told me since they turned this on in 1998, they have had one real piece of code where it actually mattered, and that was a char induction variable. ICC does the same, though i don't think it bothers to warn. Open64 does the same, but no warning. Not sure about Sun CC, but i'd be very surprised if they did it. Personally, i only care about wrapping for induction variables. If you guys want to leave regular variables to do whatever, fine.

Are you saying you don't want induction variables to have to wrap, but you don't care about non-induction variables? Sorry if I'm being dim... I think it's excellent you're discussing what other compilers do, I just can't understand what you've said as expressed :)

Nick
Re: signed is undefined and has been since 1992 (in GCC)
On Sat, 2 Jul 2005, Florian Weimer wrote:
>> I am puzzled, why would *ANYONE* who knows C use int rather than unsigned if they want wrap around semantics?
>
> Both OpenSSL and Apache programmers did this, in carefully reviewed code which was written in response to a security report. They simply didn't know that there is a potential problem. The reason for this gap in knowledge isn't quite clear to me. I've done a lot of C programming in the last three years, and for my day job I'm working on a C compiler (albeit in parts that are not very C specific), and I didn't know that signed overflow is undefined.

Why not? I guess I never heard otherwise and I just assumed it would wrap due to two's complement arithmetic. I don't think I've ever written a serious C program that required wrap-around on overflow, though.
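For anyone who hasn't seen the trap in action, it usually looks something like this -- a sketch of the kind of check involved, not the actual OpenSSL/Apache code:

  #include <limits.h>

  /* Looks like a portable overflow test (for b > 0), but signed
     overflow is undefined, so a compiler may assume a + b never
     wraps and fold this to 0. */
  int would_overflow(int a, int b)
  {
     return a + b < a;
  }

  /* A well-defined version checks before overflowing. */
  int would_overflow_ok(int a, int b)
  {
     return a > INT_MAX - b;
  }

Nick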
Where does the C standard describe overflow of signed integers?
Hi,

There was recently a very long thread about the overflow behaviour of signed integers in C. Apparently this is undefined according to the C standard. I searched the standard on this matter, and while I did find some paragraphs that described how unsigned integers must wrap around upon overflow, I couldn't find anything explicit about signed integers. Can someone point me to the relevant part(s) of the standard?

Also, does anyone know what the required behaviour for Fortran integers is on overflow?

(I realise this isn't exactly on-topic for this list, but I thought it reasonable to ask since this topic was discussed so enthusiastically recently :)

Thanks very much.

Nick
RE: Where does the C standard describe overflow of signed integers?
On Mon, 11 Jul 2005, Dave Korn wrote:
>> There was recently a very long thread about the overflow behaviour of signed integers in C. Apparently this is undefined according to the C standard. I searched the standard on this matter, and while I did find some paragraphs that described how unsigned integers must wrap around upon overflow, I couldn't find anything explicit about signed integers.

Dave, Nathan and Paul: thanks for the quick replies.

The difference between signed and unsigned integer overflow is a little unclearly expressed, I think. 3.4.3/3 says:

  "EXAMPLE An example of undefined behavior is the behavior on integer overflow"

6.5/5 says:

  "If an _exceptional condition_ occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined."

These two paragraphs would seem to indicate that overflow is undefined for both signed and unsigned integers. But then 6.2.5 para 9, sentence 2 says:

  "A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type."

Which requires that unsigned ints must wrap on overflow. (Actually, I guess it defines "overflow" such that unsigned ints never "overflow", so 3.4.3/3 and 6.5/5 don't apply!) But I think the paragraphs together are good enough to communicate that: unsigned ints must wrap on overflow, signed ints need not.
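So, in two lines of C:

  /* 6.2.5p9: well-defined -- at UINT_MAX this wraps to 0. */
  unsigned u_next(unsigned u) { return u + 1u; }

  /* 6.5/5: undefined -- at INT_MAX this may do anything. */
  int s_next(int i) { return i + 1; }

Thanks again for your help.

N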
Re: Some notes on the Wiki
On Mon, 11 Jul 2005, Daniel Berlin wrote:
>> Also, a web-browser is much slower than an info-browser, especially when doing searches.
>
> You must be close to the only user i've met who uses the info browser :)

I use it. Info pages suck in many ways, but they're fast to load from an xterm, fast to search, and even faster when you know where they are in the docs (eg. I find myself looking at the GCC C extensions quite often, and I can get there very quickly).

Nick
Some tests in gcc.c-torture rely on undefined behaviour?
Hi,

I've been looking at the gcc.c-torture tests, and it seems some of them rely on undefined behaviour. For example, 920612-1.c looks like this:

  f(j)int j;{return++j>0;}
  main(){if(f((~0U)>>1))abort();exit(0);}

AIUI, this passes the largest possible positive integer to f(), which then increments it, causing signed overflow, the result of which is undefined. The test passes if signed overflow wraps. 930529-1.c is similar -- again the maximum positive integer is incremented.

20020508-2.c has the following code (abridged):

  #ifndef CHAR_BIT
  #define CHAR_BIT 8
  #endif

  #define ROR(a,b) (((a) >> (b)) | ((a) << ((sizeof (a) * CHAR_BIT) - (b))))

  #define INT_VALUE ((int)0x1234)
  #define SHIFT1 4

  int i = INT_VALUE;
  int shift1 = SHIFT1;
  if (ROR (i, shift1) != ROR (INT_VALUE, SHIFT1))
    abort ();

Similarly, the left-shifting in ROR causes signed integer overflow (I think) and so ROR relies on undefined behaviour. 20020508-3.c is similar.

My question is: what exactly is gcc.c-torture testing? It seems to be testing more than just C standard compliance, but also certain undefined-but-desired behaviours, such as wrap-on-signed-overflow (at least in some cases). Is that right? If so, other C compilers could be correct with respect to the C standard but not pass all the tests. I couldn't find any kind of README or description about the tests that covered this point, so I'd appreciate any explanations.

Thanks.

Nick
Re: Some tests in gcc.c-torture rely on undefined behaviour?
On Tue, 12 Jul 2005, Joseph S. Myers wrote:
>> My question is: what exactly is gcc.c-torture testing?
>
> That GNU C code compiles or executes as expected for GNU C.

Is there a definition of GNU C? implement-c.texi and extend.texi have some information about this; are there any other sources? Thanks.

Nick
RE: GNU Fortran Compiler
On Fri, 5 Aug 2005, Dave Korn wrote: Hallo, what must I do for becomming the GNU Fortran Compiler? Sincerely, Hans. To become the compiler, you must _think_ like the compiler. It's an easy mistake to make for Germans speaking English, because the German verb "bekommen" means "to get, obtain, receive"... Nick
Re: More NEWS on GCC?
On Tue, 30 Aug 2005, Rafael Ávila de Espíndola wrote:
> One problem is that compiler technology generally requires more background than OS:
>
>   1) the new O(1) scheduler
>   2) the new PCI interface
>
> or
>
>   1) the new SSA based intermediate representation
>   2) the new DFA based pipeline hazard recognizer

I don't think the compiler technology is much more complicated to describe than the OS technology. All four of those concepts would require at least a little background explanation to non-OS and non-compiler people.

Timothy has a good point. GCC is arguably more important than the Linux kernel, yet it gets very little press and recognition. Note also that everybody loves Linux (more or less) but people mostly bitch about GCC if they think about it at all. Perhaps these facts are connected?

Nick
Re: Adding debug symbols causes segmentation faults with GCC-4.1 and MIPS...
On Tue, 13 Sep 2005, Steven J. Hill wrote:
>> You might want to first make sure that your program has no memory access errors. You could try building it for x86 and debugging with valgrind, to see if that catches anything.
>
> A good idea. I built it for x86. Unfortunately, from the output it appears that 'clone' is not supported, or rather not very well. Here is a link to the source:
>
> http://www.uclibc.org/cgi-bin/viewcvs.cgi/*checkout*/trunk/uClibc/test/unistd/clone.c?content-type=text%2Fplain&rev=10696
>
> The only interesting output is:
>
>   ==4032== Syscall param clone(parent_tidptr) contains uninitialised byte(s)
>   ==4032==    at 0x1BA108AC: clone (clone.S:100)
>   ==4032==    by 0x1B96C412: __libc_start_main (libc-start.c:250)
>   ==4032==    by 0x80484A0: ??? (start.S:119)
>   ==4032==
>   ==4032== Syscall param clone(tlsinfo) contains uninitialised byte(s)
>   ==4032==    at 0x1BA108AC: clone (clone.S:100)
>   ==4032==    by 0x1B96C412: __libc_start_main (libc-start.c:250)
>   ==4032==    by 0x80484A0: ??? (start.S:119)
>   ==4032==
>   ==4032== Syscall param clone(child_tidptr) contains uninitialised byte(s)
>   ==4032==    at 0x1BA108AC: clone (clone.S:100)
>   ==4032==    by 0x1B96C412: __libc_start_main (libc-start.c:250)
>   ==4032==    by 0x80484A0: ??? (start.S:119)
>   ==4032==
>   ==4032== Unsupported clone() flags: 0x0
>   ==4032==
>   ==4032== The only supported clone() uses are:
>   ==4032==  - via a threads library (LinuxThreads or NPTL)
>   ==4032==  - via the implementation of fork or vfork
>   ==4032==  - for the Quadrics Elan3 user-space driver
>
> I don't feel like I can trust the output since valgrind admitted it does not do clone very well.

It handles the most common invocations of clone, but this program passes in 0 for 'flags', which seems odd.

Nick
Re: Wishlist: GCC option for explicit booleans
On Sat, 1 Oct 2005, [EMAIL PROTECTED] wrote:
> C++ would be a better language if the boolean type did not implicitly convert from int. For example, many novice programmers make this mistake:
>
>   if (i = j) dosomething();   // Should be i == j
>
> If conversion to boolean had to be explicit, this would all be solved. It would mean all the old code with expressions like "while (p) ..." would need to be changed to "while (p != NULL) ...". But I think the change would be well justified. What about a GCC option to turn off implicit conversion to boolean?

[~] more a.cpp
int main(void)
{
    int i = 0, j = 0;
    if (i = j)
        return 0;
    else
        return 1;
}
[~] g++ -Wall a.cpp
a.cpp: In function `int main()':
a.cpp:4: warning: suggest parentheses around assignment used as truth value

Nick
Re: Abnormal behavior of malloc in gcc-3.2.2
On Mon, 21 Nov 2005, Giovanni Bajo wrote:
>> I didn't get your point. I am allocating space only for 400 integers, so as soon as the loop crosses the value of 400, shouldn't it give a segmentation violation?
>
> No. For that to happen, you need some memory checker. GCC has -fmudflap, try with that. Recent versions of glibc also have their internal memory buffer checker; it probably triggers the segmentation fault when you free the buffer which you have overflown.

Valgrind will find this error too (www.valgrind.org).
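To make the point concrete, the situation is presumably something like this (a sketch, not the original poster's code):

  #include <stdlib.h>

  int main(void)
  {
     int *a = malloc(400 * sizeof(int));
     for (int i = 0; i < 500; i++)
        a[i] = i;     /* out of bounds from i == 400 onwards */
     free(a);         /* glibc's heap checks may abort here, not earlier */
     return 0;
  }

The writes past the end are undefined behaviour, but they usually land in the allocator's slack space after the block, so no SIGSEGV is raised -- which is exactly why a memory checker is needed.

Nick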
Re: Performance regression testing?
On Mon, 28 Nov 2005, Joe Buck wrote:
>> On Mon, 28 Nov 2005, Mark Mitchell wrote:
>> We're collectively putting a lot of energy into performance improvements in GCC. Sometimes, a performance gain from one patch gets undone by another patch -- which is itself often doing something else beneficial. People have mentioned to me that we require people to run regression tests for correctness, but that we don't really have anything equivalent for performance.
>
> It would be possible to detect performance regressions after the fact, but soon enough to look at reverting patches. For example, given multiple machines doing SPEC benchmark runs every night, the alarm could be raised if a significant performance regression is detected. To guard against noise from machine hiccups, two different machines would have to report a regression to raise the alarm. But the big problem is the non-freeness of SPEC; ideally there would be a benchmark that
> ... everyone can download and run
> ... is reasonably fast
> ... is non-trivial

Yes! This would be very useful for other free software projects. Another possible requirement is that the tests are not too large; it would be nice to include them in the source code of one's project for easier integration.

>> As a strawman, perhaps we could add a small integer program (bzip?) and a small floating-point program to the testsuite, and have DejaGNU print out the number of iterations of each that run in 10 seconds.

Would that really catch much? I've been thinking about this kind of thing recently for Valgrind. I was thinking that a combination of real programs and artificial microbenchmarks would be good. The microbenchmarks would be like the GCC (correctness) torture tests -- a collection of programs, added to over time, each one demonstrating a prior performance bug. You could start it off with a few tests containing things like key inner loops extracted from programs such as bzip2.

Measuring the programs and categorizing regressions is tricky. It's possible that the artificial tests would be small enough that any regression would be obvious (eg. failing to remove that extra instruction would cause a 10% slowdown). And CSiBE-style graphing is very effective for seeing trends.
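As a sketch of the iterations-in-a-fixed-budget idea -- kernel() here is just a stand-in for an extracted inner loop:

  #include <stdio.h>
  #include <time.h>

  static void kernel(void)             /* stand-in for, eg, a bzip2 inner loop */
  {
     volatile int sink = 0;
     for (int i = 0; i < 1000000; i++)
        sink += i;
  }

  int main(void)
  {
     const double budget = 10.0;       /* seconds */
     clock_t start = clock();
     long iters = 0;
     while ((double)(clock() - start) / CLOCKS_PER_SEC < budget) {
        kernel();
        iters++;
     }
     printf("%ld iterations in %.0f seconds\n", iters, budget);
     return 0;
  }

Nick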
Re: Echte Lokaliserung der Programmbausprache/ Real Localisation of Programming Language
On Mon, 6 Oct 2008, Kai Henningsen wrote:
>> You're not the first person to come up with this idea, and you probably won't be the last, but it's a misbegotten idea, and there's
>
> In fact, I believe it came up around the time when COBOL was invented. And you'll notice that it didn't get implemented back then, even though people thought it wouldn't be all that hard to do.
>
>> a very good reason why it hasn't been done before, and that's not
>
> Actually, that's not true. In my Apple ][+ days, I've seen it done with BASIC. For some reason, it never amounted to more than a toy.

I'm sure someone somewhere is doing it to a programming language right now. Poor language.

Early versions of AppleScript -- a "naturalistic" language with lots of keywords -- supported a french "dialect" and even Japanese. See page 20 of http://www.cs.utexas.edu/~wcook/Drafts/2006/ashopl.pdf

AIUI, the foreign language support was dropped at some point.

Nick
Re: change to gcc from lcc
On Tue, 18 Nov 2008, H.J. Lu wrote:
>> I used malloc to create my arrays instead of creating them on the stack. My program is working now but it is very slow. I use two-dimensional arrays. The way I access element (i,j) is:
>>
>>   array_name[i*row_length+j]
>>
>> The server that I use has 16GB RAM. The ulimit -a command gives the following output:
>>
>>   time(seconds)   unlimited
>>   file(blocks)    unlimited
>>   data(kbytes)    unlimited
>>   stack(kbytes)   8192
>
> That limits stack to 8MB. Please change it to 1GB.

Why?

Nick
Re: Questions about another usage of GCOV
On Sat, 17 Jun 2006, Marc Alff wrote:
> 2) Licensing
>
> For technical reasons, I can not use the gcov library itself, and plan to implement code to read/write the files the GCOV program needs.

Then why do you need to use the gcov file formats?

Nick
Re: [RFC] Program Bounds Checking
On Thu, 28 Sep 2006, Tzi-cker Chiueh wrote:
> We have considered the bound instruction in the CASH project. But we found that the bound instruction is slower than the six normal instructions it is meant to replace for range checking. For example, the bound instruction on a 1.1 GHz PIII machine requires 7-8 clock cycles while the 6 equivalent instructions require 6-7 clock cycles. We have not tested it on newer processors, though.

I would guess it would be as slow or worse. 'bound' is an extremely rarely used instruction, and so will not be optimised for at all.

Nick
Re: a mudflap experiment on freebsd
On Wed, 23 Feb 2005, Doug Graham wrote:
>> Regarding memory consumption, perhaps libmudflap's default backtrace parameter should be set to zero, for both speed and space reasons.
>
> If it's storing all the backtraces that is burning up all the memory, another approach might be to keep a separate hash table for storing backtraces, then hash new backtraces and see if the same backtrace already exists from a previous call to malloc. If so, there's no need to allocate a new one. That's essentially what the hprof Java profiler does, and it works pretty well. The average application might have many thousands of mallocs, but only a few distinct backtraces.
>
> Also, saving program counters instead of symbolic names in the backtrace would probably save a lot of memory, and might also make hashing the backtrace cheaper. Or the strings containing the symbolic names could be interned rather than allocating a new string for every symbolic name in every backtrace.

That's pretty much what Valgrind does -- each backtrace is just N program counters, and there's a symbol table full of strings, more or less. When a backtrace is printed, Valgrind looks up the symbol table, which is not particularly fast, but printing error messages is relatively rare.

With respect to backtrace sizes, Valgrind's default is 4 but almost everybody immediately sets that to something much higher. Many people set it to something like 40 so as to get the biggest possible trace. So I think that trying to get around the problem of memory consumption by making the default backtrace size small won't work.
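The deduplication part is simple enough -- something like this (all names hypothetical; this is neither libmudflap's nor Valgrind's actual code):

  #include <stdlib.h>
  #include <string.h>

  #define BT_DEPTH   4
  #define TABLE_SIZE 4096

  typedef struct BT {
     void      *pcs[BT_DEPTH];  /* program counters, no strings */
     struct BT *next;           /* hash-chain link */
  } BT;

  static BT *table[TABLE_SIZE];

  static unsigned hash_bt(void *const pcs[BT_DEPTH])
  {
     unsigned h = 2166136261u;  /* FNV-1a over the PC bytes */
     const unsigned char *p = (const unsigned char *)pcs;
     for (size_t i = 0; i < sizeof(void *) * BT_DEPTH; i++)
        h = (h ^ p[i]) * 16777619u;
     return h % TABLE_SIZE;
  }

  /* Return the canonical copy of this backtrace, creating it if new. */
  BT *intern_bt(void *const pcs[BT_DEPTH])
  {
     unsigned h = hash_bt(pcs);
     for (BT *bt = table[h]; bt; bt = bt->next)
        if (memcmp(bt->pcs, pcs, sizeof bt->pcs) == 0)
           return bt;
     BT *bt = malloc(sizeof *bt);
     memcpy(bt->pcs, pcs, sizeof bt->pcs);
     bt->next = table[h];
     table[h] = bt;
     return bt;
  }

The malloc wrapper then just stores the BT* returned by intern_bt(), so a program with millions of allocations but a few hundred allocation sites stores only a few hundred backtraces.

N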
Re: Memory leaks in compiler
On Wed, 16 Jan 2008, Tom Tromey wrote:
> Kaveh> A valgrind suppression only silences the error for valgrind. What if
> Kaveh> someone uses another memory checking tool? Better to fix it for real
> Kaveh> IMHO.
>
> Add suppressions for all of them. Any decent memory checker has to account for the reality that many programs don't bother freeing memory before exit.

Valgrind (Memcheck) generally only complains about unfreed blocks for which there is no pointer in live data, ie. truly leaked ones. But it's not perfect, there are various ways it can be fooled.
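The distinction, concretely: Memcheck reports the second block below as "definitely lost", but the first only as "still reachable", which isn't flagged as an error by default.

  #include <stdlib.h>

  static char *global_p;

  int main(void)
  {
     global_p = malloc(100);   /* still reachable at exit: a pointer survives */
     char *p = malloc(100);    /* definitely lost: the pointer dies at return */
     (void)p;
     return 0;
  }

Nick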
Re: Official GCC git repository
On Thu, 13 Mar 2008, David Woodhouse wrote:
> I could never understand why anyone would use anything but CVS (if that works for them), or git. The VCS-du-jour craze just confuses me.

Version control is complicated, much more so than it first appears. There's a very large design space. Knowing that, it's not surprising that there are so many different VCS systems, embodying lots of different approaches.

Nick
Re: Very Fast: Directly Coded Lexical Analyzer
On Thu, 31 May 2007, Andrew Haley wrote:
> No. Speed is always measured in reciprocal units of time: s^-1. A program that runs in 10 seconds has a speed of 0.1 s^-1. Thus, 200% is (0.1 * 200/100) s^-1 faster, giving a speed of 0.3 s^-1.

Um, 0.1 * 200/100 = 0.2.

Amdahl's Law says:

  speedup = t_old / t_new

Yet people often express it as a percentage, which is confusing, as we've just seen. They write "1.2x faster" as "20% faster" and "3x faster" as "200% faster".

Another performance measure is reduction in execution time:

  reduction = 1 - (t_new / t_old)

That is more natural to express as a percentage. But it's usually not what you want -- it's not what most people mean intuitively when they say "X is faster than Y" by a particular amount.

Growth of things is often expressed as a percentage, eg. saying "my wealth is 200% larger than last year" is fairly clear as meaning "3x larger". It's clear because there's only one possible measurement: size. That's why people use it in finance. But when you're dealing with how fast programs run, because there are two possible measures ("speedup" and "reduction") a percentage is much less clear.

To confuse things further, if you use percentage for speedup, the two measures give very similar results, *for small performance improvements*. For example:

  t_old = 100s
  t_new = 90s
  speed-up  = 1.11 (sometimes strangely written as 11%)
  reduction = 10%

But they diverge as the performance improvements get better. I once saw a talk in which the speaker said he had sped up a system by 75%. He really meant he had reduced the execution time by 75%, which meant it was running 4x faster. But I think everyone assumed a 1.75x speed-up until I asked for a clarification. Since this "talk" was actually a thesis defense, I think my question helped him significantly :)

It's amazing how much academic work doesn't make it clear what they mean by "10% faster". And the responses to this thread show how confusing percentages are. I encourage everyone to use Amdahl's law for speed-up and avoid percentages when talking about program performance. If you say "I made it 3x faster" no-one will ever be confused.
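If it helps, here are the two measures side by side, using the thesis-defense numbers from above:

  #include <stdio.h>

  static double speedup(double t_old, double t_new)   { return t_old / t_new; }
  static double reduction(double t_old, double t_new) { return 1.0 - t_new / t_old; }

  int main(void)
  {
     printf("speedup:   %.2fx\n", speedup(100.0, 25.0));          /* 4.00x */
     printf("reduction: %.0f%%\n", 100.0 * reduction(100.0, 25.0)); /* 75% */
     return 0;
  }

Nick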
Re: RFH: GPLv3
On Fri, 13 Jul 2007, Alexandre Oliva wrote:
>> One way to view it: the license is a feature. Therefore changing the license is changing a feature.
>
> Every release of GCC in the past decade (and then some) was GPLv2+. GPLv3 has always been one of the options. Anyone who had their heads in the sand for the past 18 months when GPLv3 was being publicly discussed and developed, or wasn't at the GCC Summit last year when I mentioned that the FSF would most certainly want to upgrade the license of every project whose copyright it held as soon as GPLv3 was ready, may indeed consider the license upgrade as a surprising new feature. But anyone who wanted to participate was welcome to do so, and GPLv3 shouldn't be a surprise for anyone who did, or even just watched it from a distance.
>
> Now, why should we weaken our defenses for the sake of those who didn't plan for something that could have been so easily forecast 18 months ago, and that was even planned to be finished 4 months ago? Heck, the last-call draft, published one month before the final release, was so close to the final release that non-insider lawyers who were on top of the process managed to emit solid opinions about the final license the day after it was released. It's those who didn't do their homework and didn't plan ahead for this predictable upgrade who should be burdened now, rather than all of us having to accept weaker defenses for our freedoms or facing additional requirements on patches or backports.
>
> It was all GPLv2+, and this means permission for *anyone* to upgrade to GPLv3+. The license upgrade path is the easy path, and that's by design.

I was just suggesting a rationale for choosing a version number.

Nick
Re: RFH: GPLv3
On Thu, 12 Jul 2007, Michael Eager wrote:
>> 3. After GCC 4.2.1 is released, we will renumber the branch to GCC 4.3. What would have been GCC 4.2.2 will instead be GCC 4.3.3, to try to emphasize the GPLv3 switch. The GCC mainline will then be GCC 4.4.
>
> This seems to conflate the meaning of version numbers with something about licensing. The difference between 4.2.1 and 4.2.2 would normally be considered a minor bug fix release; under this scheme of calling it 4.3.3, one would be misled to think that this is a minor bug fix for a non-existent minor release. The version numbering scheme correlating to functional changes is more valuable than any (IMO insubstantial) benefit of identifying the change in license version.

One way to view it: the license is a feature. Therefore changing the license is changing a feature. Therefore what was going to be 4.2.2 should become 4.3.0.

Nick
Re: Rant about ChangeLog entries and commit messages
On Sun, 2 Dec 2007, Andreas Schwab wrote:
>> | 2007-11-30  Jan Hubicka  <[EMAIL PROTECTED]>
>> |
>> |	* ggc-common.c (dump_ggc_loc_statistics): Reset ggc_force_collect
>> |	flag.
>>
>> How could a newcomer guess why the ggc_force_collect flag needs to be reset?
>
> That is supposed to be written in a comment.

Indeed. Some advice I once wrote:

  Often I see a commit with a log message that lovingly explains a small change made to fix a subtle problem, but adds no comments to the code. Don't do this! Put that careful description in a comment, where people can actually see it. (Commit logs are basically invisible; even if they are auto-emailed to all developers, they are soon forgotten, and they don't benefit people not on the email list.) That comment is not a blemish but an invaluable record of an unusual case that someone didn't anticipate. If the bug-fix was preceded by a lengthy email exchange, include some or all of that exchange if it helps.

Nick
Re: Rant about ChangeLog entries and commit messages
On Mon, 3 Dec 2007, Andi Kleen wrote:
>> Commit logs are basically invisible;
>
> That's just a (fixable) problem in your coding setup. In other projects it is very common to use tools like cvs annotate / cvsps / git blame / git log / etc. to find the reasons for why code is the way it is. In fact in several editors these can be functions on hot keys. Programming is hard enough as is without ignoring such valuable information sources. Don't do it.

I didn't say you cannot or should not use these tools. But a good comment on a piece of code sure beats a good commit message, which must be looked at separately, and can be fragmented over multiple commits, etc.

Nick
Re: [RFC] WHOPR - A whole program optimizer framework for GCC
On Wed, 12 Dec 2007, J.C. Pizarro wrote: [...] * "executable" means "it's from an execution to death of the e-prisoner"? * "Indirect call promotion" means "this promotion indirectly e?"? * "Dead variable elimination" means "elimination variable of R.I.P.s"? * etc. J.C.Pizarro i though that the Apocalypsis is near. My theory is that J.C.Pizarro is an advanced AI chat-bot designed to produce streams of nearly-intelligible programming-related verbiage, and that last email was the result of a malfunction that caused it to dump part of its internal word association database. Nick