Re: which opt. flags go where? - references
Hi, maybe http://docs.lib.purdue.edu/ecetr/123/ would also be interesting for you. It describes a quadratic algorithm for finding a nearly optimal set of compiler flags. The results are quite promising, and I have also tested it on my own benchmarking suite with good results. cu, Ronny Peine
Re: which opt. flags go where? - references
Hi, On Thursday, 8 February 2007 13:18, you wrote: > Thank you very much. After reading the abstract, I'm highly > interested in this work, because they also use GCC and SPEC CPU2000, > as I'm planning to do... > > Which benchmarks did you test on? I tested it on freebench-1.03, nbench-byte-2.2.2 and a self-made lame benchmark (encoding a wav to mp3). I don't have SPEC CPU because it isn't free. The runtimes are about one day for each benchmark when tested with nearly all of gcc's possible optimization CFLAGS (tested with 3.4.6 and 4.1.1). So the quadratic nature of the algorithm can be quite painful, but it gives better results than the linear approach. cu, Ronny Peine
Re: __builtin_cpow((0,0),(0,0))
Well, I'm studying mathematics, and as far as I know 0^0 is always 1 (for real and complex numbers) and well defined even in numerical and theoretical mathematics. Could you point me to some publications which say otherwise? cu, Ronny Duncan Sands wrote: On Mon, 2005-03-07 at 10:51 -0500, Robert Dewar wrote: Paolo Carlini wrote: Andrew Haley wrote: F9.4.4 requires pow (x, 0) to return 1 for any x, even NaN. Indeed. My point, basically, is that consistency appears to require the very same behavior for *complex* zero^zero. I am not sure; it looks like the standard is deliberately vague here, and is not requiring this result. Mathematically speaking zero^zero is undefined, so it should be NaN. This is already clear for real numbers: consider x^0 where x decreases to zero. This is always 1, so you could deduce that 0^0 should be 1. However, consider 0^x where x decreases to zero. This is always 0, so you could deduce that 0^0 should be 0. In fact the limit of x^y where x and y decrease to 0 does not exist, even if you exclude the degenerate cases where x=0 or y=0. This is why there is no reasonable mathematical value for 0^0. Ciao, Duncan.
Re: __builtin_cpow((0,0),(0,0))
Hi again, a small proof. if A and X are real numbers and A>0 then A^X := exp(X*ln(A)) (Definition in analytical mathematics). 0^0 = lim A->0, A>0 (exp(0*ln(A)) = 1 if exp(X*ln(A)) is continual continued The complex case can be derived from this (0^(0+ib) = 0^0*0^ib = 1 = 0^a*0^(i*0) ). Well, i know only the german mathematical expressions, so maybe the translations to english are not accurate, sorry for this :) cu, Ronny
Re: __builtin_cpow((0,0),(0,0))
Hi, Marcin Dalecki wrote: On 2005-03-08, at 01:47, Ronny Peine wrote: Hi again, a small proof. How cute. If A and X are real numbers and A>0, then A^X := exp(X*ln(A)) (the definition from real analysis). 0^0 = lim A->0, A>0 of exp(0*ln(A)) = 1, if exp(X*ln(A)) is continuously extended. The complex case can be derived from this (0^(0+ib) = 0^0*0^ib = 1 = 0^a*0^(i*0)). Well, I only know the German mathematical terms, so maybe the English translations are not accurate, sorry for this :) You managed to hide the proof very well. I can't find it. I don't think it's hidden. The former definition is absolutely right.
Re: __builtin_cpow((0,0),(0,0))
Joe Buck wrote: On Tue, Mar 08, 2005 at 01:47:13AM +0100, Ronny Peine wrote: Hi again, a small proof. If A and X are real numbers and A>0, then A^X := exp(X*ln(A)) (the definition from real analysis). That is an incomplete definition, as 0^X is well-defined. 0^0 = lim A->0, A>0 of exp(0*ln(A)) = 1, if exp(X*ln(A)) is continuously extended. Your proof is wrong; since you even propose it you probably have not been exposed to partial differential equations. You have a two-dimensional plane; you can approach the origin from any direction. The direction you chose was to keep the exponent constant at 0. Then you get a limit of 1. An alternate choice is to keep the base constant at 0, choose a positive exponent and let it approach zero. Then you get a limit of 0. Well, then it would be lim x->0 (0^x) = 1 because 0^x is 1 for every x element of |R_>0
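Joe Buck's objection, in formulas: the value of the two-variable limit depends on the path along which the origin is approached, so the limit does not exist:

```latex
\lim_{x \to 0^+} x^0 = 1 \qquad \text{(exponent held at 0)}

\lim_{y \to 0^+} 0^y = 0 \qquad \text{(base held at 0)}

\Rightarrow\ \lim_{(x,y) \to (0,0)} x^y \ \text{does not exist.}
```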
Re: __builtin_cpow((0,0),(0,0))
Ronny Peine wrote: Joe Buck wrote: On Tue, Mar 08, 2005 at 01:47:13AM +0100, Ronny Peine wrote: Hi again, a small proof. If A and X are real numbers and A>0, then A^X := exp(X*ln(A)) (the definition from real analysis). That is an incomplete definition, as 0^X is well-defined. 0^0 = lim A->0, A>0 of exp(0*ln(A)) = 1, if exp(X*ln(A)) is continuously extended. Your proof is wrong; since you even propose it you probably have not been exposed to partial differential equations. You have a two-dimensional plane; you can approach the origin from any direction. The direction you chose was to keep the exponent constant at 0. Then you get a limit of 1. An alternate choice is to keep the base constant at 0, choose a positive exponent and let it approach zero. Then you get a limit of 0. Well, then it would be lim x->0 (0^x) = 1 because 0^x is 1 for every x element of |R_>0 Sorry for this, maybe I should sleep :) (It's 2 o'clock here) But as far as I know, 0^0 is defined as 1 in every lecture I have had so far.
Re: __builtin_cpow((0,0),(0,0))
Well, these were math lectures (Analysis 1, 2 and 3, function theory, numerical mathematics and so on). In every lecture it was defined as 1, and mathematical expressions are usually transformed into equivalent calculations for the FPU (even though, for example, associativity is not preserved). I don't know of any standard which defines this to be 0. Robert Dewar wrote: Ronny Peine wrote: Sorry for this, maybe I should sleep :) (It's 2 o'clock here) But as far as I know, 0^0 is defined as 1 in every lecture I have had so far. Were these math classes, or CS classes? Generally when you have a situation like this, where the value of the function is different depending on how you approach the limit, you prefer to simply say that the function is undefined at that point. As we have discussed, computers, which are not doing real arithmetic in any case, often extend domains for convenience, as in this case.
Re: __builtin_cpow((0,0),(0,0))
Hi again, a small example often used in mathematics and electrical engineering: the geometric series ("Reihe" in German): the sum from k=0 to infinity of q^k equals 1/(1-q) if |q|<1. This is also correct for q=0, where the sum gives q^0+q^1+q^2+... = 1 + 0 + 0 + ... (if 0^0 = 1), and 1/(1-q) = 1 too. I have read some parts of IEEE 754 and articles about this, but they say that 0^0 is very questionable; some say it's 0, some say it's 1. Well, according to the standard there doesn't seem to be a definite answer; in mathematics it's 1 (and will always be 1). cu, Ronny Ronny Peine wrote: Well, these were math lectures (Analysis 1, 2 and 3, function theory, numerical mathematics and so on). In every lecture it was defined as 1, and mathematical expressions are usually transformed into equivalent calculations for the FPU (even though, for example, associativity is not preserved). I don't know of any standard which defines this to be 0. Robert Dewar wrote: Ronny Peine wrote: Sorry for this, maybe I should sleep :) (It's 2 o'clock here) But as far as I know, 0^0 is defined as 1 in every lecture I have had so far. Were these math classes, or CS classes? Generally when you have a situation like this, where the value of the function is different depending on how you approach the limit, you prefer to simply say that the function is undefined at that point. As we have discussed, computers, which are not doing real arithmetic in any case, often extend domains for convenience, as in this case.
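The geometric series example in formulas: the closed form at q = 0 only agrees with the term-by-term sum if the k = 0 term, 0^0, is taken to be 1:

```latex
\sum_{k=0}^{\infty} q^k = \frac{1}{1-q} \qquad (|q| < 1)

q = 0:\quad 0^0 + 0 + 0 + \cdots \;=\; \frac{1}{1-0} \;=\; 1
\ \Rightarrow\ 0^0 = 1
```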
Re: __builtin_cpow((0,0),(0,0))
Maybe I found something: http://www.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps page 9 says: "A number of real expressions are sometimes implemented as INVALID by mistake, or declared Undefined by ill-considered language standards; a few examples are ... 0.0**0.0 = inf**0.0 = NaN**0.0 = 1.0, not NaN;" I'm not really sure whether he means that it should be 1.0 or that it should be NaN, but I think he means 1.0. Ronny Peine wrote: Hi again, a small example often used in mathematics and electrical engineering: the geometric series ("Reihe" in German): the sum from k=0 to infinity of q^k equals 1/(1-q) if |q|<1. This is also correct for q=0, where the sum gives q^0+q^1+q^2+... = 1 + 0 + 0 + ... (if 0^0 = 1), and 1/(1-q) = 1 too. I have read some parts of IEEE 754 and articles about this, but they say that 0^0 is very questionable; some say it's 0, some say it's 1. Well, according to the standard there doesn't seem to be a definite answer; in mathematics it's 1 (and will always be 1). cu, Ronny Ronny Peine wrote: Well, these were math lectures (Analysis 1, 2 and 3, function theory, numerical mathematics and so on). In every lecture it was defined as 1, and mathematical expressions are usually transformed into equivalent calculations for the FPU (even though, for example, associativity is not preserved). I don't know of any standard which defines this to be 0. Robert Dewar wrote: Ronny Peine wrote: Sorry for this, maybe I should sleep :) (It's 2 o'clock here) But as far as I know, 0^0 is defined as 1 in every lecture I have had so far. Were these math classes, or CS classes? Generally when you have a situation like this, where the value of the function is different depending on how you approach the limit, you prefer to simply say that the function is undefined at that point. As we have discussed, computers, which are not doing real arithmetic in any case, often extend domains for convenience, as in this case.
Re: __builtin_cpow((0,0),(0,0))
Well, this article was referenced by http://grouper.ieee.org/groups/754/, so I don't think it's an unreliable source. It would be nice if you wouldn't try to insult me, Joe Buck; that's not very productive. Robert Dewar wrote: Marcin Dalecki wrote: Are we a bit too obedient today? Look, I was talking about the paper presented above, not about the author thereof. But a paper like this must be read in context, and if you don't know who the author is, you a) don't have the context to read the paper b) show yourself to be remarkably ignorant about the field
Re: [OT] __builtin_cpow((0,0),(0,0))
Well, you are right, this discussion is becoming a bit off topic. I think 0^0 should be 1 in the complex case, too; otherwise the complex and real definitions would collide. Example: take the complex number 0+i*0; it should be handled equivalently to the real number 0. Otherwise the programmer would get quite irritated when he transforms real numbers into the equivalent complex numbers (a -> a+i*0). Paolo Carlini wrote: Chris Jefferson wrote: What we are debating here isn't really maths at all, just the definition which will be most useful and least surprising (and perhaps also what various standards tell us to use). Also, since we are definitely striving to consistently implement the current C99 and C++ Standards, it's *totally* pointless discussing 0^0 in the real domain: it *must* be one. Please, people, don't overflow the gcc development list with this kind of discussion. I feel guilty because of that, by the way: please accept my apologies. My original question was *only* about consistency between the real case (pow) and the complex case (cpow, __builtin_cpow, std::complex::pow). Paolo.
Re: __builtin_cpow((0,0),(0,0))
Maybe I should make it clearer why 0^x is not defined for real exponents x, and cannot be continuously extended. Let G be a set ("Menge" in German) and op : G x G -> G, (a,b) -> a op b. If op is associative, then (G,op) is called a semigroup ("Halbgruppe" in German). Exponentiation is then defined as: for a from G, n from |N>0: a^1 = a; a^n = a op a^(n-1). If a neutral element is in G (usually called the "1"), then a^0 is defined as 1. Example: (Z,+) is a semigroup (it's even a group); there a^n = a + a + a + ... + a (n times). For real exponents this is not defined by the above (example: what would 2^pi be?), so a definition consistent with the previous one was introduced: for A, X from |R, A>0: A^X = exp(X*ln(A)). With exp(N*X) = exp(X)^N (which can be proved by induction) it can be seen that this agrees with the previous definition (if X is from |N). The rule a^(1/n) = n-th root of a comes from the following: let a be from |R, a>0, and p from Z, q from |N>1; then a^p = exp(p * ln(a)) = exp(q * (p/q) * ln(a)) = exp((p/q) * ln(a))^q = (a^(p/q))^q => a^(p/q) = q-th root of a^p (note that this is only true for a>0). For 0^x there is no such definition unless x is from |N. Therefore 0^0 is defined according to the first rule as 1 (because we look at the monoid (|R,*) with a^n = a*a*a* ... *a (n times) and the neutral element 1, so a^0 = 1 for every element of |R). I hope this makes things clearer for those who don't believe 0^0 = 1 in the real case. cu, Ronny Robert Dewar wrote: Ronny Peine wrote: Well, this article was referenced by http://grouper.ieee.org/groups/754/, so I don't think it's an unreliable source. Since Kahan is one of the primary movers behind 754, that's not so surprising. For me, 754 is authoritative significantly because of this connection. If there were a case where Kahan disagreed with 754, I would suspect that the standard had made a mistake :-)
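The definitions from this post, collected in one place:

```latex
% Semigroup powers (n a positive natural number):
a^1 := a, \qquad a^n := a \;\mathrm{op}\; a^{\,n-1}

% With a neutral element 1 in G:
a^0 := 1

% Real powers for positive base:
A^X := e^{X \ln A} \qquad (A > 0,\ X \in \mathbb{R})

% Consequence for a > 0, p \in \mathbb{Z}, q \in \mathbb{N}_{>1}:
a^{p/q} = \sqrt[q]{a^p}
```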
Re: __builtin_cpow((0,0),(0,0))
This proof is absolutely correct and in no way bogus; it is taught to nearly every mathematics student, PERIOD. But you are right: if the standard handles this otherwise, then it doesn't help in any case. Robert Dewar wrote: Ronny Peine wrote: I hope this makes things clearer for those who don't believe 0^0 = 1 in the real case. Believe??? So now it's a matter of religion. Anyway, your bogus proof is irrelevant for the real case, since the language standard is clear in the real case anyway. It really is completely pointless to argue this from a mathematical point of view; the only proper viewpoint is that of the standard. You would do better to go read that!
Re: __builtin_cpow((0,0),(0,0))
Hi, Kai Henningsen wrote: [EMAIL PROTECTED] (Robert Dewar) wrote on 07.03.05 in <[EMAIL PROTECTED]>: Ronny Peine wrote: Sorry for this, maybe I should sleep :) (It's 2 o'clock here) But as far as I know, 0^0 is defined as 1 in every lecture I have had so far. Were these math classes, or CS classes? Let's just say that this didn't happen in any of the German math classes I ever took, school or uni. This is in fact a classic example of this type of behaviour. Generally when you have a situation like this, where the value of the function is different depending on how you approach the limit, you prefer to simply say that the function is undefined at that point. And that's how it was always taught to me. Well yes, in the general case this is the right way. But for some special cases a definition is used to simplify mathematical statements, as is done for 0^0 = 1 or gcd(0,0,...,0) = 0. See for example: http://mathworld.wolfram.com/ExponentLaws.html Even so, gcc returns 1 for pow(0.0,0.0) in version 3.4.3, like many other C compilers do. The same behaviour would be expected from cpow. This is, of course, a different question from what a library should implement ... though I must say if I were interested in NaNs at all for a given problem, I'd be disappointed by any such library that didn't return a NaN for 0^0, and by any language standard saying so - I'd certainly consider a result of 1 wrong in the general case. Regards, Kai cu, Ronny
Re: __builtin_cpow((0,0),(0,0))
Dave Korn wrote: Original Message From: Ronny Peine Sent: 16 March 2005 17:34 See for example: http://mathworld.wolfram.com/ExponentLaws.html Ok, I did. Even so, gcc returns 1 for pow(0.0,0.0) in version 3.4.3, like many other C compilers do. The same behaviour would be expected from cpow. No, you're wrong (that the same behaviour would be expected from cpow). See for example: http://mathworld.wolfram.com/ExponentLaws.html "Note that these rules apply in general only to real quantities, and can give manifestly wrong results if they are blindly applied to complex quantities." Well yes, in the general case it's not applicable, but x^0 is 1 in the complex case, too. And if 0^0 is converted from the real to the complex domain (it's even a part of the complex domain), then the same behaviour would be expected; otherwise the definition wouldn't be very good. Has anyone found a hint in the IEEE 754 standard whether there is something about this in there? I don't have a copy here right now; well, it's not free of charge. Otherwise this discussion won't end. cheers, DaveK cu, Ronny
Performance comparison of gcc releases
Hi, I thought this might interest you. I have done some benchmarking to compare different gcc releases. For this, I have written a bash-based benchmark suite which compiles and benchmarks mostly C benchmarks.

Benchmark environment: The benchmarks run on an i686-pc-linux-gnu (Gentoo-based) system on an Athlon XP 2600+ Barton core (512 KB L2 cache) at 1.9 GHz with 512 MB RAM. The benchmarks all run with nice -n -20 to minimize noise, so only very few processes are running while benchmarking (only kernel, agetty, bash, udev, login and syslog).

The benchmarks: I'm using freebench-1.03, nbench-byte-2.2.2 and a self-made lamebench-3.96.1 based on lame-3.96.1. In the lame benchmark a wav file is compressed to an mp3 file, and the time taken is measured.

The benchmark procedure: A bash script named startbenchrotation starts the given benchmarks with the following command lines:

startfreebench-1.03() {
    make distclean >/dev/null 2>&1
    nice -n -20 make ref >/dev/null 2>&1
}

startnbench-byte-2.2.2() {
    make mrproper >/dev/null 2>&1
    make >/dev/null 2>&1
    nice -n -20 ./nbench 2>/dev/null > ./resultfile.txt
}

startlamebench-3.96.1() {
    rm -rf lamebuild
    rm -f testfile.mp3
    mkdir lamebuild || error "Couldn't mkdir lamebench"
    cd lamebuild
    ../lame-3.96.1/configure >/dev/null 2>&1
    make >/dev/null 2>&1
    START=`date +%s`
    nice -n -20 frontend/lame -m s --quiet -q 0 -b 128 --cbr ../testfile.wav ../testfile.mp3
    END=`date +%s`
    cd ..
    echo "$((${END}-${START}))" >./resultfile.txt
}

Each benchmark is run with a combination of CFLAGS. The CFLAGS are composed of base flags and testing flags, e.g.:

BASEFLAGS="-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe"
TESTINGFLAGS="-fforce-addr|-fsched-spec-load|-fmove-all-movables|-freduce-all-givs|-ffast-math|-ftracer|-funroll-loops|-funroll-all-loops|-fprefetch-loop-arrays|-mfpmath=sse|-mfpmath=sse,387|-momit-leaf-frame-pointer"

'|' is used as a field separator.
First, all flags from the base flags and testing flags are combined, the benchmark is started, and the result is taken as the best result. Then a flag is removed from the testing flags and the benchmark is repeated. If the arithmetic average of the results in the repeated benchmark is better than the arithmetic average of the best results so far, then the tested flag is noted as a candidate worst flag. This is done for every flag in the testing flags, and the worst flag of all is filtered out of the testing flags. After this, the above procedure is started again with the new testing flags, without the filtered one. (This heuristic approach to compiler performance comparison was described on the gcc mailing list some months ago: "Compiler Optimization Orchestration For Peak Performance" by Zhelong Pan and Rudolf Eigenmann.)

The protagonists: The tested compilers were gcc-3.3.6 (Gentoo system compiler), gcc-3.4.4 (Gentoo system compiler) and gcc-4.0.2 (FSF release).

The results: All results are given as relative performance measures, with gcc-3.3.6 as the baseline. If x% > 0%, the given compiler generated code which was x% faster than the code from gcc-3.3.6; for x% <= 0% the generated code was slower. All relations are based on the arithmetic average over all passes of a benchmark for the best achieved result.

benchmark    gcc-3.3.6   gcc-3.4.4   gcc-4.0.2
freebench    -           +1%         -5%
nbench       -           +13%        +11%
lamebench    -           +1%         +1%

Conclusion: Well, this benchmark suite is only meant for some comparisons between different compilers to estimate performance in real-life applications. If you are interested in future benchmarks of newer releases, I could offer this service; if you think this is uninteresting, I won't send any more benchmark measurements. I hope this will help track the performance of code generated by gcc and help gcc get better in this effort. Constructive criticism is always welcome. I hope you guys keep up your work on improving gcc. Thanks for reading, Ronny Peine
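The elimination loop described above can be sketched in bash as follows. This is a minimal sketch, not the actual suite: run_benchmark is a hypothetical stub standing in for the real compile-and-run step, and the flag names -fgood-flag, -fbad-flag and -fneutral-flag are made up for illustration (lower "time" = better).

```shell
#!/bin/bash
# Sketch of the O(n^2) flag-elimination heuristic (after Pan/Eigenmann).
# run_benchmark is a HYPOTHETICAL stub: it returns a fake runtime that
# is higher when the (made-up) harmful flag is present.  In the real
# suite it would compile and run freebench/nbench/lamebench.
run_benchmark() {            # $1 = flag string under test
    local t=100
    [[ "$1" == *"-fbad-flag"* ]]  && t=$((t + 20))  # pretend this flag hurts
    [[ "$1" == *"-fgood-flag"* ]] && t=$((t - 10))  # pretend this one helps
    echo "$t"
}

BASEFLAGS="-O3"
FLAGS=(-fgood-flag -fbad-flag -fneutral-flag)

# Start with all flags combined; take that as the best result so far.
best=$(run_benchmark "$BASEFLAGS ${FLAGS[*]}")

# Repeatedly drop the single flag whose removal improves the time most,
# until no removal helps any more.
improved=1
while [ "$improved" -eq 1 ] && [ "${#FLAGS[@]}" -gt 0 ]; do
    improved=0; worst=-1
    for i in "${!FLAGS[@]}"; do
        trial=("${FLAGS[@]:0:i}" "${FLAGS[@]:i+1}")   # all flags but one
        t=$(run_benchmark "$BASEFLAGS ${trial[*]}")
        if [ "$t" -lt "$best" ]; then
            best=$t; worst=$i; improved=1
        fi
    done
    [ "$worst" -ge 0 ] && FLAGS=("${FLAGS[@]:0:worst}" "${FLAGS[@]:worst+1}")
done

echo "kept flags: ${FLAGS[*]}  (best time: $best)"
```

With the stub above, the loop correctly filters out -fbad-flag and keeps the other two. The quadratic cost is visible directly: each elimination round re-runs the benchmark once per remaining flag.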
Performance comparison of gcc releases
Hi, I forgot to post the best CFLAGS for each gcc version and benchmark. Here are the results:

gcc-3.3.6:
nbench: -s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr -fsched-spec-load -fmove-all-movables -ffast-math -ftracer -funroll-loops -funroll-all-loops -mfpmath=sse -momit-leaf-frame-pointer
freebench: -s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr -fsched-spec-load -fmove-all-movables -freduce-all-givs -ftracer -funroll-all-loops -fprefetch-loop-arrays -mfpmath=sse -momit-leaf-frame-pointer
lamebench: -s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr -fsched-spec-load -fmove-all-movables -freduce-all-givs -funroll-loops -funroll-all-loops -mfpmath=sse -mfpmath=sse,387 -momit-leaf-frame-pointer

gcc-3.4.4:
nbench: -s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr -fsched-spec-load -fsched2-use-superblocks -fsched2-use-superblocks -fsched2-use-traces -fmove-all-movables -ffast-math -funroll-loops -funroll-all-loops -fpeel-loops -fold-unroll-loops -fbranch-target-load-optimize2 -mfpmath=sse -mfpmath=sse,387
freebench: -s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr -fsched-spec-load -fsched2-use-superblocks -fsched2-use-superblocks -fsched2-use-traces -freduce-all-givs -ffast-math -ftracer -funroll-loops -funroll-all-loops -fpeel-loops -fold-unroll-loops -fold-unroll-all-loops -fbranch-target-load-optimize -fbranch-target-load-optimize2 -mfpmath=sse -mfpmath=sse,387 -momit-leaf-frame-pointer
lamebench: -s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr -fsched-spec-load -fsched2-use-superblocks -fsched2-use-superblocks -fsched2-use-traces -fmove-all-movables -freduce-all-givs -ftracer -funroll-loops -funroll-all-loops -fpeel-loops -fold-unroll-loops -fold-unroll-all-loops -fbranch-target-load-optimize -fbranch-target-load-optimize2 -mfpmath=sse -mfpmath=sse,387 -momit-leaf-frame-pointer

gcc-4.0.2:
nbench: -s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fforce-addr -fmodulo-sched -fgcse-sm -fgcse-las -fsched-spec-load -ftree-vectorize -ftracer -funroll-loops -fvariable-expansion-in-unroller -fprefetch-loop-arrays -freorder-blocks-and-partition -fweb -ffast-math -fmove-loop-invariants -fbranch-target-load-optimize -fbranch-target-load-optimize2 -fbtr-bb-exclusive -momit-leaf-frame-pointer -D__NO_MATH_INLINES
freebench: -s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fmodulo-sched -fsched-spec-load -freschedule-modulo-scheduled-loops -ftree-vectorize -ftracer -funroll-loops -fvariable-expansion-in-unroller -fprefetch-loop-arrays -freorder-blocks-and-partition -fmove-loop-invariants -fbranch-target-load-optimize -fbranch-target-load-optimize2 -fbtr-bb-exclusive -momit-leaf-frame-pointer -D__NO_MATH_INLINES
lamebench: -s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe -fgcse-sm -fgcse-las -fsched-spec-load -fsched2-use-superblocks -fsched2-use-traces -freschedule-modulo-scheduled-loops -ftracer -funroll-loops -fvariable-expansion-in-unroller -freorder-blocks-and-partition -fweb -ffast-math -fpeel-loops -fmove-loop-invariants -fbranch-target-load-optimize -fbranch-target-load-optimize2 -fbtr-bb-exclusive -mfpmath=sse -mfpmath=sse,387 -momit-leaf-frame-pointer -D__NO_MATH_INLINES

The run for one benchmark and one compiler takes from 6 to 48 hours and depends heavily on the given testing flags (the flag-filtering algorithm used is O(n^2)).
The testing flags for each compiler are:

gcc-3.3.6: TESTINGFLAGS="-fforce-addr|-fsched-spec-load|-fmove-all-movables|-freduce-all-givs|-ffast-math|-ftracer|-funroll-loops|-funroll-all-loops|-fprefetch-loop-arrays|-mfpmath=sse|-mfpmath=sse,387|-momit-leaf-frame-pointer"

gcc-3.4.4: TESTINGFLAGS="-fforce-addr|-fsched-spec-load|-fsched2-use-superblocks|-fsched2-use-superblocks -fsched2-use-traces|-fmove-all-movables|-freduce-all-givs|-ffast-math|-ftracer|-funroll-loops|-funroll-all-loops|-fpeel-loops|-fold-unroll-loops|-fold-unroll-all-loops|-fprefetch-loop-arrays|-fbranch-target-load-optimize|-fbranch-target-load-optimize2|-mfpmath=sse|-mfpmath=sse,387|-momit-leaf-frame-pointer"

gcc-4.0.2: TESTINGFLAGS="-fforce-addr|-fmodulo-sched|-fgcse-sm|-fgcse-las|-fsched-spec-load|-fsched2-use-superblocks -fsched2-use-traces|-freschedule-modulo-scheduled-loops|-ftree-vectorize|-ftracer|-funroll-loops|-fvariable-expansion-in-unroller|-fprefetch-loop-arrays|-freorder-blocks-and-partition|-fweb|-ffast-math|-fpeel-loops|-fmove-loop-invariants|-fbranch-target-load-optimize|-fbranch-target-load-optimize2|-fbtr-bb-exclusive|-mfpmath=sse|-mfpmath=sse,387|-momit-leaf-frame-pointer|-D__NO_MATH_INLINES"

-ftree-loop-linear was removed from the testing flags for gcc-4.0.2 because it leads to an endless loop in the 'neural net' test in nbench.
Re: Performance comparison of gcc releases
Hi, On Friday, 16 December 2005 19:50, Sebastian Pop wrote: > Ronny Peine wrote: > > -ftree-loop-linear is removed from the testingflags in gcc-4.0.2 because > > it leads to an endless loop in neural net in nbench. > > Could you file a bug report for this one? Done. cu, Ronny Peine
Re: Performance comparison of gcc releases
Hi, On Friday, 16 December 2005 19:31, Dan Kegel wrote: > Your PR is a bit short on details. For instance, it'd be nice to > include a link to the source for nbench, so people don't have > to guess what version you're using. Was it > http://www.tux.org/~mayer/linux/nbench-byte-2.2.2.tar.gz > ? > > It'd be even more helpful if you included a recipe a sleepy person > could use to reproduce the problem. In this case, > something like > > wget http://www.tux.org/~mayer/linux/nbench-byte-2.2.2.tar.gz > tar -xzvf nbench-byte-2.2.2.tar.gz > cd nbench-byte-2.2.2 > make CC=gcc-4.0.1 CFLAGS="-ftree-loop-linear" > > Unfortunately, I couldn't reproduce your problem with that command. > Can you give me any tips? > > Finally, it's helpful when replying to the list about filing a PR > to include the PR number or a link to the PR. > The shortest link is just gcc.gnu.org/PR%d, e.g. >http://gcc.gnu.org/PR25449 Sorry, I had forgotten to give that information. It was nbench-2.2.2. A 'make CC=gcc-4.0.2 CFLAGS="-O3 -march=... -ftree-loop-linear"' should be enough. The bug report is a duplicate of 20256, as I have noted in Bugzilla. The source extracted in 20256 is nearly the same as the 'neural net' benchmark. The next time I write a bug report I should concentrate on it more; sorry again. cu, Ronny Peine
Christmas
Hi all, I'm going on holiday, and I wish all of the gcc team a happy Christmas and thank you for all your work, even though it is still too early for Christmas wishes :). cu, Ronny Peine
Re: Very Fast: Directly Coded Lexical Analyzer
Hi, my question is: why not use the element construction algorithm? The Thompson algorithm creates an epsilon-NFA, which needs quite a lot of memory. The element construction creates an NFA directly and therefore has fewer states. Well, this is only interesting for scanner creation, which is less important than the scanner itself, but it can reduce the memory footprint of the generator. It's a pity I can't find a URL for a description of the algorithm; maybe I even have the wrong name for it. I have only read about it in the Compiler Construction lecture notes at university. cu, Ronny
Re: Very Fast: Directly Coded Lexical Analyzer
On Friday, 10 August 2007, you wrote: > To me, very fast (millions of lines a second) lexical analyzers are > trivial to write by hand, and I really don't see the point of tools, > and certainly not the utility of any theory in writing such code. > If anything the formalism of a finite state machine just gets in the > way, since it is more efficient to encode the state in the code > location than in data. Well, there are people out there who don't want to write the same code every time. Why not make your life easier by using autogeneration tools? It also reduces the probability of bugs.