Re: Threading the compiler

2006-11-10 Thread Nicholas Nethercote

On Fri, 10 Nov 2006, Mike Stump wrote:


On Nov 10, 2006, at 12:46 PM, H. J. Lu wrote:

Will use C++ help or hurt compiler parallelism? Does it really matter?


I'm not an expert, but, in the simple world I want, I want it to not matter 
in the least.  For the people writing most code in the compiler, I want clear 
simple rules for them to follow.


Good luck.

As I understand it, GCC's old manual memory manager was dumped in favour of
garbage collection because it was too error-prone.  I suspect there will be
similar correctness issues with threads -- they're hard to get right, and
GCC is big, complex, and worked on by lots of people.

For example, google uses mapreduce (http://labs.google.com/papers/mapreduce.html) 
as a primitive, and there are a few experts that manage that 
code, and everyone else just mindlessly uses it.  The rules are explained to 
them, and they just follow the rules and it just works.  No locking, no 
atomic, no volatile, no clever lock-free code, no algorithmic changes (other 
than decomposing into isolated composable parts).  I'd like something 
similar for us.


I'd be really impressed if someone could work out such a system that works 
for a program such as GCC.


Nick


Re: Level to do such a modification...

2007-01-23 Thread Nicholas Nethercote

On Wed, 24 Jan 2007, [GB2312] ÎâêØ wrote:


I am working on gcc 4.0.0.  I want to use gcc to intercept each call to
read, and taint the data read in.  For example, transform

  read(fd, buf, size)

to

  read(fd, buf, size);
  if (is_socket(fd))
      taint(buf, size);

So, what is the best suitable level to do this modification in gcc?  My
own thought is in finish_function, before calling c_genericize, as I
discovered that in the C front-end, there's no GENERIC tree...  In
c_genericize, it directly calls gimplify_function_tree.


Are you sure you want to do this in GCC?  You might find it easier to use a 
dynamic binary instrumentation framework such as Valgrind or Pin to do this 
kind of thing.
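
For reference, the transformation being asked about amounts to the following,
written out as plain C (a sketch only; is_socket and taint are the poster's
hypothetical helpers, not real APIs):

  #include <unistd.h>

  extern int  is_socket(int fd);                /* hypothetical */
  extern void taint(void *buf, size_t size);    /* hypothetical */

  /* What each read() call site would become after instrumentation. */
  ssize_t instrumented_read(int fd, void *buf, size_t size)
  {
      ssize_t n = read(fd, buf, size);
      if (is_socket(fd))
          taint(buf, size);
      return n;
  }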


Nick

Re: Level to do such a modification...

2007-01-23 Thread Nicholas Nethercote

On Wed, 24 Jan 2007, [GB2312] ÎâêØ wrote:


I know valgrind, it is an emulator, but we are restricted not to use
an emulator. :-(


Well, for some definition of "emulator".

Nick

Re: We're out of tree codes; now what?

2007-03-19 Thread Nicholas Nethercote

On Mon, 19 Mar 2007, Doug Gregor wrote:


> It's going to have a big performance impact. To extract a 9-bit value,
> the compiler will need to do a lot of masking every time it accesses
> the TREE_CODE.

So, about 16% slower with --enable-checking, 5% slower with 
--disable-checking.


Subcodes might still be the way to go, but I'm feeling less bad about
the 9-bit tree code option.


As an interested outsider:  GCC's compile-time speed has been gradually 
decreasing for a while now.  It seems to be acknowledged as an undesirable 
thing, but not much has happened to change it.  AIUI, this is largely 
because it's very difficult.  Nonetheless, seeing a 5% slow-down caused by 
fixing a data structure design bogon is disappointing.
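
For the record, here is a minimal illustration (my own sketch, not GCC's actual
tree node layout or accessor macros) of where the cost comes from: an 8-bit
code is a plain byte load, while a 9-bit code has to live in a bitfield and so
needs a mask, and possibly a shift, on every access:

  #include <stdint.h>

  struct node8 {                 /* hypothetical 8-bit layout */
      uint8_t code;
      uint8_t flags;
  };

  struct node9 {                 /* hypothetical 9-bit layout */
      uint16_t code  : 9;
      uint16_t flags : 7;
  };

  int get_code8(const struct node8 *n) { return n->code; }  /* one byte load       */
  int get_code9(const struct node9 *n) { return n->code; }  /* load + mask (0x1ff) */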


Nick


Re: We're out of tree codes; now what?

2007-03-19 Thread Nicholas Nethercote

On Mon, 19 Mar 2007, Doug Gregor wrote:


But what is the solution? We can complain about performance all we
want (and we all love to do this), but without a plan to fix it we're
just wasting effort. Shall we reject every patch that causes a slow
down? Hold up releases if they are slower than their predecessors?
Stop work on extensions, optimizations, and bug fixes until we get our
compile-time performance back to some predetermined level?

We have hit a hard limit in the design of GCC. We need to either use
more memory, use more compilation time, re-architect non-trivial
portions of GCC, remove functionality, or come up with something very,
very clever. Pick one, but if you pick the last one, you have to
specify what "something very, very clever" is, because we seem to be
running short on ideas.


GCC is a very ambitious compiler:

- it supports a lot of platforms
- it supports a lot of languages

However, most users do not use most of those combinations.  The problem is 
that supporting all these combinations hurts the specific combinations.


For example, I almost always compile C, and usually on x86 or AMD64.  So I 
don't care if the C++ compiler has improved, or a new language front-end is 
added, or whatever.  (Other users will have different priorities.)


As for what is best to do, I don't know.  But I do know that complexity is 
bad, and that GCC is very complex.  You are absolutely right about there 
being hard limits.  There are trade-offs required.  Whether the current and 
ongoing trade-offs are the right ones is an open question.


Nick


GCC priorities [Was Re: We're out of tree codes; now what?]

2007-03-20 Thread Nicholas Nethercote

On Tue, 20 Mar 2007, Nicholas Nethercote wrote:


GCC is a very ambitious compiler:

- it supports a lot of platforms
- it supports a lot of languages

However, most users do not use most of those combinations.  The problem is 
that supporting all these combinations hurts the specific combinations.


Nobody responded to this particular point, which surprised me.  I looked up 
the GCC mission statement (http://gcc.gnu.org/gccmission.html).  It has the 
following "Design and Development Goals"


* New languages
* New optimizations
* New targets
* Improved runtime libraries
* Faster debug cycle
* Various other infrastructure improvements

I think they're terrible:

- "New languages" -- why?  Just because you can?  In theory, adding a new
  language can be free, but in practice it never is.

- "New optimizations?"  I assume this means "optimization passes".
  Users don't care about optimization passes.  Users care
  about performance.  Optimizations happen to be the main way to achieve
  that, but substituting the mechanism for the end goal in a mission
  statement sends the wrong message.

- "New targets" -- this one is better, but some qualification would help.
  I understand that the goal is for GCC to be the universally available
  compiler, but it could be clarified that it's targets that are desired by
  users.

- They're vague -- "improved runtime libraries"?  "various other
  infrastructure improvements"?  These phrases are meaningless.  Why not
  just write "a really awesome compiler"?

- There's no notion that you can't have everything.  For something to be a
  high priority, you have to make something else a lower priority.  This
  list is just "faster! better! more!"  In particular, to return to the
  original point of this thread, the "faster debug cycle" has been suffering
  horribly (due to compile-time performance regressions).

- They haven't been updated since 1999-04-22.

Here are some suggestions for more suitable priorities.  They're all 
characteristics, rather than features.  They're far from perfect, but I 
think they're better than the goals above.  I haven't put them in any 
particular order.


- Correctness w.r.t. language definitions (ie. accept correct code, reject
  incorrect code)
- Correctness of the compiler (ie. no compiler crashes)
- Correctness of generated code (ie. compiled code should do the right
  thing)
- Performance of the compiler (time and memory)
- Performance of generated code (time and memory)
- Performance of building the compiler (time and memory)
- Support of existing language extensions
- Addition of new language extensions
- Support of existing languages: C, C++, Objective C, Fortran, Ada, Java
  (and any others I've missed)
- Support of new languages
- Support of existing platforms (a.k.a. portability)
- Support of new platforms (a.k.a. portability)
- Design and code complexity
- Maintainability
- Debugging support (eg. generating correct debugging info)
- Profiling support
- Quality of error messages (eg. uninitialized variables)
- Support for other extensions (eg. mudflap)

You can split these up or qualify them more, eg. for "performance of the 
compiler" you might distinguish between -O0 and -O2.


The key idea is that you can't have all of these.  For example, supporting 
many languages and platforms increases complexity, adds more code (which 
slows down build times), and can hurt performance (as the tree codes example 
has shown).  It also takes resources that then are not used for improving 
other aspects.  Another example: there was a suggested SoC project yesterday 
for an incremental C++ parser.  That could speed up compile-time performance 
in some cases, but at the cost of increased design and code complexity.


This idea is simple, and I'm sure many of you understand it individually. 
But it appears to me, as an observer of GCC development, that GCC developers 
as a group don't understand this.



On Mon, 19 Mar 2007, Doug Gregor wrote:


We have hit a hard limit in the design of GCC. We need to either use
more memory, use more compilation time, re-architect non-trivial
portions of GCC, remove functionality, or come up with something very,
very clever. Pick one, but if you pick the last one, you have to
specify what "something very, very clever" is, because we seem to be
running short on ideas.


Doug was talking about the tree code issue, but this paragraph sums up the 
whole situation perfectly.  Sometimes it's a good idea to stand back and 
look at the bigger picture, rather than just running on the "gotta fix 
another bug, gotta add another feature" treadmill.


Nick


Re: GCC priorities [Was Re: We're out of tree codes; now what?]

2007-03-21 Thread Nicholas Nethercote

On Thu, 21 Mar 2007, Ian Lance Taylor wrote:


I think you may misunderstand the mission statement.  The mission
statement is not a technical roadmap.  It's a statement of general
goals.  If the community has a serious disagreement, the mission
statement can sometimes help clarify matters.
[...]
The problem as I see it is that different gcc developers have
different goals.  And it is a natural tendency for developers to care
more about their specific goals than any developer cares about the
general goals.  The effect is that specific goals get implemented at
the expense of general goals.


Exactly.  I'm viewing the mission statement as the moral equivalent of a 
constitution -- the highest guidelines that you fall back on when everything 
else fails.  Your first paragraph above indicates that you view it 
similarly.  But it's currently so vague that I don't imagine it's much 
use... it's like a legal constitution that says "be nice to everybody".


Many (most?) open source projects have one or a few benevolent dictators 
that control things.  GCC doesn't (the steering committee is too far removed 
from day-to-day decisions), and I think its development suffers for it at 
times.  Maybe I'm wrong, but to return to the original topic, I don't expect 
to see compile-time performance improve significantly in any future release.


Nick


Re: GCC priorities [Was Re: We're out of tree codes; now what?]

2007-03-21 Thread Nicholas Nethercote

On Wed, 21 Mar 2007, Paul Brook wrote:


The problem is that I don't think writing a detailed "mission statement" is
actually going to help anything. It's either going to be gcc contributors
writing down what they're doing anyway, or something invented by the SC or
FSF. I the latter case nothing's going to change because neither the SC nor
the FSF have any practical means of compelling contributors to work on a
particular feature.

It's been said before that Mark (the GCC release manager) has no real power to
make anything actually happen. All he can do is delay the release and hope
things get better.


Then it will continue to be interesting, if painful, to watch.

Nick


Re: We're out of tree codes; now what?

2007-03-24 Thread Nicholas Nethercote


Several alternatives were tried -- the sub-code approach, the 9-bit 
approach, the 16-bit approach.  It might be interesting to try using 
Cachegrind or Callgrind to better understand why the performance changes 
occurred.


Nick


Re: memory checkers and gcc support

2005-03-13 Thread Nicholas Nethercote

On Mon, 14 Mar 2005, J. Hart wrote:

Valgrind is an excellent product as far as it goes, but is x86 only, and
apparently lacks the sort of stack checking that Purify and Checker have.

Valgrind is currently being officially ported to several other platforms.
We hope to have AMD64/Linux and PPC32/Linux ports complete by the middle 
of this year.  It still won't be able to do stack checking, though.

N


Re: gcc cache misses [was: Re: OT: How is memory latency important on AMD64 box while compiling large C/C++ sources]

2005-04-12 Thread Nicholas Nethercote

On Tue, 12 Apr 2005, Karel Gardas wrote:

cachegrind can also be used to estimate the number (though, not sure
how accurate it is, possibly not very).  I use Shark to actually get
the real number.

Perhaps it's possible that cachegrind is wrong or cache misses differ from
platform to platform, but I would tell that I get very good numbers for
gcc running on x86 platform:

In my experience Cachegrind can give pretty good numbers for L1 misses, 
especially D1, but the L2 misses tend to vary more.  I saw this with 
comparisons against the real numbers reported by the performance counters 
on an Athlon.  However, Cachegrind certainly makes a number of 
approximations (see section 3.3.7 of 
http://www.valgrind.org/docs/phd2004.pdf) and so you shouldn't trust it 
too much.  It should give reasonable numbers though.

N


Unnecessary sign- and zero-extensions in GCC?

2005-04-18 Thread Nicholas Nethercote
Hi,
I've been looking at GCC's use of sign-extensions when dealing with 
integers smaller than a machine word size.  It looks like there is room 
for improvement.

Consider this C function:
short g(short x)
{
   short i;
   for (i = 0; i < 10; i++) {
      x += i;
   }
   return x;
}
On x86, using a GCC 4.0.0 20050130, with -O2 I get this code:
g:
        pushl   %ebp
        xorl    %edx, %edx
        movl    %esp, %ebp
        movswl  8(%ebp), %ecx
        .p2align 4,,15
.L2:
        leal    (%ecx,%edx), %eax
        movswl  %ax, %ecx       # 1
        leal    1(%edx), %eax
        movzwl  %ax, %eax       # 2
        cmpw    $10, %ax
        movswl  %ax, %edx       # 3
        jne     .L2
        popl    %ebp
        movl    %ecx, %eax
        ret
        .size   g, .-g
        .p2align 4,,15

The three extensions (#1, #2, #3) here are unnecessarily conservative. 
This would be better:

g:
        pushl   %ebp
        xorl    %edx, %edx
        movl    %esp, %ebp
        movswl  8(%ebp), %ecx
        .p2align 4,,15
.L2:
        leal    (%ecx,%edx), %ecx       # x += i
        leal    1(%edx), %edx           # i++
        cmpw    $10, %dx                # i < 10 ?
        jne     .L2
        popl    %ebp
        movswl  %cx, %eax
        ret

GCC's approach seems to be *eager*, in that sign-extensions are done 
immediately after sub-word operations.  This ensures that the high bits of 
a register holding a sub-word value are always valid.

An alternative is to allow the high bits of registers holding sub-word 
values to be "junk", and do sign-extensions *lazily*, only before 
operations in which any "junk" high bits could adversely affect the 
result.  For example, if you do a right shift on a value with "junk" high 
bits you have to sign/zero-extend it first, because high bits in the 
operands can affect low bits in the result.  The same is true of division.

In contrast, an addition of two 16-bit values with "junk" high bits is ok 
if the result is also a 16-bit value.  The same is true of subtraction, 
multiplication and logical ops.  The reason is that for these operations, 
the low 16 bits of the result do not depend on the high 16 bits of the 
operands.
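
To make the "junk bits" point concrete, here is a small standalone C sketch
(mine, not GCC code) showing that the low 16 bits of an addition are already
correct without any extension, whereas a right shift needs the operand
extended (masked) first:

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
      uint32_t a = 0xDEAD0003u;   /* 16-bit value 3, with junk high bits */
      uint32_t b = 0xBEEF0004u;   /* 16-bit value 4, with junk high bits */

      /* Addition: the low 16 bits are correct without extending first. */
      printf("%u\n", (unsigned)(uint16_t)(a + b));               /* prints 7 */

      /* Right shift: the junk high bits leak into the low bits... */
      printf("%u\n", (unsigned)(uint16_t)(a >> 1));              /* 32769, not 1 */

      /* ...so the operand must be extended (masked) before shifting. */
      printf("%u\n", (unsigned)(uint16_t)((a & 0xFFFFu) >> 1));  /* prints 1 */
      return 0;
  }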

Although you can construct examples where the eager approach gives better 
code, in general I think the lazy approach results in better code, such as 
in the above example.  Is there a particular reason why GCC uses the eager 
approach?  Maybe it has to do with the form of GCC's intermediate 
representation?  Or are there are some subtleties to do with the lazy 
approach that I have overlooked?  Or maybe I've misunderstood GCC's 
approach.

Any comments are appreciated.  Thanks for your help.
Nick


Re: Unnecessary sign- and zero-extensions in GCC?

2005-04-18 Thread Nicholas Nethercote

On Mon, 18 Apr 2005, Steven Bosscher wrote:

I've been looking at GCC's use of sign-extensions when dealing with
integers smaller than a machine word size.  It looks like there is room
for improvement.

Is your problem the same as the one described on one of the Wiki pages,
"http://gcc.gnu.org/wiki/Exploiting Dual Mode Operation"?

I think so, yes.
Nick


Re: Do C++ signed types have modulo semantics?

2005-06-29 Thread Nicholas Nethercote

On Tue, 28 Jun 2005, Joe Buck wrote:


There is no such assumption.  Rather, we assume that overflow does not
occur, and make no promises about what happens when it does.  Then, for the
case where overflow
does not occur, we get fast code.  For many cases where overflow occurs
with a 32-bit int, our optimized program behaves the same as if we had a
wider int.  In fact, the program will work as if we had 33-bit ints.  Far
from producing a useless result, the optimized program has consistent
behavior over a broader range.  To see this, consider what the program
does with a=MAX_INT, b=MAX_INT-1.  My optimized version always
calls blah(b+1), which is what a 33-bit int machine would do.  It does
not trap.


This point about 33-bit machines is interesting because it raises an
optimisation scenario that hasn't been mentioned so far.

Consider doing 32-bit integer arithmetic on 64-bit machines which only
support 64-bit arithmetic instructions.  On such machines you have to use
sign-extensions or zero-extensions after 64-bit operations to ensure
wrap-around semantics (unless you can prove that the operation will not
overflow the bottom 32 bits, or that the value will not be used in a way
that exposes the fact you're using 64-bit arithmetic).

But -- if I have understood correctly -- if the 32-bit values are signed
integers, a C compiler for such a machine could legitimately omit the
sign-extension.  Whereas for unsigned 32-bit values the C standard implies
that you must zero-extend afterwards.  I hadn't realised that.  This has
been an enlightening thread :)
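
A small sketch of the distinction (mine, not from the thread): on a
64-bit-only machine both of these additions would be done in 64-bit
registers, but only the unsigned one obliges the compiler to keep the
result reduced modulo 2^32:

  #include <stdint.h>

  uint32_t add_u32(uint32_t a, uint32_t b)
  {
      return a + b;    /* must behave as (a + b) mod 2^32: a zero-extension/mask
                          may be needed before the value is used as 64 bits */
  }

  int32_t add_s32(int32_t a, int32_t b)
  {
      return a + b;    /* signed overflow is undefined, so the corresponding
                          sign-extension may legitimately be omitted */
  }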

Nick


Re: Do C++ signed types have modulo semantics?

2005-06-29 Thread Nicholas Nethercote

On Wed, 29 Jun 2005, Daniel Berlin wrote:


So i would advise anyone arguing against turning on -fwrapv simply
because it doesn't seem to hurt us at O2.

And i'll again point out that the exact opposite is the default in every
other compiler i'm aware of.


Sorry, I couldn't parse those sentences...  are you saying the -fwrapv 
behaviour (ie. wrap-on-signed-integer-overflow) is the default or not the 
default in these other compilers?



XLC at O2 has qstrict_induction on by default (the equivalent), and
warns the user when it sees a loop where it's making the assumption[1]


Which assumption?


The XLC people told me since they turned this on in 1998, they have had
one real piece of code where it actually mattered, and that was a char
induction variable.

ICC does the same, though i don't think it bothers to warn.

Open64 does the same, but no warning.

Not sure about Sun CC, but i'd be very surprised if they did it.

Personally, i only care about wrapping for induction variables.  If you
guys want to leave regular variables to do whatever, fine.


Are you saying you don't want induction variables to have to wrap, but you 
don't care about non-induction variables?


Sorry if I'm being dim... I think it's excellent you're discussing what 
other compilers do, I just can't understand what you've said as expressed 
:)


Nick


Re: signed is undefined and has been since 1992 (in GCC)

2005-07-02 Thread Nicholas Nethercote

On Sat, 2 Jul 2005, Florian Weimer wrote:


I am puzzled, why would *ANYONE* who knows C use int
rather than unsigned if they want wrap around semantics?


Both OpenSSL and Apache programmers did this, in carefully reviewed
code which was written in response to a security report.  They simply
didn't know that there is a potential problem.  The reason for this
gap in knowledge isn't quite clear to me.


I've done a lot of C programming in the last three years, and for my day 
job I'm working on a C compiler (albeit in parts that are not very C 
specific), and I didn't know that signed overflow is undefined.  Why not? 
I guess I never heard otherwise and I just assumed it would wrap due to 
two's complement arithmetic.  I don't think I've ever written a serious C 
program that required wrap-around on overflow, though.


Nick


Where does the C standard describe overflow of signed integers?

2005-07-11 Thread Nicholas Nethercote

Hi,

There was recently a very long thread about the overflow behaviour of 
signed integers in C.  Apparently this is undefined according to the C 
standard.  I searched the standard on this matter, and while I did find 
some paragraphs that described how unsigned integers must wrap around upon 
overflow, I couldn't find anything explicit about signed integers.  Can 
someone point me to the relevant part(s) of the standard?


Also, does anyone know what the required behaviour for Fortran integers is 
on overflow?


(I realise this isn't exactly on-topic for this list, but I thought it 
reasonable to ask since this topic was discussed so enthusiastically 
recently :)


Thanks very much.

Nick



RE: Where does the C standard describe overflow of signed integers?

2005-07-11 Thread Nicholas Nethercote

On Mon, 11 Jul 2005, Dave Korn wrote:


There was recently a very long thread about the overflow behaviour of
signed integers in C.  Apparently this is undefined according to the C
standard.  I searched the standard on this matter, and while I did find
some paragraphs that described how unsigned integers must wrap around upon
overflow, I couldn't find anything explicit about signed integers.


Dave, Nathan and Paul:  thanks for the quick replies.

The difference between signed and unsigned integer overflow is a little 
unclearly expressed, I think.


3.4.3/3 says:

  "EXAMPLE  An example of undefined behavior is the behavior on integer
   overflow"

6.5/5 says:

  "If an _exceptional condition_ occurs during the evaluation of an
   expression (that is, if the result is not mathematically defined or not
   in the range of representable values for its type), the behavior is
   undefined."

These two paragraphs would seem to indicate that overflow is undefined for 
both signed and unsigned integers.


But then 6.2.5 para 9, sentence 2 says:

  "A computation involving unsigned operands can never overflow, because a
   result that cannot be represented by the resulting unsigned integer
   type is reduced modulo the number that is one greater than the largest
   value that can be represented by the resulting type."

Which requires that unsigned ints must wrap on overflow.  (Actually, I 
guess it defines "overflow" such that unsigned ints never "overflow", so 
3.4.3/3 and 6.5/5 don't apply!)


But I think the paragraphs together are good enough to communicate that: 
unsigned ints must wrap on overflow, signed ints need not.  Thanks again 
for your help.


N



Re: Some notes on the Wiki

2005-07-11 Thread Nicholas Nethercote

On Mon, 11 Jul 2005, Daniel Berlin wrote:

Also, a web-browser is much slower than an info-browser, especially 
when doing searches.


You must be close to the only user i've met who uses the info browser :)


I use it.  Info pages suck in many ways, but they're fast to load from an 
xterm, fast to search, and even faster when you know where they are in the 
docs (eg. I find myself looking at the GCC C extensions quite often, and I 
can get there very quickly).


Nick


Some tests in gcc.c-torture rely on undefined behaviour?

2005-07-12 Thread Nicholas Nethercote

Hi,

I've been looking at the gcc.c-torture tests, and it seems some of them rely 
on undefined behaviour.  For example, 920612-1.c looks like this:


  f(j)int j;{return++j>0;}
  main(){if(f((~0U)>>1))abort();exit(0);}

AIUI, this passes the largest possible positive integer to f(), which then 
increments it, causing signed overflow, the result of which is undefined. 
The test passes if signed overflow wraps.


930529-1.c is similar -- again the maximum positive integer is 
incremented.


20020508-2.c has the following code (abridged):

  #ifndef CHAR_BIT
  #define CHAR_BIT 8
  #endif

  #define ROR(a,b) (((a) >> (b)) | ((a) << ((sizeof (a) * CHAR_BIT) - (b))))

  #define INT_VALUE ((int)0x1234)

  #define SHIFT1 4

  int i = INT_VALUE;
  int shift1 = SHIFT1;

  if (ROR (i, shift1) != ROR (INT_VALUE, SHIFT1))
abort ();

Similarly, the left-shifting in ROR causes signed integer overflow (I 
think) and so ROR relies on undefined behaviour.  20020508-3.c is similar.
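
For comparison, a rotate written over an unsigned value avoids the signed
overflow.  This is just a sketch of the idea, not the testsuite's code, and it
still assumes 0 < b < the bit width of unsigned:

  #include <limits.h>

  #define ROR_U(a,b) (((unsigned)(a) >> (b)) | \
                      ((unsigned)(a) << ((sizeof (unsigned) * CHAR_BIT) - (b))))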


My question is:  what exactly is gcc.c-torture testing?  It seems to be 
testing more than just C standard compliance, but also certain 
undefined-but-desired behaviours, such as wrap-on-signed-overflow (at 
least in some cases).  Is that right?  If so, other C compilers could be 
correct with respect to the C standard but not pass all the tests.  I 
couldn't find any kind of README or description about the tests that 
covered this point, so I'd appreciate any explanations.


Thanks.

Nick


Re: Some tests in gcc.c-torture rely on undefined behaviour?

2005-07-12 Thread Nicholas Nethercote

On Tue, 12 Jul 2005, Joseph S. Myers wrote:


My question is:  what exactly is gcc.c-torture testing?  It seems to be


That GNU C code compiles or executes as expected for GNU C.


Is there a definition for GNU C?  implement-c.texi and extend.texi have 
some information about this; are there any other sources?


Thanks.

Nick


RE: GNU Fortran Compiler

2005-08-05 Thread Nicholas Nethercote

On Fri, 5 Aug 2005, Dave Korn wrote:


Hallo,
what must I do for becomming the GNU Fortran Compiler?
Sincerely, Hans.


To become the compiler, you must _think_ like the compiler.


It's an easy mistake to make for Germans speaking English, because the 
German verb "bekommen" means "to get, obtain, receive"...


Nick


Re: More NEWS on GCC?

2005-08-30 Thread Nicholas Nethercote

On Tue, 30 Aug 2005, Rafael Ávila de Espíndola wrote:


One problem is that compiler technology generally requires more background
than OS:
1) the new O(1) scheduler
2) the new PCI interface
or
1) the new SSA based intermediate representation
2) the new DFA based pipeline hazard recognizer


I don't think the compiler technology is much more complicated to describe 
than the OS technology.  All four of those concepts would require at least 
a little background explanation to non-OS and non-compiler people.


Timothy has a good point.  GCC is arguably more important than the Linux 
kernel, yet it gets very little press and recognition.  Note also that 
everybody loves Linux (more or less) but people mostly bitch about GCC if 
they think about it at all.  Perhaps these facts are connected?


Nick

Re: Adding debug symbols causes segmentation faults with GCC-4.1 and MIPS...

2005-09-13 Thread Nicholas Nethercote

On Tue, 13 Sep 2005, Steven J. Hill wrote:


You might want to first make sure that your program has no memory
access errors.  You could try building it for x86 and debugging
with valgrind, to see if that catches anything.


A good idea. I built it for x86. Unfortunately, from the output it
appears that 'clone' is not supported, or rather not very well. Here
is a link to the source:

http://www.uclibc.org/cgi-bin/viewcvs.cgi/*checkout*/trunk/uClibc/test/unistd/clone.c?content-type=text%2Fplain&rev=10696

The only interesting output is:

==4032== Syscall param clone(parent_tidptr) contains uninitialised byte(s)
==4032==    at 0x1BA108AC: clone (clone.S:100)
==4032==    by 0x1B96C412: __libc_start_main (libc-start.c:250)
==4032==    by 0x80484A0: ??? (start.S:119)
==4032==
==4032== Syscall param clone(tlsinfo) contains uninitialised byte(s)
==4032==    at 0x1BA108AC: clone (clone.S:100)
==4032==    by 0x1B96C412: __libc_start_main (libc-start.c:250)
==4032==    by 0x80484A0: ??? (start.S:119)
==4032==
==4032== Syscall param clone(child_tidptr) contains uninitialised byte(s)
==4032==    at 0x1BA108AC: clone (clone.S:100)
==4032==    by 0x1B96C412: __libc_start_main (libc-start.c:250)
==4032==    by 0x80484A0: ??? (start.S:119)
==4032==
==4032== Unsupported clone() flags: 0x0
==4032==
==4032== The only supported clone() uses are:
==4032==  - via a threads library (LinuxThreads or NPTL)
==4032==  - via the implementation of fork or vfork
==4032==  - for the Quadrics Elan3 user-space driver

I don't feel like I can trust the output since valgrind admitted it does
not do clone very well.


It handles the most common invocations of clone, but this program passes 
in 0 for 'flags', which seems odd.


Nick


Re: Wishlish: GCC option for explicit booleans

2005-10-02 Thread Nicholas Nethercote

On Sat, 1 Oct 2005, [EMAIL PROTECTED] wrote:


C++ would be a better language if the boolean type did not implicitly
convert from int. For example, many novice programmers make this
mistake:

  if (i = j) dosomething(); // Should be i == j

If conversion to boolean had to be explicit, this would all be solved. It
would mean all the old code with expressions like "while (p) ... "
would need to be changed to "while (p != NULL) ...". But I think the
change would be well justified.

What about a GCC option to turn off implicit conversion to boolean?


[~] more a.cpp
int main(void)
{
int i = 0, j = 0;
if (i = j)
return 0;
else
return 1;
}
[~] g++ -Wall a.cpp
a.cpp: In function `int main()':
a.cpp:4: warning: suggest parentheses around assignment used as truth value

Nick


Re: Abnormal behavior of malloc in gcc-3.2.2

2005-11-21 Thread Nicholas Nethercote

On Mon, 21 Nov 2005, Giovanni Bajo wrote:


I didn't get your point.  I am allocating space only for 400 integers;
then, as soon as it crosses the value of 400 in the loop, it should
have given a segmentation violation?


No.  For that to happen, you need some memory checker.  GCC has -fmudflap; try
that.  Recent versions of glibc also have an internal memory buffer
checker; it probably triggers the segmentation fault when you free the buffer
which you have overflowed.


Valgrind will find this error too (www.valgrind.org).
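
A minimal reproduction of the point (my sketch, not the original poster's
program):

  #include <stdlib.h>

  int main(void)
  {
      int *p = malloc(400 * sizeof(int));
      p[400] = 42;    /* out of bounds: undefined, but often no SIGSEGV,
                         since the write usually lands in heap bookkeeping
                         or slack space rather than an unmapped page */
      free(p);        /* glibc's own heap checks may abort here instead */
      return 0;
  }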

Nick


Re: Performance regression testing?

2005-11-29 Thread Nicholas Nethercote

On Mon, 28 Nov 2005, Joe Buck wrote:


On Mon, 28 Nov 2005, Mark Mitchell wrote:

We're collectively putting a lot of energy into performance 
improvements in GCC.  Sometimes, a performance gain from one patch gets 
undone by another patch -- which is itself often doing something else 
beneficial. People have mentioned to me that we require people to run 
regression tests for correctness, but that we don't really have 
anything equivalent for performance.


It would be possible to detect performance regressions after the fact, but
soon enough to look at reverting patches.  For example, given multiple
machines doing SPEC benchmark runs every night, the alarm could be raised
if a significant performance regression is detected.  To guard against
noise from machine hiccups, two different machines would have to report
a regression to raise the alarm.  But the big problem is the non-freeness
of SPEC; ideally there would be a benchmark that ... 


... everyone can download and run
... is reasonably fast
... is non-trivial


Yes!  This would be very useful for other free software projects.

Another possible requirement is that the tests are not too large;  it 
would be nice to include them in the source code of one's project for 
easier integration.



As a strawman, perhaps we could add a small integer program (bzip?) and
a small floating-point program to the testsuite, and have DejaGNU print
out the number of iterations of each that run in 10 seconds.


Would that really catch much?


I've been thinking about this kind of thing recently for Valgrind.  I was 
thinking that a combination of real programs and artificial 
microbenchmarks would be good.  The microbenchmarks would be like the GCC 
(correctness) torture tests -- a collection of programs, added to over 
time, each one demonstrating a prior performance bug.  You could start it 
off with a few tests containing things like key inner loops extracted from 
programs such as bzip2.
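
For concreteness, here is a rough sketch of the sort of driver either scheme
would need (mine, not an actual DejaGNU harness; workload() stands in for an
extracted inner loop or a small real program):

  #include <stdio.h>
  #include <time.h>

  extern void workload(void);    /* hypothetical benchmark body */

  int main(void)
  {
      const double budget = 10.0;    /* seconds */
      clock_t start = clock();
      long iterations = 0;

      while ((double)(clock() - start) / CLOCKS_PER_SEC < budget) {
          workload();
          iterations++;
      }
      printf("iterations in %.0f s: %ld\n", budget, iterations);
      return 0;
  }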


Measuring the programs and categorizing regressions is tricky.  It's 
possible that the artificial tests would be small enough that any 
regression would be obvious (eg. failing to remove that extra instruction 
would cause a 10% slowdown).  And CSiBE-style graphing is very effective 
for seeing trends.


Nick


Re: Echte Lokaliserung der Programmbausprache/ Real Localisation of Programming Language

2008-10-17 Thread Nicholas Nethercote

On Mon, 6 Oct 2008, Kai Henningsen wrote:


  You're not the first person to come up with this idea, and you
probably won't be the last, but it's a misbegotten idea, and there's


In fact, I believe it came up around the time when COBOL was invented.
And you'll notice that it didn't get implemented back then, even though
people thought it wouldn't be all that hard to do.


a very good reason why it hasn't been done before, and that's not


Actually, that's not true.

In my Apple ][+ days, I've seen it done with BASIC. For some reason, it
never amounted to more than a toy.

I'm sure someone somewhere is doing it to a programming language right
now. Poor language.


Early versions of AppleScript -- a "naturalistic" language with lots of 
keywords -- supported a French "dialect" and even Japanese.  See page 20 of 
http://www.cs.utexas.edu/~wcook/Drafts/2006/ashopl.pdf


AIUI, the foreign language support was dropped at some point.

Nick


Re: change to gcc from lcc

2008-11-19 Thread Nicholas Nethercote

On Tue, 18 Nov 2008, H.J. Lu wrote:


I used malloc to create my arrays instead of creating them on the stack.  My 
program is working now but it is very slow.

I use two-dimensional arrays. The way I access element (i,j) is:
array_name[i*row_length+j]

The server that I use has 16GB ram. The ulimit -a command gives the following 
output:
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        8192



That limits stack to 8MB. Please change it to 1GB.


Why?

Nick


Re: Questions about another usage of GCOV

2006-06-18 Thread Nicholas Nethercote

On Sat, 17 Jun 2006, Marc Alff wrote:


2) Licensing

For technical reasons, I can not use the gcov library itself,
and plan to implement code to read/write the files the GCOV program needs.


Then why do you need to use the gcov file formats?

Nick


Re: [RFC] Program Bounds Checking

2006-09-28 Thread Nicholas Nethercote

On Thu, 28 Sep 2006, Tzi-cker Chiueh wrote:


We have considered the bound instruction in the CASH project. But
we found that the bound instruction is slower than the six normal
instructions it is meant to replace for range checking. For example, the
bound instruction on a 1.1 GHz PIII machine requires 7-8 clock cycles
while the 6 equivalent instructions require 6-7 clock cycles. We have not
tested it on newer processors, though.


I would guess it would be as slow or worse.  'bound' is an extremely rarely 
used instruction, and so will not be optimised-for at all.


Nick


Re: a mudflap experiment on freebsd

2005-02-23 Thread Nicholas Nethercote

On Wed, 23 Feb 2005, Doug Graham wrote:

Regarding memory consumption, perhaps libmudflap's default backtrace
parameter should be set to zero, for both speed and space reasons.
If it's storing all the backtraces that is burning up all the memory,
another approach might be to keep a separate hash table for storing
backtraces, then hash new backtraces and see if the same backtrace already
exists from a previous call to malloc.  If so, no need to allocate a
new one.  That's essentially what the hprof Java profiler does, and it
works pretty well.  The average application might have many thousands
of mallocs, but only a few distinct backtraces.  Also, saving program
counters instead of symbolic names in the backtrace would probably save
a lot of memory, and might also make hashing the backtrace cheaper.
Or the strings containing the symbolic names could be internalized rather
than allocating a new string for every symbolic name in every backtrace.

That's pretty much what Valgrind does -- each backtrace is just N program 
counters, and there's a symbol table full of strings, more or less.  When 
a backtrace is printed, Valgrind looks up the symbol table, which is not 
particularly fast, but printing error messages is relatively rare.
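
A minimal sketch of that interning scheme (my own code, not libmudflap's or
Valgrind's): keep each backtrace as a fixed number of raw program counters and
intern it in a hash table, so repeated call paths share a single record:

  #include <stdint.h>
  #include <stdlib.h>
  #include <string.h>

  #define BT_DEPTH   4        /* PCs kept per backtrace */
  #define TABLE_SIZE 4096     /* power of two, for cheap masking */

  typedef struct bt_entry {
      uintptr_t pcs[BT_DEPTH];
      struct bt_entry *next;
  } bt_entry;

  static bt_entry *table[TABLE_SIZE];

  /* FNV-1a over the raw PCs: cheap, and no symbol names needed. */
  static size_t hash_backtrace(const uintptr_t *pcs)
  {
      uint64_t h = 0xcbf29ce484222325ull;
      for (int i = 0; i < BT_DEPTH; i++) {
          h ^= pcs[i];
          h *= 0x100000001b3ull;
      }
      return (size_t)(h & (TABLE_SIZE - 1));
  }

  /* Return the canonical record for this backtrace, creating it if new. */
  bt_entry *intern_backtrace(const uintptr_t *pcs)
  {
      size_t slot = hash_backtrace(pcs);
      for (bt_entry *e = table[slot]; e != NULL; e = e->next)
          if (memcmp(e->pcs, pcs, sizeof e->pcs) == 0)
              return e;                    /* seen before: share it */
      bt_entry *e = malloc(sizeof *e);     /* new call path: one record */
      memcpy(e->pcs, pcs, sizeof e->pcs);
      e->next = table[slot];
      table[slot] = e;
      return e;
  }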

With respect to backtrace sizes, Valgrind's default is 4 but almost 
everybody immediately sets that to something much higher.  Many people set 
it to something like 40 so as to get the biggest possible trace.  So I 
think that trying to get around the problem of memory consumption by 
making the default backtrace size small won't work.

N


Re: Memory leaks in compiler

2008-01-16 Thread Nicholas Nethercote

On Wed, 16 Jan 2008, Tom Tromey wrote:


Kaveh> A valgrind suppression only silences the error for valgrind.  What if
Kaveh> someone uses another memory checking tool?  Better to fix it for real
Kaveh> IMHO.

Add suppressions for all of them.  Any decent memory checker has to
account for the reality that many programs don't bother freeing memory
before exit.


Valgrind (Memcheck) generally only complains about unfreed blocks for which 
there is no pointer in live data, ie. truly leaked ones.  But it's not 
perfect, there are various ways it can be fooled.
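
A tiny sketch of the distinction (mine):

  #include <stdlib.h>

  static void *still_reachable;          /* pointer kept in live data */

  int main(void)
  {
      still_reachable = malloc(32);      /* unfreed but reachable: usually
                                            not complained about */
      void *lost = malloc(32);
      lost = NULL;                       /* last pointer gone: reported as
                                            "definitely lost" */
      return 0;
  }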


Nick


Re: Official GCC git repository

2008-03-14 Thread Nicholas Nethercote

On Thu, 13 Mar 2008, David Woodhouse wrote:


I could never understand why anyone would use anything but CVS (if that
works for them), or git. The VCS-du-jour craze just confuses me.


Version control is complicated, much more so than it first appears.  There's 
a very large design space.  Knowing that, it's not surprising that there are 
so many different VCS systems, embodying lots of different approaches.


Nick


Re: Very Fast: Directly Coded Lexical Analyzer

2007-05-31 Thread Nicholas Nethercote

On Thu, 31 May 2007, Andrew Haley wrote:


No.  Speed is always measured in reciprocal units of time: s^-1.
A program that runs in 10 seconds has a speed of 0.1 s^-1.  Thus, 200%
is (0.1 * 200/100) s^-1 faster, giving a speed of 0.3 s^-1.


Um, 0.1 * 200/100 = 0.2.

Amdahl's Law says:

  speedup = t_old / t_new

Yet people often express it as a percentage, which is confusing, as we've 
just seen.  They write "1.2x faster" as "20% faster" and "3x faster" as 
"200% faster".


Another performance measure is reduction in execution time:

  reduction = 1 - (t_new / t_old)

That is more natural to express as a percentage.  But it's usually not what 
you want -- it's not what most people mean intuitively when they say "X is 
faster than Y" by a particular amount.


Growth of things is often expressed as a percentage, eg. saying "my wealth 
is 200% larger than last year" is fairly clear as meaning "3x larger". 
It's clear because there's only one possible measurement:  size.  That's why 
people use it in finance.  But when you're dealing with how fast programs 
run, because there are two possible measures ("speedup" and "reduction") a 
percentage is much less clear.


To confuse things further, if you use percentage for speedup, the two 
measures give very similar results, *for small performance improvements*. 
For example:


  t_old = 100s
  t_new = 90s
  speed-up = 1.11 (sometimes strangely written as 11%)
  reduction = 10%

But they diverge as the performance improvements get better.  I once saw a 
talk in which the speaker said he had sped up a system by 75%.  He really 
meant he had reduced the execution time by 75%, which meant it was running 
4x faster.  But I think everyone assumed a 1.75x speed-up until I asked for 
a clarification.  Since this "talk" was actually a thesis defense I think my 
question helped him significantly :)
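
To spell that out with code rather than percentages (my sketch):

  #include <stdio.h>

  static double speedup(double t_old, double t_new)   { return t_old / t_new; }
  static double reduction(double t_old, double t_new) { return 1.0 - t_new / t_old; }

  int main(void)
  {
      /* Small improvement: the two measures look similar... */
      printf("100s -> 90s: speedup %.2fx, reduction %.0f%%\n",
             speedup(100, 90), 100 * reduction(100, 90));

      /* ...but they diverge for big ones: a 75% reduction is a 4x speedup. */
      printf("100s -> 25s: speedup %.2fx, reduction %.0f%%\n",
             speedup(100, 25), 100 * reduction(100, 25));
      return 0;
  }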


It's amazing how much academic work doesn't make it clear what they mean by 
"10% faster".  And the responses to this thread show how confusing 
percentages are.  I encourage everyone to use Amdahl's law for speed-up and 
avoid percentages when talking about program performance.  If you say "I 
made it 3x faster" no-one will ever be confused.


Nick


Re: RFH: GPLv3

2007-07-13 Thread Nicholas Nethercote

On Fri, 13 Jul 2007, Alexandre Oliva wrote:


One way to view it:  the license is a feature.  Therefore changing the
license is changing a feature.


Every release of GCC in the past decade (and then some) was GPLv2+.
GPLv3 has always been one of the options.

Anyone who had their heads in the sand for the past 18 months when
GPLv3 was being publicly discussed and developed, or wasn't at the GCC
Summit last year when I mentioned that the FSF would most certainly
want to upgrade the license of every project whose copyright it held
as soon as GPLv3 was ready, may indeed consider the license upgrade as
a surprising new feature.

But anyone who wanted to participate was welcome to do so, and GPLv3
shouldn't be a surprise for anyone who did, or even just watched it
from a distance.

Now, why should we weaken our defenses for the sake of those who
didn't plan for something that could have been so easily forecast 18
months ago, and that was even planned to be finished 4 months ago?
Heck, the last-call draft, published one month before the final
release, was so close to the final release that non-insider lawyers
who were on top of the process managed to emit solid opinions about
the final license the day after it was released.

It's those who didn't do their homework and didn't plan ahead for this
predictable upgrade who should be burdened now, rather than all of us
having to accept weaker defenses for our freedoms or facing additional
requirements on patches or backports.  It was all GPLv2+, and this
means permission for *anyone* to upgrade to GPLv3+.  The license
upgrade path is the easy path, and that's by design.


I was just suggesting a rationale for choosing a version number.

Nick


Re: RFH: GPLv3

2007-07-13 Thread Nicholas Nethercote

On Thu, 12 Jul 2007, Michael Eager wrote:


3. After GCC 4.2.1 is released, we will renumber the branch to GCC 4.3.
  What would have been GCC 4.2.2 will instead be GCC 4.3.3, to try to
emphasize the GPLv3 switch.  The GCC mainline will then be GCC 4.4.


This seems to conflate the meaning of version numbers to
now mean something about licensing.   The difference between
4.2.1 and 4.2.2 would normally be considered a minor bug fix
release, under this scheme of calling it 4.3.3, one would be
misled to think that this is a minor bug fix for a non-existent
minor release.

The version numbering scheme correlating to functional changes
is more valuable than any (IMO insubstantial) benefit of
identifying the change in license version.


One way to view it:  the license is a feature.  Therefore changing the 
license is changing a feature.  Therefore what was going to be 4.2.2 should 
become 4.3.0.


Nick


Re: Rant about ChangeLog entries and commit messages

2007-12-02 Thread Nicholas Nethercote

On Sun, 2 Dec 2007, Andreas Schwab wrote:


| 2007-11-30  Jan Hubicka  <[EMAIL PROTECTED]>
|
| * ggc-common.c (dump_ggc_loc_statistics): Reset ggc_force_collect
| flag.

How could a newcomer guess why the gcc_force_collect flag needs to be
reset?


That is supposed to be written in a comment.


Indeed.  Some advice I once wrote:  Often I see a commit with a log message 
that lovingly explains a small change made to fix a subtle problem, but adds 
no comments to the code.  Don't do this! Put that careful description in a 
comment, where people can actually see it.  (Commit logs are basically 
invisible; even if they are auto-emailed to all developers, they are soon 
forgotten, and they don't benefit people not on the email list.)  That 
comment is not a blemish but an invaluable record of an unusual case that 
someone didn't anticipate.  If the bug-fix was preceded by a lengthy email 
exchange, include some or all of that exchange if it helps.


Nick


Re: Rant about ChangeLog entries and commit messages

2007-12-03 Thread Nicholas Nethercote

On Mon, 3 Dec 2007, Andi Kleen wrote:


Commit logs are basically invisible;


That's just a (fixable) problem in your coding setup. In other
projects it is very common to use tools like cvs annotate / cvsps /
git blame / git log / etc. to find the reasons for why code is the way
it is. In fact in several editors these can be functions on hot
keys. Programming is hard enough as is without ignoring such valuable
information sources. Don't do it.


I didn't say you cannot or should not use these tools.  But a good comment 
on a piece of code sure beats a good commit message, which must be looked at 
separately, and can be fragmented over multiple commits, etc.


Nick


Re: [RFC] WHOPR - A whole program optimizer framework for GCC

2007-12-13 Thread Nicholas Nethercote

On Wed, 12 Dec 2007, J.C. Pizarro wrote:


[...]

* "executable" means "it's from an execution to death of the e-prisoner"?

* "Indirect call promotion" means "this promotion indirectly e?"?

* "Dead variable elimination" means "elimination variable of R.I.P.s"?

* etc.

  J.C.Pizarro i though that the Apocalypsis is near.


My theory is that J.C.Pizarro is an advanced AI chat-bot designed to produce 
streams of nearly-intelligible programming-related verbiage, and that last 
email was the result of a malfunction that caused it to dump part of its 
internal word association database.


Nick