Hi,

a long time ago, during development of [tuples] gimple_statement_base had 
prev/next links.  Those were moved to gimple_seq_node, referred to from 
gimple_seq, and referring to gimple statements, on the grounds that this 
eases experimentation with different data structures.  Well, those 
experiments never happened, and meanwhile we still suffer cache thrashing 
and memory waste due to that setup.  See also 
http://gcc.gnu.org/ml/gcc-patches/2008-02/msg00794.html and the up/down 
thread discussions there.

We observe:
* every gimple is in a sequence (except for very short intermediate 
  times), hence there's a gimple_seq_node for every gimple.  It has three 
  pointers (prev/next/stmt).
* a gimple_seq is also three pointers (head/tail/freelist)
* allocation/deallocation of gimple_seq is optimized in a peculiar way, 
  via a free list, which is strange because gimple_seq_node (which are 
  allocated much more often than gimple_seq, necessarily so, because
  there's at least one node for every non-empty seq) are not optimized
  this way
* most gimple statements are member of _at most_ one sequence, and where
  this isn't the case it's easy to shuffle the code a bit to make it so.
  I.e. the facility that multiple nodes could point to the same gimple 
  aren't really necessary.
* there's no easy stmt to node mapping, hence gsi_for_stmt is O(n)

So, let's reshuffle everything like so:
* ensure gimples are in exactly one sequence always
* make gimple be the node itself (include a prev/next pair), getting rid
  of one pointer per gimple
* make gimple also be a seq itself (getting rid of the freelist and three 
  pointers per sequence; I haven't measured how many there are)

The latter will be achieved by cleverly using the prev pointer of the 
first gimple statement in a sequence (which would be NULL with the 
traditional NIL-ended lists) as pointer to the last statement of the whole 
sequence.  A consequence will be that the prev links form a cyclic list.  
The next links will still be a traditional NULL-ended one.  The latter is 
good because so we can easily detect the last element of a sequence 
(->next is NULL) from the element itself, and hence also the first element 
(->prev->next is NULL).  Via this we could improve a bit on iterators in 
some circumstances.

So, after these changes a gimple_seq doesn't need to be allocated anymore, 
it will either be NULL (an empty seq) or be a gimple statement itself (the 
first of that very seq).

The patch series as replies to this mail implements the above.  It will 
not go over the whole compiler and do a s/gimple_seq_node/gimple/ and 
s/gimple_seq/gimple/, i.e. we retain the possibility to implement those 
types in some other ways in the future.  E.g. the functions in 
gimple-iterator.c will continue to talk about gimple_seq_node, as well as 
gimple.h functions dealing with gimple_seq (some will take a pointer to a 
seq, not the seq itself anymore).  But they will merely by typedefs to 
gimple.

I'll say that this split into six parts was done after I've written the 
bulk of the thing already.  Hence the individual parts are _not_ 
bootstrapped and regtested, I only made sure that tree-ssa.exp (for C and 
C++) works for every patch in sequence.  The whole series together is 
regstrapped on x86_64-linux (all languages, plus Ada, objc++) without 
regressions.  Therefore I consider applying only the whole series as one 
commit to not break regression hunters, i.e. the split is rather for 
reviewing pleasure.

Now, as for measurements.  With an unpatched and a patched compiler, both 
compiled with -O2 by my system compiler (i.e. not bootstrapped), with some 
largish testcases, time and memory:

           unpatched                    patched
           time     GCmem    maxmem     time      GCmem    maxmem
kdecore.cc 114.05s  948770k  313093k    108.76s   940860k  307865k
big-code.c  60.63s  324343k  159477k     53.73s   321635k  157941k

So, max memory improves by 1% to 2% (GCmem is the added overall garbage 
collected memory), and compile time by 5% to 10% (!).  Note that I haven't 
configured that compiler with disable-checking (though I _added_ some 
asserts :) ).  I have lost my bootstrap time numbers meanwhile, but 
bootstrap time with patch improved by only 1%, but still measurable.

So, looking forward to approvals/comments for the patches.


Ciao,
Michael.

Reply via email to