Hello,
my last attempt at improving something serious was about three weeks ago,
trying to keep the lengths of all strings parsed in the frontend for the
whole compilation phase until the assembly output. I was hoping that would
help in using faster hashes (knowing the length allows us to hash
Hi,
while I was happy using obstacks in other parts of the compiler I thought
they would provide a handy solution for the XNEWVECs/XRESIZEVECs in
df-scan.c, especially df_install_refs() which is the heaviest malloc()
user after the rest of my patches.
In the process I realised that obstacks
Hi Paolo,
On Mon, 20 Aug 2012, Paolo Bonzini wrote:
On 19/08/2012 18:55, Richard Guenther wrote:
Initially I had one obstack per struct graph, which was better than using
XNEW for every edge, but still obstack_init() called from new_graph() was
too frequent.
So in this iteration of the pa
On Mon, 20 Aug 2012, Jakub Jelinek wrote:
I'd note for all the recently posted patches from Dimitrios, the gcc/
prefix doesn't belong to the ChangeLog entry pathnames, the filenames are
relative to the corresponding ChangeLog location.
Ah sorry, it's what the mklog utility generates, it seems
Hi Steven,
On Sun, 19 Aug 2012, Steven Bosscher wrote:
On Sun, Aug 19, 2012 at 8:31 PM, Dimitrios Apostolou wrote:
Hello,
2012-08-19 Dimitrios Apostolou
* gcc/cselib.c (cselib_init): Make allocation pools larger since
they are too hot and are shown to expand often in the
Hello,
2012-08-19 Dimitrios Apostolou
* gcc/cselib.c (cselib_init): Make allocation pools larger since
they are too hot and are shown to expand often in the profiler.
* gcc/df-problems.c (df_chain_alloc): Same.
* gcc/et-forest.c (et_new_occ, et_new_tree): Same
2012-08-19 Dimitrios Apostolou
* gcc/tree-ssa-pre.c (phi_translate_pool): New static global
alloc_pool, used for allocating struct expr_pred_trans_d for
phi_translate_table.
(phi_trans_add, init_pre, fini_pre): Use it, avoids thousands of
malloc() and
2012-08-19 Dimitrios Apostolou
* gcc/tree-ssa-structalias.c: Change declaration of ce_s type
vector from heap to stack. Update all relevant functions to
VEC_alloc() such vector upfront with enough (32) slots so that
malloc() calls are mostly avoided
Hi,
2012-08-18 Dimitrios Apostolou
* gcc/tree-ssa-sccvn.c (struct vn_tables_s): Add obstack_start to
mark the first allocated object on the obstack.
(process_scc, allocate_vn_table): Use it.
(init_scc_vn): Don't truncate shared_lookup_references v
acceptable, and also where I should initialise the obstack once, and avoid
checking if it's NULL in every use.
Minor speed gains (couple of ms), tested with pre-C++ conversion snapshot,
I'll retest soon and post update.
Thanks,
Dimitris
2012-08-18 Dimitrios Apostolou
documented way of allocating macros.
2012-08-18 Dimitrios Apostolou
* include/libiberty.h (XOBDELETE, XOBGROW, XOBGROWVEC, XOBSHRINK)
(XOBSHRINKVEC, XOBFINISH): New type-safe macros for obstack
operations.
(XOBFINISH): Changed to return (T *) instead of T. All
2012-08-18 Dimitrios Apostolou
* dwarf2out.c (output_indirect_string): Use
ASM_OUTPUT_INTERNAL_LABEL instead of slower ASM_OUTPUT_LABEL.
* varasm.c (assemble_string): Don't break string in chunks, this
is assembler specific and already done in most versio
Hello list,
for the following couple of days I'll be posting under this thread my
collection of patches.
Unless otherwise mentioned they've been bootstrapped and tested on x86,
but with a three-week-old snapshot, that is pre-C++ conversion. I plan to
test again next week with a latest snaps
Hi,
On Fri, 17 Aug 2012, Jakub Jelinek wrote:
On Fri, Aug 17, 2012 at 06:41:37AM -0500, Gabriel Dos Reis wrote:
I am however concerned with:
static void
store_bindings (tree names, VEC(cxx_saved_binding,gc) **old_bindings)
{
! static VEC(tree,heap) *bindings_need_stored = NULL;
I wo
On Tue, 7 Aug 2012, Ian Lance Taylor wrote:
On Tue, Aug 7, 2012 at 2:24 PM, Dimitrios Apostolou wrote:
BTW I can't find why ELF_STRING_LIMIT is only 256, it seems GAS supports
arbitrary lengths. I'd have to change my code if we ever set it too high (or
even unlimited) since I al
I should mention that with my patch .ascii is used more aggressively than
before, so if a string is longer than ELF_STRING_LIMIT all of it will be
written as .ascii, while in the past .string would be used for the
string's tail. Example diff to original behaviour:
.LASF15458:
- .ascii
On Mon, 6 Aug 2012, Ian Lance Taylor wrote:
On Mon, Aug 6, 2012 at 9:34 PM, Dimitrios Apostolou wrote:
As an addendum to my previous patch, I made an attempt to properly add
strnlen() to libiberty, with the code copied from gnulib. Unfortunately it
seems I've messed it up somewhere
As an addendum to my previous patch, I made an attempt to properly add
strnlen() to libiberty, with the code copied from gnulib. Unfortunately it
seems I've messed it up somewhere since defining HAVE_STRNLEN to 0 doesn't
seem to build strnlen.o for me. Any ideas?
Thanks,
Dimitris
=== modified
strapped on x86, no regressions for C,C++ testsuite.
Thanks Andreas, hp, Mike, for your comments. Mike I'd appreciate if you
elaborated on how to speed-up sprint_uw_rev(), I don't think I understood
what you have in mind.
Thanks,
Dimitris
2012-08-07 Dimitrios Apostolou
On Sat, 4 Aug 2012, Ian Lance Taylor wrote:
On Fri, 3 Aug 2012, Ian Lance Taylor wrote:
I'm not sure where you are looking. I only see one call to
_obstack_begin in the gcc directory, and it could easily be replaced
with a call to obstack_specify_allocation instead.
In libcpp/ mostly, but o
On Fri, 3 Aug 2012, Ian Lance Taylor wrote:
2012-08-04 Dimitrios Apostolou
* libiberty.h
(XOBDELETE,XOBGROW,XOBGROWVEC,XOBSHRINK,XOBSHRINKVEC): New
type-safe macros for obstack allocation.
(XOBFINISH): Renamed argument to PT since it is a pointer to T
nd I'll back out the patch.
2012-08-04 Dimitrios Apostolou
* libiberty.h
(XOBDELETE,XOBGROW,XOBGROWVEC,XOBSHRINK,XOBSHRINKVEC): New
type-safe macros for obstack allocation.
(XOBFINISH): Renamed argument to PT since it is a pointer to T.
=== modified file
On Thu, 19 Jul 2012, Richard Guenther wrote:
I don't think it's any good or clearer to understand.
Hi Richi, I had forgotten I prepared this for PR #19832, maybe you want to
take a look. FWIW, with my patch applied there is a difference of ~3 M
instr, which is almost unmeasurable in time. Bu
I'm always forgetting something, now it was the changelog, see attached
(same as old, nothing significant changed).
On Fri, 3 Aug 2012, Dimitrios Apostolou wrote:
Hi, I've updated this patch to trunk and rebootstrapped it, so I'm
resubmitting it, I'm also making a tr
l come in a separate
patch. The notes quoted from earlier mail still apply:
On Sun, 8 Jul 2012, Dimitrios Apostolou wrote:
Hi,
This patch adds many nice stats about hash tables when gcc is run with
-fmem-report. Attached patch tested on x86, no regressions.
Also attached is sample output
ys be?).
Bootstrapped/tested on i386, regtested on x86_64 multilib,
i386-pc-solaris2.10 (thanks ro), i686-darwin9 (thanks iains).
2012-07-09 Dimitrios Apostolou
* final.c, output.h (fprint_w): New function to write a
HOST_WIDE_INT to a file, fast.
* final.c (output_addr_
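The ChangeLog entry above mentions a fast routine for writing a HOST_WIDE_INT to a file. As a rough illustration only, one common way to avoid the cost of fprintf()'s format parsing is to produce the digits by hand in a small stack buffer and emit them with a single fwrite(). The names format_decimal and fprint_decimal below are hypothetical, and long long stands in for HOST_WIDE_INT (assumed to be 64 bits wide); this is not the code from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write VALUE in decimal into BUF, which must hold
   at least 21 bytes (sign + 19 digits + NUL for a 64-bit value).
   Returns the number of characters written, excluding the NUL.  */
static size_t
format_decimal (char *buf, long long value)
{
  char tmp[20];                 /* Digits, least significant first.  */
  size_t n = 0;
  size_t len = 0;
  /* Negate in unsigned arithmetic so that LLONG_MIN does not overflow.  */
  unsigned long long u = value < 0
    ? 0ULL - (unsigned long long) value
    : (unsigned long long) value;

  do
    {
      tmp[n++] = (char) ('0' + u % 10);
      u /= 10;
    }
  while (u != 0);

  if (value < 0)
    buf[len++] = '-';
  while (n > 0)                 /* Reverse the digits into BUF.  */
    buf[len++] = tmp[--n];
  buf[len] = '\0';
  return len;
}

/* Hypothetical wrapper: emit VALUE to F with one fwrite() call.  */
static void
fprint_decimal (FILE *f, long long value)
{
  char buf[21];
  size_t len = format_decimal (buf, value);
  fwrite (buf, 1, len, f);
}
```

Calling fprint_decimal (stdout, -1234) writes the five characters of "-1234" without going through a format string. Whether this is measurably faster than fprintf() depends on the C library, so it should be profiled rather than assumed.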
Hello,
I've had this patch for some time now; it's simple and cosmetic only. I
did it while trying to understand expression costs in CSE, and I
think it's more readable than the previous version. FWIW it passed all tests
on x86.
Thanks,
Dimitris
=== modified file 'gcc/cse.c'
--- gcc/cse.c 2012-06
With the attached patches I introduce four new obstacks in struct
cpp_reader to substitute malloc's/realloc's when expanding macros. Numbers
have been posted in the PR, but to summarize:
before: 0.785 s or 2201 M instr
after: 0.760 s or 2108 M instr
Memory overhead is some tens kilobytes wo
g) table->size);
+
+ fprintf (stderr, "\tused\t\t%lu (%.2f%%)\n",
+ (unsigned long) table->n_elements,
+ table->n_elements * 100.0 / table->size);
+ fprintf (stderr, "\t\tvalid\t\t%lu\n",
+ (unsigned long) n_valid);
+ fprintf (stderr, "
Hi Dodji,
On Mon, 4 Jun 2012, Dodji Seketeli wrote:
Hello Dimitrios,
I cannot approve or deny your patch, but I have one question.
Who should I CC then? I saw that you have commits in that file.
I am wondering why this change implies better performance.
Is this because when we later want
t was zeroed in every macro. Maybe there
are pathological cases that I don't see?
2012-06-04 Dimitrios Apostolou
* line-map.c (linemap_enter_macro): Don't zero max_column_hint in
every macro. This improves performance by reducing the number of
reallocations w
On Tue, 22 May 2012, Paolo Bonzini wrote:
On 22/05/2012 18:26, Dimitrios Apostolou wrote:
You are right, and I noticed that if we reverse (actually put straight)
the loop for the PARALLEL defs inside df_defs_record() then the speedup
stands for both x86 and ppc64.
The following patch
On Tue, 22 May 2012, Paolo Bonzini wrote:
On 21/05/2012 19:49, Dimitrios Apostolou wrote:
Thanks for reviewing, in the meantime I'll try to figure out why this
patch doesn't offer any speed-up on ppc64 (doesn't break anything
though), so expect a followup by tomorrow.
Hi Paolo,
On Mon, 21 May 2012, Paolo Bonzini wrote:
On 20/05/2012 20:50, Dimitrios Apostolou wrote:
Paolo: I couldn't find a single test-case where the mw_reg_pool was
heavily used so I reduced its size. You think it's OK for all archs?
Makes sense, we can see if someth
One line patch to update Makefile.
2012-05-21 Dimitrios Apostolou
* gcc/Makefile.in: (toplev.o) toplev.o depends on cselib.h.
=== modified file 'gcc/Makefile.in'
--- gcc/Makefile.in 2012-05-04 20:04:47 +
+++ gcc/Makefile.in 2012-05-21 14:08:45 +
@@ -2751
ion than a
run-time one. This is probably overkill so I think I'll skip it.
Thanks,
Dimitris
2012-05-21 Dimitrios Apostolou
Print various statistics about hash tables when called with
-fmem-report. If the tables are created once use
htab_dump_statistics(), if t
0.720s
Tested on i686, ppc64. No regressions.
Paolo: I couldn't find a single test-case where the mw_reg_pool was
heavily used so I reduced its size. You think it's OK for all archs?
2012-05-20 Dimitrios Apostolou
Paolo Bonzini
Provide almost 2% speedup on
Hi,
On Sat, 12 Nov 2011, Eric Botcazou wrote:
We just need to declare it in system.h in order to use the definition in
libiberty.
OK, this should be fine.
Do the patches I sent for bug #51094 solve the problems?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51094
Thanks,
Dimitris
Hi David,
I couldn't imagine such breakage... If too many platforms break perhaps we
should undo the optimisation - see attached patch.
Thanks,
Dimitris
P.S. See also bug #51094, where I've attached some more fixes.
=== modified file 'gcc/config/elfos.h'
--- gcc/config/elfos.h 2011-10-30 01:45:46
On Mon, 7 Nov 2011, Jakub Jelinek wrote:
On Mon, Nov 07, 2011 at 12:01:29AM +0200, Dimitrios Apostolou wrote:
On Sun, 6 Nov 2011, Joern Rennecke wrote:
But where HARD_REG_SETS make no material difference in speed, and the
compilation unit has no other tight coupling with tm.h, it would really
On Sun, 6 Nov 2011, Joern Rennecke wrote:
But where HARD_REG_SETS make no material difference in speed, and the
compilation unit has no other tight coupling with tm.h, it would really
be cleaner to move from HARD_REG_SETS to a target-independent type,
like sbitmap or bitmap. Maybe we want someth
l GNU assembler with -s parameter, though it's pretty hard to be
compliant.
* Even further in the future we could generate binary data, if we *know* the
assembler is GAS.
Slightly more descriptive changelog:
2011-08-12 Dimitrios Apostolou
* final.c, output.h (fprint_whex, fprint_w,
them will
make it into 4.7?
On Mon, 22 Aug 2011, Dimitrios Apostolou wrote:
For the record I'm posting here the final version of this patch, in case it
gets applied. It adds minor stylistic fixes, plus a small change in
alloc_pool sizes. Any further testing I do will be posted under
On Tue, 23 Aug 2011, Jakub Jelinek wrote:
On Tue, Aug 23, 2011 at 02:40:56PM +0300, Dimitrios Apostolou wrote:
dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
dst->vars->refcount = 1;
dst->vars->htab
-= htab_create (MAX (src1_elems, src2_elems), var
Hi Jakub,
On Mon, 22 Aug 2011, Jakub Jelinek wrote:
On Mon, Aug 22, 2011 at 01:30:33PM +0300, Dimitrios Apostolou wrote:
@@ -1191,7 +1189,7 @@ dv_uid2hash (dvuid uid)
static inline hashval_t
dv_htab_hash (decl_or_value dv)
{
- return dv_uid2hash (dv_uid (dv));
+ return (hashval_t
On Mon, 22 Aug 2011, Dimitrios Apostolou wrote:
Hi Steven,
On Mon, 1 Aug 2011, Steven Bosscher wrote:
On Sun, Jul 31, 2011 at 11:59 PM, Steven Bosscher wrote:
On Fri, Jul 29, 2011 at 11:48 PM, Steven Bosscher wrote:
I'll see if I can test the patch on the compile farm this weekend,
ju
Hi Steven,
On Mon, 1 Aug 2011, Steven Bosscher wrote:
On Sun, Jul 31, 2011 at 11:59 PM, Steven Bosscher wrote:
On Fri, Jul 29, 2011 at 11:48 PM, Steven Bosscher wrote:
I'll see if I can test the patch on the compile farm this weekend,
just to be sure.
Bootstrap on
ia64-unknown-linux-gnu is
parameter, though it's pretty hard to be
compliant.
* Even further in the future we could generate binary data, if we *know*
the assembler is GAS.
Changelog:
2011-08-12 Dimitrios Apostolou
* final.c, output.h (fprint_whex, fprint_w, fprint_ul, sprint_ul):
New func
I should note here that specialised hash-tables in pointer-set.c have a
load factor of at most 25%. Another very fast hash table I've
studied, dense_hash_map from Google's sparse_hash_table, has a maximum
load factor of 50%.
As I understand it a good hash function gives a perfectly random val
Hello,
the attached patch applies after my previous one, and actually cancels all
runtime gains from it. It doesn't make things worse than initially, so
it's not *that* bad.
While trying to understand var-tracking I deleted the whole shared hash
table concept and some other indirections. It
completion I'll also post a follow-up patch where I
delete/simplify a big part of var-tracking, unfortunately with some impact
on performance.
2011-08-22 Dimitrios Apostolou
* var-tracking.c (init_attrs_list_set): Remove function, instead
use a memset() call to zero t
On Mon, 22 Aug 2011, Richard Guenther wrote:
On Mon, Aug 22, 2011 at 9:46 AM, Dimitrios Apostolou wrote:
2011-08-22 Dimitrios Apostolou
* tree-ssa-structalias.c (equiv_class_add)
(perform_var_substitution, free_var_substitution_info): Created a
new equiv_class_pool
sing hash table much simpler, and I think there is only one
case we actually delete strings. Have to look further into this one.
All comments welcome,
Dimitris
Changelog:
2011-08-22 Dimitrios Apostolou
* cgraph.c, cgraph.h (cgraph_dump_stats): New function to dump
Attached patch is also posted at bug #19832 and I think resolves it, as
well as /maybe/ offers a negligible speedup of 3-4 M instr or a couple
milliseconds. I also post it here for comments.
2011-08-13 Dimitrios Apostolou
* cse.c (preferable): Make it more readable and slightly faster
For whoever is concerned about memory usage, I didn't measure a real
increase, besides a few KB. These are very hot allocation pools and
allocating too many blocks of 10 elements is suboptimal.
2011-08-22 Dimitrios Apostolou
* cselib.c (cselib_init): Increased initial si
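To make the point about block sizes concrete, here is a minimal sketch of a fixed-size element pool that counts how many times it has to call malloc(). It is only an illustration under stated assumptions: struct pool, pool_init, pool_alloc and pool_free_all are hypothetical names, and the code is much simpler than GCC's alloc-pool (it never recycles individual elements).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed at the start of every malloc()ed block.  The max_align_t
   member keeps the elements that follow it suitably aligned.  */
union pool_block
{
  union pool_block *prev;
  max_align_t align;
};

/* Hypothetical pool handing out fixed-size elements carved from blocks
   of ELTS_PER_BLOCK elements each.  */
struct pool
{
  size_t elt_size;
  size_t elts_per_block;
  size_t free_in_block;         /* Elements left in the current block.  */
  char *next;                   /* Next unused element.  */
  union pool_block *blocks;     /* Most recently allocated block.  */
  size_t n_mallocs;             /* How often malloc() was needed.  */
};

static void
pool_init (struct pool *p, size_t elt_size, size_t elts_per_block)
{
  size_t a = sizeof (max_align_t);
  p->elt_size = (elt_size + a - 1) / a * a;   /* Round up for alignment.  */
  p->elts_per_block = elts_per_block;
  p->free_in_block = 0;
  p->next = NULL;
  p->blocks = NULL;
  p->n_mallocs = 0;
}

static void *
pool_alloc (struct pool *p)
{
  void *elt;
  if (p->free_in_block == 0)
    {
      /* Current block exhausted: this is the expensive path.  */
      union pool_block *b
        = malloc (sizeof *b + p->elt_size * p->elts_per_block);
      assert (b != NULL);
      b->prev = p->blocks;
      p->blocks = b;
      p->next = (char *) (b + 1);
      p->free_in_block = p->elts_per_block;
      p->n_mallocs++;
    }
  elt = p->next;
  p->next += p->elt_size;
  p->free_in_block--;
  return elt;
}

static void
pool_free_all (struct pool *p)
{
  while (p->blocks != NULL)
    {
      union pool_block *prev = p->blocks->prev;
      free (p->blocks);
      p->blocks = prev;
    }
  p->free_in_block = 0;
}
```

With this sketch, allocating 1000 elements from a pool created with 10 elements per block takes 100 malloc() calls, while 100 elements per block takes 10. The extra memory is bounded by one partially filled block per pool, which matches the observation above that a hot pool gains from larger blocks at a small memory cost.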
Hi Jakub,
I forgot to mention that all patches are against mid-July trunk, I was
hoping I'd have no conflicts. Anyway thanks for letting me know,
if there are conflicts with my other patches please let me know, and I'll
post an updated version at a later date.
All your other concerns are val
Forgot the patch...
On Mon, 22 Aug 2011, Dimitrios Apostolou wrote:
2011-08-22 Dimitrios Apostolou
* tree-ssa-structalias.c (equiv_class_add)
(perform_var_substitution, free_var_substitution_info): Created a
new equiv_class_pool allocator pool for struct
2011-08-22 Dimitrios Apostolou
* tree-ssa-pre.c (phi_trans_add, init_pre, fini_pre): Added a pool
for phi_translate_table elements to avoid free() calls from
htab_delete().
=== modified file 'gcc/tree-ssa-pre.c'
--- gcc/tree-ssa-pre.c 2011-05-04 09:0
2011-08-22 Dimitrios Apostolou
* tree-ssa-structalias.c (equiv_class_add)
(perform_var_substitution, free_var_substitution_info): Created a
new equiv_class_pool allocator pool for struct
equiv_class_label. Changed the pointer_equiv_class_table and
2011-08-22 Dimitrios Apostolou
Allocate some very frequently used vectors on the stack:
* vecir.h: Defined a tree vector on the stack.
* tree-ssa-sccvn.c (print_scc, sort_scc, process_scc)
(extract_and_process_scc_for_name): Allocate the scc vector on the
free() was called far too often before; this patch reduces the number of
calls significantly. There is a minor speed-up here too, which I don't
mention individually since the numbers are within noise margins.
2011-08-22 Dimitrios Apostolou
* graphds.h (struct graph): Added edge_pool as a pool for
alloc
2011-08-22 Dimitrios Apostolou
* emit-rtl.c (mem_attrs_htab_hash): Hash massively by calling
iterative_hash(). We disregard the offset,size rtx fields of the
mem_attrs struct, but overall this hash is a *huge* improvement to
the previous one, it reduces the
Hello list,
the follow-up patches are a selection of minor changes introduced at
various times during my GSOC project. They are mostly simple, or not
important enough to be posted alone, so I'll post them all together under
this thread. Nevertheless they have been carefully selected from a pool of
On Fri, 19 Aug 2011, Tom Tromey wrote:
I think you are the most likely person to do this sort of testing.
You can use machines on the GCC compile farm for this.
Your patch to change the symbol table's load factor is fine technically.
I think the argument for putting it in is lacking; what I wou
On Tue, 9 Aug 2011, Tom Tromey wrote:
"Richard" == Richard Guenther writes:
The libcpp part is ok with this change.
Richard> Note that sparsely populated hashes come at the cost of increased
Richard> cache footprint. Not sure what is more important here though, memory
Richard> access or h
I forgot to include the dwarf2out.c:file_table. Stats are printed when -g.
See attached patch. Additional Changelog:
* dwarf2out.c (dwarf2out_finish): Call htab_dump_statistics() if
-fmem-report.
Dimitris
=== modified file 'gcc/dwarf2out.c'
--- gcc/dwarf2out.c 2011-06-06 1
Changelog:
2011-08-09 Dimitrios Apostolou
* cgraph.c, cgraph.h (cgraph_dump_stats): New function to dump
stats about cgraph_hash hash table.
* cselib.c, cselib.h (cselib_dump_stats): New function to dump
stats about cselib_hash_table.
* cselib.c (cselib_finis
ample, for the
mem_attrs_htab hash table, coll/searches ratio is still sometimes higher
than 0.5.
Changelog:
2011-08-09 Dimitrios Apostolou
* symtab.c (ht_lookup_with_hash): Hash table will now be doubled
when 50% full, not 75%, to reduce collisions.
*
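The growth policy named in the ChangeLog entry above (double the table once it is 50% full) can be sketched with a tiny open-addressing set. This is a hedged illustration, not the code in symtab.c: struct int_set and its functions are hypothetical, keys are non-zero 32-bit integers, and the multiplicative hash constant is just a common choice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical open-addressing set of non-zero keys; 0 marks an empty
   slot.  SIZE is always a power of two.  */
struct int_set
{
  uint32_t *slots;
  size_t size;
  size_t n_elements;
};

static void int_set_insert (struct int_set *s, uint32_t key);

static void
int_set_init (struct int_set *s, size_t size)
{
  s->slots = calloc (size, sizeof *s->slots);
  assert (s->slots != NULL);
  s->size = size;
  s->n_elements = 0;
}

/* Double the table and reinsert every key.  */
static void
int_set_expand (struct int_set *s)
{
  struct int_set bigger;
  int_set_init (&bigger, s->size * 2);
  for (size_t i = 0; i < s->size; i++)
    if (s->slots[i] != 0)
      int_set_insert (&bigger, s->slots[i]);
  free (s->slots);
  *s = bigger;
}

static void
int_set_insert (struct int_set *s, uint32_t key)
{
  size_t i;
  /* Grow before the table would become more than 50% full.  */
  if ((s->n_elements + 1) * 2 > s->size)
    int_set_expand (s);

  i = (key * 2654435761u) & (s->size - 1);
  while (s->slots[i] != 0 && s->slots[i] != key)
    i = (i + 1) & (s->size - 1);        /* Linear probing.  */
  if (s->slots[i] == 0)
    {
      s->slots[i] = key;
      s->n_elements++;
    }
}

static int
int_set_contains (const struct int_set *s, uint32_t key)
{
  size_t i = (key * 2654435761u) & (s->size - 1);
  while (s->slots[i] != 0)
    {
      if (s->slots[i] == key)
        return 1;
      i = (i + 1) & (s->size - 1);
    }
  return 0;
}
```

Changing the test in int_set_insert to `(s->n_elements + 1) * 4 > s->size * 3` would give the older 75% threshold. A lower threshold shortens probe sequences but spreads the same elements over more memory, which is the cache-footprint trade-off raised elsewhere in this thread.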
On Mon, 1 Aug 2011, Paolo Bonzini wrote:
On 08/01/2011 05:57 PM, Dimitrios Apostolou wrote:
I don't fully understand the output from -fdump-tree-all, but my
conclusion, based also on profiler output and objdump, is that both
unrolling and inlining are happening in both versions. Neverthel
On Sun, 31 Jul 2011, Paolo Bonzini wrote:
On Sat, Jul 30, 2011 at 19:21, Dimitrios Apostolou wrote:
Nevertheless I'd appreciate comments on whether any part of this patch is
worth keeping. FWIW I've profiled this on i386 to be about 4 M instr slower
out of ~1.5 G inst. I'll be n
On Sun, 31 Jul 2011, Steven Bosscher wrote:
On Fri, Jul 29, 2011 at 11:48 PM, Steven Bosscher wrote:
I'll see if I can test the patch on the compile farm this weekend,
just to be sure.
Worked fine with some cross-builds to arm-eabi. Bootstrap on
ia64-unknown-linux-gnu is in stage2 but it is
Hello list,
the attached patch changes hard-reg-set.h in the following areas:
1) HARD_REG_SET is now always a struct so that it can be used in files
where we don't want to include tm.h. Many thanks to Paolo for providing
the idea and the original patch.
2) Code for specific HARD_REG_SET_LONG
On Fri, 29 Jul 2011, Kenneth Zadeck wrote:
I really think that patches of this magnitude having to do with the RTL level
should be tested on more than one platform.
I'd really appreciate further testing on alternate platforms from whoever
does it casually, for me it would take too much time to s
Completely forgot it: Tested on i386, no regressions.
Dimitrios
improvements are
welcome.
http://gcc.gnu.org/wiki/OptimisingGCC?action=AttachFile&do=view&target=callgrind-tcp_ipv4-trunk-co-109439-prod.txt
http://gcc.gnu.org/wiki/OptimisingGCC?action=AttachFile&do=view&target=callgrind-tcp_ipv4-df2-co-prod.txt
Changelog:
2011-07-29 Dim
Bug found at last, it's in the following hunk, the ampersand in
&exit_block_uses is wrong... :-@
@@ -3951,7 +3949,7 @@ df_get_exit_block_use_set (bitmap exit_b
{
rtx tmp = EH_RETURN_STACKADJ_RTX;
if (tmp && REG_P (tmp))
- df_mark_reg (tmp, exit_block_uses);
+ df_
Dimitrios Apostolou
* hard-reg-set.h (TEST_HARD_REG_BIT, SET_HARD_REG_BIT,
CLEAR_HARD_REG_BIT): Added some assert checks for test, set and clear
operations of HARD_REG_SETs, enabled when RTL checks are on. Runtime
overhead was measured as negligible.
Thanks,
Dimitris
=== modified file
That was a bug, indeed, but unfortunately it wasn't the one causing the
crash I posted earlier... Even after fixing it I get the same backtrace
from gdb.
So the petition "spot the bug" holds...
Thanks,
Dimitris
un 10) execution test
FAIL: libmudflap.cth/pass39-frag.c (-O3) (rerun 10) output pattern test
Performance measured not to be affected, maybe it is now a couple
milliseconds faster:
Original: PC1:0.878s, PC2:6.55s, 2105.6 M instr
Patched : PC1:0.875s, PC2:6.54s, 2104.9 M instr
2011-07-25 Dimi
Bug found, in df_mark_reg I need to iterate until regno + n, not n. The
error is at the following hunk:
--- gcc/df-scan.c 2011-02-02 20:08:06 +
+++ gcc/df-scan.c 2011-07-24 17:16:46 +
@@ -3713,35 +3717,40 @@ df_mark_reg (rtx reg, void *vset)
if (regno < FIRST_PSEUDO_REGIST
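The loop-bound mistake described above (iterating until n instead of until regno + n) is easy to reproduce in isolation. The sketch below uses a hypothetical 64-bit register set rather than GCC's HARD_REG_SET, and the names reg_set, set_reg, mark_reg_buggy and mark_reg_fixed are made up for illustration; N_HARD_REGS is an assumed register count.

```c
#include <assert.h>
#include <stdint.h>

#define N_HARD_REGS 64          /* Assumed register count for this sketch.  */

/* Hypothetical stand-in for a fixed-size hard register set.  */
struct reg_set
{
  uint64_t bits;
};

static void
set_reg (struct reg_set *s, unsigned regno)
{
  assert (regno < N_HARD_REGS); /* Range check, as with RTL checking on.  */
  s->bits |= (uint64_t) 1 << regno;
}

static int
test_reg (const struct reg_set *s, unsigned regno)
{
  assert (regno < N_HARD_REGS);
  return (s->bits >> regno) & 1;
}

/* Buggy variant: treats N as the end of the range instead of its length,
   so nothing is marked whenever REGNO >= N.  */
static void
mark_reg_buggy (struct reg_set *s, unsigned regno, unsigned n)
{
  for (unsigned i = regno; i < n; i++)
    set_reg (s, i);
}

/* Fixed variant: marks the N consecutive registers starting at REGNO.  */
static void
mark_reg_fixed (struct reg_set *s, unsigned regno, unsigned n)
{
  for (unsigned i = regno; i < regno + n; i++)
    set_reg (s, i);
}
```

For a value that occupies two registers starting at register 5, the buggy loop marks nothing, while the fixed loop marks registers 5 and 6. A bug of this kind can stay hidden on testcases where multi-register values happen to start at low register numbers.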
Hi Steven,
On Sun, 24 Jul 2011, Steven Bosscher wrote:
Can you please create your patches with the -p option, so that it's
easier to see what function you are changing? Also, even for an RFC
patch a ChangeLog is more than just nice to have ;-)
Do you mean an entry in Changelog file in root dir
On Fri, 8 Jul 2011, Paolo Bonzini wrote:
On 07/08/2011 05:51 AM, Dimitrios Apostolou wrote:
+ /* first write DF_REF_BASE */
This is not necessary. These uses are written to use_vec, while the uses
from REG_EQUIV and REG_EQUAL are written to eq_use_vec (see
df_ref_create_structure
Thanks Paolo for the detailed explanation!
On Fri, 8 Jul 2011, Paolo Bonzini wrote:
That said, changing exit_block_uses and entry_block_defs to HARD_REG_SET would
be a nice cleanup, but it would also touch target code due to
targetm.extra_live_on_entry (entry_block_defs);
I've already done
On Fri, 8 Jul 2011, Paolo Bonzini wrote:
On 07/08/2011 12:43 PM, Richard Sandiford wrote:
The docs also say that the first expr_list can be null:
If @var{lval} is a @code{parallel}, it is used to represent the case of
a function returning a structure in multiple registers. Each element
On Fri, 8 Jul 2011, Richard Guenther wrote:
On Fri, Jul 8, 2011 at 5:20 AM, Dimitrios Apostolou wrote:
Hello list,
The attached patch does two things for df_get_call_refs():
* First it uses HARD_REG_SETs for defs_generated and
regs_invalidated_by_call, instead of bitmaps. Replacing in total
On Fri, 8 Jul 2011, Jakub Jelinek wrote:
On Fri, Jul 08, 2011 at 06:20:04AM +0300, Dimitrios Apostolou wrote:
The attached patch does two things for df_get_call_refs():
* First it uses HARD_REG_SETs for defs_generated and
regs_invalidated_by_call, instead of bitmaps. Replacing in total
more
On Fri, 8 Jul 2011, Steven Bosscher wrote:
On Fri, Jul 8, 2011 at 5:20 AM, Dimitrios Apostolou wrote:
The attached patch does two things for df_get_call_refs():
How did you test this patch?
Normally, a patch submission comes with text like, "Bootstrapped &
tested on ..., no re
And here is the patch that breaks things. By moving df_defs_record()
*after* df_get_call_refs() most times collection_rec remains sorted, and
about 50M instructions are avoided in qsort()
calls of df_canonize_collection_rec().
Unfortunately this does not work. Sometimes cc1 crashes, for exampl
To document the gains from the bitmaps, here is (part of) the annotated
source from callgrind profiler, showing instruction count. Before:
1,154,400 if (bitmap_bit_p(regs_invalidated_by_call_regset, i)
8,080,800 => bitmap.c:bitmap_bit_p (192400x)
1,021,200 && !bitmap_bit_p (&d
Hello list,
The attached patch does two things for df_get_call_refs():
* First it uses HARD_REG_SETs for defs_generated and
regs_invalidated_by_call, instead of bitmaps. Replacing in total more
than 400K calls (for my testcase) to bitmap_bit_p() with the much faster
TEST_HARD_REG_BIT, reduces
Hi Nicola,
my patch is too simple compared to yours, feel free to work on it as much
as you wish, no need to credit me since you posted it independently. I
just posted it to note that the inlining part is the one providing most
performance benefit.
richi: I used always_inline because it is t
FWIW I think that most of the speedup is due to inlining
lookup_attribute(). I got almost the same by applying only the attached
very simple patch, since strlen() was called too often (according to the
profile at [1]). I used the always_inline attribute to avoid using a
macro.
I was going to
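The idea described above, forcing a thin wrapper inline so that strlen() of a constant name does not have to be computed at run time on every call, might look roughly like this. The sketch assumes GCC or Clang for the always_inline attribute, and struct attr, lookup_attr_1 and lookup_attr are hypothetical names, not the real lookup_attribute() from GCC's tree code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute record with its name length precomputed.  */
struct attr
{
  const char *name;
  size_t len;
};

/* Out-of-line worker: receives the length, so it never calls strlen().  */
static const struct attr *
lookup_attr_1 (const struct attr *list, size_t n,
               const char *name, size_t name_len)
{
  for (size_t i = 0; i < n; i++)
    if (list[i].len == name_len
        && memcmp (list[i].name, name, name_len) == 0)
      return &list[i];
  return NULL;
}

/* Thin wrapper, forced inline.  When NAME is a string literal at the
   call site, an optimising compiler can usually fold strlen (NAME) into
   a constant, so only the worker call remains.  */
static inline __attribute__ ((always_inline)) const struct attr *
lookup_attr (const struct attr *list, size_t n, const char *name)
{
  return lookup_attr_1 (list, n, name, strlen (name));
}
```

A call such as lookup_attr (list, n, "pure") then behaves like a macro that passes both the literal and its length, but keeps normal function syntax and type checking. Whether the strlen() call is actually folded depends on the compiler and optimisation level, so the generated code is worth checking.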
On Thu, 21 Apr 2011, Laurynas Biveinis wrote:
:( Why don't you get yourself a compile farm account?
http://gcc.gnu.org/wiki/CompileFarm
Thanks Laurynas, I am absolutely thrilled to see such a variety of
hardware! I'll try applying, but I'm not sure I'm eligible, my
contributions to OSS are
On Wed, 20 Apr 2011, Jeff Law wrote:
On 04/20/11 15:08, Dimitrios Apostolou wrote:
Hello list,
while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM
killed. That's when I noticed that its RAM usage peaks at 150MB, which
is
Hello list,
while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM
killed. That's when I noticed that its RAM usage peaks at 150MB, which is
a bit excessive for parsing a ~500K text file.
The attached patch fixes the leak and gengtype now uses a peak of 4MB
heap. Hopefully I