On Tue, 17 May 2005, Mike Stump wrote:
On May 17, 2005, at 3:16 PM, Karel Gardas wrote:
1) the most expensive seems to be comptypes, at least from the data L2 refill point of view (~17%)
2) comptypes is also the most CPU-intensive operation, since most of the time is spent there
I think comptypes can be sped up by canonicalizing types better, and also adding a conservative hash and checking it first.
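To make the "conservative hash" suggestion concrete, here is a minimal sketch in Python rather than GCC's C. TypeNode, shape_hash, deep_compare and compare_types are hypothetical names for illustration only and are not GCC's tree or comptypes code. The idea is that structurally equal types always get equal hashes, so a hash mismatch proves the types differ and the expensive walk can be skipped.

```python
class TypeNode:
    """Hypothetical stand-in for a compiler type node; not GCC's tree type."""

    def __init__(self, kind, children=()):
        self.kind = kind
        self.children = tuple(children)
        # Computed once at construction. Structurally equal nodes always get
        # equal hashes, so unequal hashes prove that two types differ.
        self.shape_hash = hash((kind, tuple(c.shape_hash for c in self.children)))


def deep_compare(a, b, counter):
    """Full structural walk: the expensive path that visits node after node."""
    counter[0] += 1
    if a.kind != b.kind or len(a.children) != len(b.children):
        return False
    return all(deep_compare(x, y, counter) for x, y in zip(a.children, b.children))


def compare_types(a, b, counter):
    """Run the cheap checks first and walk the structure only if they pass."""
    if a is b:
        return True                      # same node, nothing to compare
    if a.shape_hash != b.shape_hash:
        return False                     # conservative reject, no walk needed
    return deep_compare(a, b, counter)   # equal hashes still need a real check


int_t = TypeNode("int")
char_t = TypeNode("char")
ptr_int_a = TypeNode("pointer", [int_t])
ptr_int_b = TypeNode("pointer", [TypeNode("int")])
ptr_char = TypeNode("pointer", [char_t])

walks = [0]  # counts how many nodes the expensive path visited
print(compare_types(ptr_int_a, ptr_int_b, walks))
print(compare_types(ptr_int_a, ptr_char, walks))
```

The hash only ever rules pairs out. Equal hashes still fall through to the full comparison, so a collision costs time but never gives a wrong answer.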
Perhaps. Anyway, this is a box with 1GB RAM. Just for fun, I have now used:
0) compiler params used were: -I../include --param ggc-min-expand=30 --param ggc-min-heapsize=4096 -Wall -D_REENTRANT -D_GNU_SOURCE -DPIC -fPIC -c
and the picture, at least for 4.1.0, is completely different (see below). This means that on a machine with little memory gcc misses the L2 cache much more often, at a cost of about 529 CLK per miss. The top source of cache misses now seems to be the GC, with comptypes second.
Cheers, Karel
CPU: AMD64 processors, speed 1802.33 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
Counted DATA_CACHE_MISSES events (Data cache misses) with a unit mask of 0x00 (No unit mask) count 1000
Counted ICACHE_MISSES events (Instruction cache misses) with a unit mask of 0x00 (No unit mask) count 1000
Counted DATA_CACHE_REFILLS_FROM_SYSTEM events (Data cache refills from system) with a unit mask of 0x1f (All cache states) count 1000
CPU_CLK_UNHALT...|DATA_CACHE_MIS...|ICACHE_MISSES:...|DATA_CACHE_REF...|
  samples|      %|  samples|      %|  samples|      %|  samples|      %|
------------------------------------------------------------------------
  5795921 100.000  3695597 100.000  2946594 100.000  1095111 100.000 cc1plus
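The "about 529 CLK per miss" figure appears to come from the cc1plus totals above: each sample stands for "count" events, so the ratio is total cycles divided by total data cache refills from system. A small sketch of that arithmetic follows; reading "miss" as the DATA_CACHE_REFILLS_FROM_SYSTEM counter is an assumption based on the numbers matching, and the variable names are made up for illustration.

```python
# Totals for cc1plus from the summary line, paired with the sampling
# interval ("count") printed in the header for each event.
clk_samples, clk_interval = 5795921, 100000       # CPU_CLK_UNHALTED
refill_samples, refill_interval = 1095111, 1000   # DATA_CACHE_REFILLS_FROM_SYSTEM

# One sample is recorded every `interval` events, so scale back up.
cycles = clk_samples * clk_interval
refills = refill_samples * refill_interval

# Assumption: "miss" in the message means a refill from system memory.
cycles_per_refill = cycles / refills
print(f"{cycles_per_refill:.1f} cycles per refill")
```

This is an average over the whole run, not the latency of a single miss, since it divides all unhalted cycles by the refill count.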
CPU: AMD64 processors, speed 1802.33 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
Counted DATA_CACHE_MISSES events (Data cache misses) with a unit mask of 0x00 (No unit mask) count 1000
Counted ICACHE_MISSES events (Instruction cache misses) with a unit mask of 0x00 (No unit mask) count 1000
Counted DATA_CACHE_REFILLS_FROM_SYSTEM events (Data cache refills from system) with a unit mask of 0x1f (All cache states) count 1000
samples  %        samples  %        samples  %        samples  %        symbol name
442873   7.6411   277095   7.4980   406      0.0138   210537   19.2252  gt_ggc_mx_lang_tree_node
357714   6.1718   297393   8.0472   341      0.0116   92100    8.4101   ggc_set_mark
208484   3.5971   364311   9.8580   48844    1.6576   88551    8.0860   comptypes
176284   3.0415   96291    2.6056   66753    2.2654   27903    2.5480   ggc_alloc_stat
158048   2.7269   188948   5.1128   26549    0.9010   13119    1.1980   lookup_fnfields_1
120791   2.0841   17681    0.4784   12771    0.4334   1178     0.1076   dfs_walk_all
101900   1.7581   8530     0.2308   4541     0.1541   1293     0.1181   record_reg_classes
97854    1.6883   28305    0.7659   9740     0.3306   5843     0.5336   walk_tree
80856    1.3951   6314     0.1709   33168    1.1256   990      0.0904   find_reloads
79626    1.3738   4311     0.1167   743      0.0252   640      0.0584   _cpp_lex_direct
75468    1.3021   64101    1.7345   22       7.5e-04  20321    1.8556   cp_tree_node_structure
60301    1.0404   7343     0.1987   6487     0.2202   2986     0.2727   splay_tree_splay_helper
57714    0.9958   41027    1.1102   4436     0.1505   16364    1.4943   ht_lookup_with_hash
56687    0.9780   7502     0.2030   313      0.0106   422      0.0385   _cpp_clean_line
51682    0.8917   71809    1.9431   1513     0.0513   21801    1.9908   compparms
51528    0.8890   65441    1.7708   10699    0.3631   4356     0.3978   lookup_field_1
51470    0.8880   41211    1.1151   20647    0.7007   17549    1.6025   tsubst
50100    0.8644   43384    1.1739   19750    0.6703   18065    1.6496   htab_find_slot_with_hash
49868    0.8604   91428    2.4740   2472     0.0839   41355    3.7763   push_to_top_level
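As a rough check of the "GC first, comptypes second" reading, the refill percentages from the listing above can be grouped by hand. The sketch below treats every symbol with "ggc" in its name as part of the collector, which is an assumption made only for this illustration, and it copies just the five largest rows of the last percentage column.

```python
# Share of DATA_CACHE_REFILLS_FROM_SYSTEM samples (%), copied from the
# per-symbol listing. Only the five largest rows are included.
refill_share = {
    "gt_ggc_mx_lang_tree_node": 19.2252,
    "ggc_set_mark": 8.4101,
    "comptypes": 8.0860,
    "push_to_top_level": 3.7763,
    "ggc_alloc_stat": 2.5480,
}

# Assumption: any symbol with "ggc" in its name belongs to the collector.
gc_total = sum(share for name, share in refill_share.items() if "ggc" in name)
others = {name: share for name, share in refill_share.items() if "ggc" not in name}
top_other = max(others, key=others.get)

print(f"GC symbols: {gc_total:.2f}% of refills")
print(f"next largest: {top_other} at {others[top_other]:.2f}%")
```

Under that grouping the GC routines account for roughly 30% of the refills from system, against about 8% for comptypes.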
--
Karel Gardas [EMAIL PROTECTED]
ObjectSecurity Ltd. http://www.objectsecurity.com