On Wed, Aug 27, 2014 at 12:16 PM, Ajit Kumar Agarwal
<ajit.kumar.agar...@xilinx.com> wrote:
> The xmalloc calls listed below in the register allocator are not caused
> only by the structure and by changing the S passed as template argument.
> It depends on how the structure below is referenced or used. From the stack
> trace I can see that live range creation depends on how the structure
> is referenced and used.

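For context, a rough sketch of the kind of usage in question. This is not the real code, which cannot be disclosed: the structure is the one from the original message, while `sum_all` and the element type `long` are hypothetical and only illustrate "range-based for over the content, nothing computed from S at compile time".

```cpp
#include <array>
#include <cstddef>
#include <list>

// The structure from the original message.
template <class T, std::size_t S>
struct Struct
{
    typedef std::array<T, S> Chunk;
    typedef std::list<Chunk> Content;

    Content c;
};

// Hypothetical usage: plain range-based for loops over the content.
// S only determines the chunk size; no metaprogramming depends on it.
template <class T, std::size_t S>
T sum_all(const Struct<T, S>& s)
{
    T total = T();
    for (const auto& chunk : s.c)
        for (const auto& value : chunk)
            total += value;
    return total;
}

// The same code instantiated for the two sizes compared in the report.
template long sum_all<long, 10u>(const Struct<long, 10u>&);
template long sum_all<long, 11u>(const Struct<long, 11u>&);
```

Built with -std=c++11, the two explicit instantiations differ only in S, which is the situation the xmalloc counts compare.
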
Could you please show me an example of such different usages and references?

>
> Thanks & Regards
> Ajit
>
> -----Original Message-----
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
> Daniel Gutson
> Sent: Wednesday, August 27, 2014 7:58 PM
> To: gcc Mailing List
> Subject: Possible LRA issue?
>
> Hi,
>
> I have a large codebase where, at some point, there's a structure that
> takes an unsigned integer template argument and uses it as the size of an
> array, something like
>
> template <class T, size_t S>
> struct Struct
> {
>     typedef std::array<T, S> Chunk;
>     typedef std::list<Chunk> Content;
>
>     Content c;
> };
>
> Changing the value of S significantly alters the compile time and the
> memory that the compiler takes. We use some large numbers there.
> At some point, the compiler runs out of memory (xmalloc fails). I wondered
> why, and did some analysis by debugging GCC 4.8.2 (same with 4.8.3). I did
> the following experiment with all optimizations turned off (-fno-* and -O0):
> I generated a report of xmalloc usage for two programs, one having S=10u
> and another with S=11u, just to see the effect of a difference of 1.
> The report was generated as follows: I set a breakpoint at xmalloc that
> appends a bt to a file. Then I found the common stack traces and counted how
> many times xmalloc was called in each version of the program (S=10u and
> S=11u as mentioned above).
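
A minimal sketch of the counting step described above, in case someone wants to repeat the experiment. It is not necessarily how the original report was produced. It assumes the log holds gdb's default `bt` output (frame lines starting with `#N`, frame `#0` opening each backtrace); `frame_function` and `count_traces` are hypothetical helper names.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Extracts the function name from one gdb frame line, for example
//   "#0  xmalloc (size=8) at file.c:NN"
//   "#1  0x... in pool_alloc (pool=...) at file.c:NN"
// Returns an empty string if the line is not a frame line.
std::string frame_function(const std::string& line)
{
    if (line.empty() || line[0] != '#')
        return std::string();

    std::istringstream in(line);
    std::string token;
    in >> token;                          // frame number, "#N"
    if (!(in >> token))
        return std::string();
    if (token.compare(0, 2, "0x") == 0) { // optional "0x... in" prefix
        in >> token;                      // skip "in"
        if (!(in >> token))
            return std::string();
    }
    return token.substr(0, token.find('('));
}

// Groups the frames of each backtrace into one "f0 | f1 | ..." key and
// counts how many times each distinct backtrace occurs in the log.
std::map<std::string, std::size_t> count_traces(std::istream& log)
{
    std::map<std::string, std::size_t> counts;
    std::string line, trace;
    while (std::getline(log, line)) {
        const std::string fn = frame_function(line);
        if (fn.empty())
            continue;                     // not a frame line, ignore
        if (line.compare(0, 3, "#0 ") == 0 && !trace.empty()) {
            ++counts[trace];              // frame #0 starts a new backtrace
            trace.clear();
        }
        if (!trace.empty())
            trace += " | ";
        trace += fn;
    }
    if (!trace.empty())
        ++counts[trace];                  // last backtrace in the log
    return counts;
}
```

Running `count_traces` on the S=10u log and on the S=11u log and comparing the two maps key by key gives a per-trace difference of the kind listed in the report.
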
> The differences were:
>
> a) Stack trace:
> xmalloc | pool_alloc | create_live_range | mark_pseudo_live |
> mark_regno_live | process_bb_lives | lra_create_live_ranges | lra |
> do_reload | rest_of_handle_reload | execute_one_pass | execute_pass_list |
> execute_pass_list | expand_function | output_in_order | compile |
> finalize_compilation_unit | cp_write_global_declarations | compile_file |
> do_compile | toplev_main | __libc_start_main | _start |
>
> S=10u: 15 times
> S=11u: 16 times
>
> b) Stack trace:
> xmalloc | lra_set_insn_recog_data | lra_get_insn_recog_data |
> lra_update_insn_regno_info | lra_update_insn_regno_info |
> lra_push_insn_1 | lra_push_insn | push_insns | lra_process_new_insns |
> curr_insn_transform | lra_constraints | lra | do_reload |
> rest_of_handle_reload | execute_one_pass | execute_pass_list |
> execute_pass_list | expand_function | output_in_order | compile |
> finalize_compilation_unit | cp_write_global_declarations | compile_file |
> do_compile | toplev_main | __libc_start_main | _start |
>
> S=10u: 186 times
> S=11u: 192 times
>
> c) Stack trace:
> xmalloc | df_install_refs | df_refs_add_to_chains | df_insn_rescan |
> emit_insn_after_1 | emit_pattern_after_noloc | emit_pattern_after_setloc |
> emit_insn_after_setloc | try_split | split_insn | split_all_insns |
> rest_of_handle_split_after_reload | execute_one_pass | execute_pass_list |
> execute_pass_list | execute_pass_list | expand_function | output_in_order |
> compile | finalize_compilation_unit | cp_write_global_declarations |
> compile_file | do_compile | toplev_main | __libc_start_main | _start |
>
> S=10u: 617 times
> S=11u: 619 times
>
> d) Stack trace:
> xmalloc | df_install_refs | df_refs_add_to_chains | df_bb_refs_record |
> df_scan_blocks | rest_of_handle_df_initialize | execute_one_pass |
> execute_pass_list | execute_pass_list | expand_function | output_in_order |
> compile | finalize_compilation_unit | cp_write_global_declarations |
> compile_file | do_compile | toplev_main | __libc_start_main | _start |
>
> S=10u: 13223 times
> S=11u: 13227 times
>
> e) Stack trace:
> xmalloc | __GI__obstack_newchunk | bitmap_element_allocate |
> bitmap_set_bit | update_lives | assign_hard_regno | assign_by_spills |
> lra_assign | lra | do_reload | rest_of_handle_reload | execute_one_pass |
> execute_pass_list | execute_pass_list | expand_function | output_in_order |
> compile | finalize_compilation_unit | cp_write_global_declarations |
> compile_file | do_compile | toplev_main | __libc_start_main | _start |
>
> S=10u: 0 times (never!)
> S=11u: 1 time
>
> Unfortunately I can't disclose the source code, nor do I have the time to
> isolate a piece of code reproducing the issue.
> Some comments about the code: I don't do template metaprogramming depending
> on S, but I do some for-range on the Content.
>
> I can extend the analysis to S=12 and compare with the previous values.
> I thought of fixing this myself, but I lack the time and the background on
> these optimizations. Any hint?
> I'm open to doing more experiments if anybody asks me, or to post -fdumps.
>
> I suspect that this issue could be worked around by playing with
> ggc-min-heapsize and similar values, but I'd like to know why just changing
> the size of an array has such a consequence.
>
> Thanks!
>
>    Daniel.
>
> --
>
> Daniel F. Gutson
> Chief Engineering Officer, SPD
>
> San Lorenzo 47, 3rd Floor, Office 5
> Córdoba, Argentina
>
> Phone: +54 351 4217888 / +54 351 4218211
> Skype: dgutson

--
Daniel F. Gutson
Chief Engineering Officer, SPD

San Lorenzo 47, 3rd Floor, Office 5
Córdoba, Argentina

Phone: +54 351 4217888 / +54 351 4218211
Skype: dgutson