> > Well, there is still fallout from this change so I'm not convinced > this will stay > this way. Also we stream the function-decl that refers to these fields in
As far as I know there are two problems 1) problem with the thunk expansion not getting DECL_ARGUMENTS/DECL_RESULT addressed by this patch 2) ICE in gcc.dg/torture/pr8081.c Here the problem is in variable type of return value that gets streamed twice and we end up with duplicate. As mentioned earlier, I want to handle this by introducing variable_sized_function_type_p that will make those go specially the old way (with DECL_ARGUMENTS/DECL_RESULT in the global stream). I did not send the patch, because I think the variable sized parameters were handled incorrectly before my changes, too, and the fix is not so trivial. It just makes us not ICE the same way as before. The testcase is: /* { dg-do run } */ extern void abort (void); int main (int argc, char **argv) { int size = 10; typedef struct { char val[size]; } block; block a, b; block __attribute__((noinline)) retframe_block () { return *(block *) &b; } b.val[0] = 1; b.val[9] = 2; a=retframe_block (); if (a.val[0] != 1 || a.val[9] != 2) abort (); return 0; } So here the size is not parameter of the function, it is a local vairable (later turned into structure) that gets into local function stream. The type is: <record_type 0x7ffff7511930 block sizes-gimplified type_0 type_1 BLK size <var_decl 0x7ffff752f428 D.1738 type <integer_type 0x7ffff740d0a8 bitsizetype public unsigned TI size <integer_cst 0x7ffff73f7fc0 constant 128> unit size <integer_cst 0x7ffff73f7fe0 constant 16> align 128 symtab 0 alias set -1 canonical type 0x7ffff740d0a8 precision 128 min <integer_cst 0x7ffff740f000 0> max <integer_cst 0x7ffff73f7fa0 -1>> used unsigned ignored TI file t.c line 8 col 19 size <integer_cst 0x7ffff73f7fc0 128> unit size <integer_cst 0x7ffff73f7fe0 16> align 128 context <function_decl 0x7ffff7510f00 main> chain <var_decl 0x7ffff752f4c0 D.1739 type <integer_type 0x7ffff740d738 long int> used ignored DI file t.c line 10 col 20 size <integer_cst 0x7ffff73f7f40 constant 64> unit size <integer_cst 0x7ffff73f7f60 constant 8> align 64 context <function_decl 0x7ffff7510f00 main> chain <var_decl 0x7ffff752f558 D.1740>>> unit size <var_decl 0x7ffff752f2f8 D.1736 type <integer_type 0x7ffff740d000 sizetype public unsigned DI size <integer_cst 0x7ffff73f7f40 64> unit size <integer_cst 0x7ffff73f7f60 8> align 64 symtab 0 alias set -1 canonical type 0x7ffff740d000 precision 64 min <integer_cst 0x7ffff73f7f80 0> max <integer_cst 0x7ffff73f7000 18446744073709551615>> used unsigned ignored DI file t.c line 8 col 19 size <integer_cst 0x7ffff73f7f40 64> unit size <integer_cst 0x7ffff73f7f60 8> align 64 context <function_decl 0x7ffff7510f00 main> chain <var_decl 0x7ffff752f390 D.1737 type <integer_type 0x7ffff740d0a8 bitsizetype> used unsigned ignored TI file t.c line 8 col 19 size <integer_cst 0x7ffff73f7fc0 128> unit size <integer_cst 0x7ffff73f7fe0 16> align 128 context <function_decl 0x7ffff7510f00 main> chain <var_decl 0x7ffff752f428 D.1738>>> align 8 symtab 0 alias set -1 canonical type 0x7ffff7511738 fields <field_decl 0x7ffff752f098 val type <array_type 0x7ffff7511888 type <integer_type 0x7ffff740d3f0 char> sizes-gimplified type_1 BLK size <var_decl 0x7ffff752f428 D.1738> unit size <var_decl 0x7ffff752f2f8 D.1736> align 8 symtab 0 alias set -1 structural equality domain <integer_type 0x7ffff75117e0>> decl_0 BLK file t.c line 10 col 20 size <var_decl 0x7ffff752f428 D.1738> unit size <var_decl 0x7ffff752f2f8 D.1736> align 8 offset_align 128 offset <integer_cst 0x7ffff73f7f80 constant 0> bit offset <integer_cst 0x7ffff740f000 constant 0> context <record_type 0x7ffff7511738>> context <function_decl 0x7ffff7510f00 main> pointer_to_this <pointer_type 0x7ffff7511bd0> chain <type_decl 0x7ffff7530000 D.1726>> My understanding is that we end up with silently putting automatic variable D.1738 into global stream and we happen to not explode later. At stream in time we will have duplicated D.1738 parameters and it is only becuase the one in TYPE_SIZE is unused making us to not ICE. I also do not see how we can produce valid debug info for the nested function. Only scenario the patch will solve is where the size of type is given by the parm_decl alone, since parm_decl will go to global stream. If the size of array is something like parm_decl * 4, i think gimplifier will put in an temporary re-introducing the problem, but I did not double check. > the global section which means we have a layering violation. Which means > DECL_ARGUMENTS and DECL_RESULT should be moved to struct function? > Of course frontends may not be happy with that (given DECL_ARGUMENTS > is also used in function declarations) It is the same as DECL_INITIAL that is already considered to be part of function body by LTO, but it is not in DECL_STRUCT_FUNCTION because the way frontends builds it. I think we can modify frotnends to put those into function bodies, but it is bit harder. > > Please consider reverting these changes (at least the DECL_ARGUMENTS one). It is certainly more involved change than I hoped for. I am fine with a) fixing the two issues and getting it work b) reverting it and trying to get it working off-tree c) declaring the approach flawed and giving up. The change brings noticeable improvements for firefox memory footprint. Two important things has changed since then however 1) I now remove DECL_ARGUMENTS/DECL_RESULT for external declarations that is ortoghonal to this problem. 2) We now have 600000 instead of 3700000 function bodies to deal with at WPA stage as a result of early virtual function removal. If you are interested, I can get memory/streaming time stats for the change on the current mainline. > > > cgraph_get_body is > > there to load it for you when you need it. We are going to expand the > > function > > so it makes sense to get it. > > Then all DECL_ARGUMENTS/DECL_RESULT users need to be audited, no? Yes, I went through this and explained my findings in the original email (by greping DECL_ARGUMENT/DECL_RESULT and trying to understand for all the uses). Local gimple passes use them, but those are safe. For non-local things it is primarily dwarf2out that needs them also for abstract origins of functions where I explicitely introduced the abstract origin tracking for (there is already bug with DECL_INITIAL not being there) and this early cgraph machinery - thunks and clone matherialization. I tought I fixed thunks, but I pugged in only the return value issue (that is easy since DECL_RESULT is not always provided by the FE) Honza > > Richard. > > > The same is done by the passmanager when function is going to be expanded. > > Only > > difference here is that thunks do not go through the passmanager. > > > > I can drop in_lto_p (the function does nothing when body is already there) > > > > Honza