Re: About "STARTING_FRAME_OFFSET" definition
Hi,

Thanks! I also found the documentation describing "HARD_FRAME_POINTER_REGNUM" and "FRAME_POINTER_REGNUM" in "gcc internals", chapter "Registers That Address the Stack Frame". It really is the "usual" way to handle cases like this.

redriver

2010/3/24 Richard Henderson :
> On 03/23/2010 05:55 AM, redriver jiang wrote:
>> Hi all,
>>
>> Can this "STARTING_FRAME_OFFSET" macro be defined to be a non-constant
>> value (one that changes with the "current_function_args_size")?
>>
>> The target processor uses "FP+offset" with a positive "offset" (the stack
>> grows upward, and parameters on the stack grow downward). For example,
>>
>> call foo( arg1, arg2, arg3, arg4), after foo's prologue, the stack is like
>> this:
>>
>>   < low address >
>>   |------------------|
>>   | Incoming arg4    | <-FP
>>   |------------------|
>>   | Incoming arg3    |
>>   |------------------|
>>   | Incoming arg2    |
>>   |------------------|
>>   | Incoming arg1    | <---ARG
>>   |------------------|
>>   | return PC of foo |
>>   |------------------|
>>   | saved regs       |
>>   |------------------|
>>   | old FP           |
>>   |------------------|
>>   | local var0       |
>>   |------------------|
>>   < high address >
>>
>> "STARTING_FRAME_OFFSET" means the offset between FP and the first
>> local variable. In this situation,
>>
>> STARTING_FRAME_OFFSET = current_function_args_size + size(PC in stack) +
>> size(saved regs) + size(old FP).
>>
>> So "STARTING_FRAME_OFFSET" depends on the
>> "current_function_args_size", which is a GCC internal variable.
>>
>> Is this stack layout suitable?
>
> It's possible to create this stack layout, yes.
>
> STARTING_FRAME_OFFSET ought not really enter into it, I don't think.
>
> What you'll want instead is to have a separate "soft" frame_pointer_rtx
> and hard_frame_pointer_rtx. Then during register allocation you eliminate
> from the soft frame pointer to the hard frame pointer with an offset you
> calculate at that point. There are many examples of this in existing ports,
> including the i386 port.
>
> The reason why you want to handle this via elimination rather than a fixed
> offset during initial rtl generation is your "saved regs" field there, which
> of course will vary in size depending on what registers get spilled.
>
> So I would begin with STARTING_FRAME_OFFSET=0 and have the soft frame pointer
> point to "local var0" in your picture. Then your INITIAL_ELIMINATION_OFFSET
> function would map:
>
>   ARG_POINTER_REGNUM -> HARD_FRAME_POINTER_REGNUM
>     = -current_function_args_size
>
>   FRAME_POINTER_REGNUM -> HARD_FRAME_POINTER_REGNUM
>     = -(sizeof(saved_regs) + sizeof(FP) + sizeof(return PC) +
>         current_function_args_size)
>
> r~
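
The mapping Richard describes can be sketched as a small standalone C function. This is only an illustration under stated assumptions, not code from any real port: the register numbers, the frame_layout struct and its field names are all hypothetical, and the signs are copied from the message above rather than checked against a particular target.

```c
#include <assert.h>

/* Hypothetical register numbers; a real port defines its own values
   in its target header.  */
enum {
  HARD_FRAME_POINTER_REGNUM = 29,
  FRAME_POINTER_REGNUM = 30,      /* the "soft" frame pointer */
  ARG_POINTER_REGNUM = 31
};

/* Hypothetical summary of the frame, in bytes.  The saved-register size
   is only known once register allocation has decided what to spill,
   which is why the offsets are computed at elimination time.  */
struct frame_layout {
  long args_size;         /* stands in for current_function_args_size */
  long return_pc_size;
  long saved_regs_size;
  long old_fp_size;
};

/* Offset used when replacing FROM by TO, following the two mappings
   given in the message.  Only elimination to the hard frame pointer
   is modelled here.  */
long
initial_elimination_offset (int from, int to, const struct frame_layout *f)
{
  assert (f && to == HARD_FRAME_POINTER_REGNUM);

  if (from == ARG_POINTER_REGNUM)
    return -f->args_size;

  if (from == FRAME_POINTER_REGNUM)
    return -(f->saved_regs_size + f->old_fp_size
             + f->return_pc_size + f->args_size);

  assert (0 && "unsupported elimination pair");
  return 0;
}
```

With a made-up frame of 16 bytes of arguments, a 4-byte return PC, 8 bytes of saved registers and a 4-byte old FP, the function gives -16 for the argument pointer and -32 for the soft frame pointer. In a real port the same computation would sit behind the INITIAL_ELIMINATION_OFFSET macro.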
Re: BB reorder forced off for -Os
On 23 Mar 2010, at 22:30, Steven Bosscher wrote:
> On Tue, Mar 23, 2010 at 7:05 PM, Ian Bolton wrote:
>> Is there any reason why BB reorder has been disabled
>> in bb-reorder.c for -Os, such that you can't even
>> turn it on with -freorder-blocks?
>
> No, you should have the option to turn it on if you wish to do so. If
> that is not possible, I consider this a bug. If you open a PR and
> assign it to me, I'll look into it.

We're not able to enable BB reordering with -Os. The behaviour is hard-coded via this if statement in rest_of_handle_reorder_blocks():

    if ((flag_reorder_blocks || flag_reorder_blocks_and_partition)
        /* Don't reorder blocks when optimizing for size because extra jump
           insns may be created; also barrier may create extra padding.

           More correctly we should have a block reordering mode that tried
           to minimize the combined size of all the jumps.  This would more
           or less automatically remove extra jumps, but would also try to
           use more short jumps instead of long jumps.  */
        && optimize_function_for_speed_p (cfun))
      {
        reorder_basic_blocks ();

If you comment out the "&& optimize_function_for_speed_p (cfun)" then BB reordering takes place as desired (although this obviously isn't a solution).

In a private message Ian indicated that this had a small impact on code size for the ISA he's working with but gave a significant performance gain. I tried the same thing with the ISA I work on (Ubicom32): the change typically increased code size by between 0.1% and 0.3% but improved performance by anything from 0.8% to 3%, so on balance this is definitely a win for most of our users (measured on a couple of benchmarks, the Linux kernel, busybox and smbd).

Regards,
Dave
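
To make the effect of that condition concrete, here is a minimal standalone model of the guard. It is a sketch, not GCC code: the reorder_options struct and reorder_will_run() are hypothetical names, and the optimize_for_speed field merely stands in for the result of optimize_function_for_speed_p (cfun).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the guard consults.  */
struct reorder_options {
  bool flag_reorder_blocks;                /* -freorder-blocks */
  bool flag_reorder_blocks_and_partition;  /* -freorder-blocks-and-partition */
  bool optimize_for_speed;                 /* false when compiling with -Os */
};

/* Mirrors the quoted condition: a reordering flag must be set AND the
   function must be optimized for speed.  */
bool
reorder_will_run (const struct reorder_options *o)
{
  assert (o);
  return (o->flag_reorder_blocks || o->flag_reorder_blocks_and_partition)
         && o->optimize_for_speed;
}
```

With flag_reorder_blocks set and optimize_for_speed false (the -Os -freorder-blocks case), reorder_will_run() returns false, which is the behaviour described above: the explicit flag cannot override the size check.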
Question about -Os handling of fold_builtin_strcpy()
Is there any obvious reason why this function doesn't enable folding if we're optimizing for size? All of the other string/memory folds appear to rely on the various move, clear or set ratios.

When optimizing small string copies it's definitely smaller to do this inline on some ISAs.

Thanks,
Dave
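
For readers unfamiliar with the fold being discussed, the sketch below shows the source-level equivalent, assuming (this is my understanding, not something stated in the message) that the fold rewrites a strcpy from a string of known length into a memcpy of length + 1 bytes. The two function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* What the programmer wrote: a library call that has to scan the
   source for its terminating NUL.  */
char *
copy_with_strcpy (char *dst)
{
  assert (dst);
  return strcpy (dst, "abc");
}

/* What the folded form amounts to: the length is known at compile
   time, so a fixed-size copy of strlen ("abc") + 1 = 4 bytes does the
   same job and can be expanded inline as a few moves.  */
char *
copy_with_memcpy (char *dst)
{
  static const char src[] = "abc";
  assert (dst);
  return memcpy (dst, src, sizeof src);
}
```

Both functions leave the same four bytes in dst. Whether the inline expansion is actually smaller than the call depends on the ISA and the string length, which is the point of the question.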
Problem with ADDR_EXPR array offset
Hi everybody!

I'm working on a pass and I need to handle some pointer expressions.
For example, I have this C-code:

    int v[1000];
    int *p;
    p = &v[10];

The problem is that, when I'm trying to parse "p = &v[10];", I'm not able to get the array offset (10 in this case).

Namely, for a given statement like "p = &v[10]", I need to get:
  array: v (I can do that)
  offset: 10

Here is the code I am working on:

    op0 = gimple_op (stmt, 0);
    op1 = gimple_op (stmt, 1);

    if (gimple_assign_rhs_code (stmt) == ADDR_EXPR)
      {
        base = get_base_address (TREE_OPERAND (op1, 0)); // the array v, OK
        offset = // ???

How can I do that?

Max
Re: Problem with ADDR_EXPR array offset
Massimo Nazaria writes:

> I'm working on a pass and I need to handle some pointer expressions.
> For example, I have this C-code:
>   int v[1000];
>   int *p;
>   p = &v[10];
>
> The problem is that, when I'm trying to parse "p = &v[10];", I'm not able to
> get the array offset (10 in this case).
>
> Namely, for a given statement like "p = &v[10]", I need to get:
>   array: v (I can do that)
>   offset: 10
>
> Here is the code I am working on:
>   op0 = gimple_op (stmt, 0);
>   op1 = gimple_op (stmt, 1);
>
>   if (gimple_assign_rhs_code (stmt) == ADDR_EXPR)
>     {
>       base = get_base_address (TREE_OPERAND (op1, 0)); // the array v, OK
>       offset = // ???
>
> How can I do?

There is nothing wrong with calling get_base_address, but by doing that you have thrown away the offset information. I would expect TREE_OPERAND (op1, 0) to be an ARRAY_REF. Operand 0 of that will be an array, operand 1 will be an index.

I would recommend taking a look at get_inner_reference rather than get_base_address.

Ian
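
The shape Ian describes (an ADDR_EXPR whose operand is an ARRAY_REF with the array as operand 0 and the index as operand 1) can be illustrated with a toy model. Everything below is a hypothetical stand-in so that the example compiles on its own: toy_tree, the TOY_* codes and split_array_address() are not part of GCC, and a real pass would use TREE_CODE and TREE_OPERAND on real tree nodes instead.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a few GCC tree codes; NOT the real enum.  */
enum toy_code { TOY_VAR_DECL, TOY_INTEGER_CST, TOY_ARRAY_REF, TOY_ADDR_EXPR };

/* Toy node: op[i] plays the role of TREE_OPERAND (t, i).  */
struct toy_tree {
  enum toy_code code;
  const struct toy_tree *op[2];
  const char *name;   /* meaningful for TOY_VAR_DECL only */
  long value;         /* meaningful for TOY_INTEGER_CST only */
};

/* Given the toy equivalent of "&v[10]", store the array and index
   nodes and return 1.  Return 0 for any other shape.  */
int
split_array_address (const struct toy_tree *addr,
                     const struct toy_tree **array,
                     const struct toy_tree **index)
{
  const struct toy_tree *ref;

  assert (addr && array && index);

  if (addr->code != TOY_ADDR_EXPR)
    return 0;

  ref = addr->op[0];                 /* what the address is taken of */
  if (ref == NULL || ref->code != TOY_ARRAY_REF)
    return 0;

  *array = ref->op[0];               /* operand 0: the array */
  *index = ref->op[1];               /* operand 1: the index */
  return 1;
}
```

Building the four nodes for &v[10] (a VAR_DECL named "v", an INTEGER_CST of 10, an ARRAY_REF over the two, and an ADDR_EXPR over that) and passing the outermost node yields the "v" node as the array and the node holding 10 as the index. This covers only the direct constant-index case; nested or non-array references are where get_inner_reference becomes the better tool.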