On Fri, Nov 22, 2013 at 2:00 PM, Yufeng Zhang <yufeng.zh...@arm.com> wrote: > Hi, > > Currently the address legitimization (by calling memory_address_addr_space) > is carried out twice during the RTL expansion of ARRAY_REF, COMPONENT_REF, > etc. when their OFFSET is not NULL. It is done once for the BASE and once > for the summed address in offset_address. This may cause part, if not all, > of the generated BASE RTL to be forced into reg(s), preventing the RTL > generator from carrying out effective re-association across BASE and OFFSET > (via simplify_gen_binary). > > For example, given the following test case: > > typedef int arr_2[20][20]; > void foo (arr_2 a2, int i, int j) > { > a2[i+10][j] = 1; > } > > the RTL code for the BASE (i.e. a2[i+10]) on arm (-mcpu=cortex-a15, -mthumb) > is
TER makes us see *(a2_5(D) + i * 80 + 800)[j_8] here > * before the legitimization of BASE: > > (plus:SI (plus:SI (mult:SI (reg/v:SI 115 [ i ]) > (const_int 80 [0x50])) > (reg/v/f:SI 114 [ a2 ])) > (const_int 800 [0x320])) > (reg/f:SI 122) > > * after the legitimization of BASE: > > (reg/f:SI 122) > > "Thanks to" the initial legitimization, the RTL for the final address is > turned into: > > (plus:SI (mult:SI (reg/v:SI 116 [ j ]) > (const_int 4 [0x4])) > (reg/f:SI 122)) > > while with the legitimization deferred, the RTL for the final address could > be: > > (plus:SI (plus:SI (plus:SI (mult:SI (reg/v:SI 115 [ i ]) > (const_int 80 [0x50])) > (mult:SI (reg/v:SI 116 [ j ]) > (const_int 4 [0x4]))) > (reg/v/f:SI 114 [ a2 ])) > (const_int 800 [0x320])) > > which has more complete info in the RTL and is much more canonicalized; > later on it could open up more opportunities for CSE. > > The effect of this duplicated legitimization effort varies across different > targets, as it is strongly related to the available addressing modes on a > target. On RISC machines where in general there are fewer addressing modes > (which are in general less complicated as well), the RTL code quality can be > affected more adversely. > > The patch passes bootstrapping on arm and x86_64 and regtest on > arm-none-eabi, aarch64-none-elf and x86_64. There is no regression in > spec2000 on arm or x86_64. > > OK for the mainline? But this patch makes the path through expansion even harder to follow ... :/ I wonder if something along a "validate pass" after expansion would be a better solution (and thus never validate during expansion itself). That is, expand is a beast - don't add to it, try to simplify it instead ;) Thanks, Richard. > Thanks, > Yufeng > > > gcc/ > > * cfgexpand.c (expand_call_stmt): Update the call to > expand_expr_real_1. > * expr.c (expand_assignment): Add new local variable validate_p and > set > it; call expand_expr or expand_expr_nv depending on validate_p for > to_rtx; call adjust_address_1 instead of adjust_address. > (store_expr): Update the call to expand_expr_real. > (expand_expr_real): Add new parameter 'validate_p'; update the call > to > expand_expr_real_1. > (expand_expr_real_1): Add new parameter 'validate_p'; update the > call > to expand_expr_real; depending on validate_p, call > memory_address_addr_space or convert_memory_address_addr_space; > likewise for expand_expr or expand_expr_nv; call adjust_address_1 > instead of adjust_address. > * expr.h (expand_expr_real): Update. > (expand_expr_real_1): Update. > (expand_expr): Update. > (expand_expr_nv): New function. > (expand_normal): Update.