> On Aug 19, 2021, at 8:54 AM, Qing Zhao via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > >>>>> Breakpoint 1, expand_DEFERRED_INIT (stmt=0x7fffe96ae348) at >>>>> ../../latest-gcc/gcc/internal-fn.c:3021 >>>>> 3021 mark_addressable (lhs); >>>>> (gdb) call debug_tree(lhs) >>>>> <ssa_name 0x7fffe9584e58 >>>>> type <real_type 0x7fffe959b2a0 float sizes-gimplified SF >>>>> size <integer_cst 0x7fffe9579f48 constant 32> >>>>> unit-size <integer_cst 0x7fffe9579f60 constant 4> >>>>> align:32 warn_if_not_align:0 symtab:0 alias-set 2 canonical-type >>>>> 0x7fffe959b2a0 precision:32 >>>>> pointer_to_this <pointer_type 0x7fffe959b7e0>> >>>>> visited var <var_decl 0x7ffff7ff7bd0 temp1> >>>>> def_stmt temp1_5 = .DEFERRED_INIT (4, 2, 0, &"temp1"[0]); >>>>> version:5> >>>>> >>>>> when I deleted: >>>>> >>>>> if (TREE_CODE (lhs) == SSA_NAME >>>>> lhs = SSA_NAME_VAR (lhs); >>>> >>>> but then using SSA_NAME_VAR is broken. I suspect use_register_for_decl >>>> isn't the correct thing to look at. I think we need to look at what >>>> the LHS expanded to if it is a SSA_VAR_P (that includes SSA names >>>> but also plain DECLs but not what we get from VLAs where we'd see >>>> *ptr). So sth like >>>> >>>> bool reg_lhs; >>>> if (SSA_VAR_P (lhs)) >>>> { >>>> rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); >>>> reg_lhs = !MEM_P (tem); >>>> /* If not MEM_P reg_lhs should be REG_P or SUBREG_P (but maybe >>>> also CONCAT or lowpart...?) */ >>>> } >>>> else >>>> { >>>> gcc_assert (is_vla); >>>> reg_lhs = false; >>>> } >>>> >>>> if (!reg_lhs) >>>> memset path >>>> else >>>> expand_assignment path >>> >>> After making the following change: >>> >>> + bool reg_lhs = true; >>> >>> tree var_type = TREE_TYPE (lhs); >>> gcc_assert (init_type > AUTO_INIT_UNINITIALIZED); >>> >>> - if (is_vla || (!use_register_for_decl (lhs))) >>> + if (SSA_VAR_P (lhs)) >>> + { >>> + rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); >>> + reg_lhs = !MEM_P (tem); >>> + } >>> + else >>> + { >>> + gcc_assert (is_vla); >>> + reg_lhs = false; >>> + } >>> + >>> + if (!reg_lhs) >>> { >>> >>> I got exactly the same internal error that failed at expr.c: >>> >>> 8436 /* We must have made progress. */ >>> 8437 gcc_assert (inner != exp); >>> >>> >>> Looks like for the following code: >>> >>> 3026 if (!reg_lhs) >>> 3027 { >>> 3028 /* If this is a VLA or the variable is not in register, >>> 3029 expand to a memset to initialize it. */ >>> 3030 mark_addressable (lhs); >>> 3031 tree var_addr = build_fold_addr_expr (lhs); >>> 3032 >>> 3033 tree value = (init_type == AUTO_INIT_PATTERN) ? >>> 3034 build_int_cst (integer_type_node, >>> 3035 INIT_PATTERN_VALUE) : >>> 3036 integer_zero_node; >>> 3037 tree m_call = build_call_expr (builtin_decl_implicit >>> (BUILT_IN_MEMSET), >>> 3038 3, var_addr, value, var_size); >>> 3039 /* Expand this memset call. */ >>> 3040 expand_builtin_memset (m_call, NULL_RTX, TYPE_MODE (var_type)); >>> 3041 } >>> >>> At line 3030, “lhs” could be a SSA_NAME. >>> >>> My questions are: >>> >>> 1. Could the routine “mark_addressable” and “build_fold_addr_expr” be >>> applied on SSA_NAME? >> >> No. >> >>> 2. Could the routine “expand_builtin_memset” be applied on the memset call >>> whose “DEST” is >>> an address expression on SSA_NAME? >> >> No. >> >>> 3. Within “expand_DEFERRED_INIT”, can I call “expand_builtin_memset” to >>> expand .DEFERRED_INIT? >> >> Well, not with "invalid" GENERIC I fear (address of a SSA name). >> >>> I suspect that one of the above 3 might be the issue, but not sure which >>> one? >> >> All of the above ;) So while reg_lhs is now precise as to how the >> variable will end up (the SSA name will end up as a stack variable in this >> case, for whatever reason), expansion via memcpy only works when >> working on the RTL representation. The usual "workaround" (ugh) >> is to use make_tree (), so in the !reg_lhs path you'd do >> >> /* Get a new GENERIC representation for the RTL. That's necesary >> in case LHS is an SSA name. */ >> lhs = make_tree (TREE_TYPE (lhs), tem); > > This resolved the issue.
However, there was two runtime errors with CPU2017 with this change. I guess that there might be some issue with “expand_expr” + “make_tree” here. So, I changed the code to use the alternative solution that you suggested: > >> >> alternatively you could maybe do >> >> if (DECL_P (lhs)) >> { >> + rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); >> + reg_lhs = !MEM_P (tem); >> } >> else if (TREE_CODE (lhs) == SSA_NAME) >> reg_lhs = true; >> else >> reg_lhs = false; >> >> thus treat SSA names as register storage always (even if it will end >> up on the stack). This did resolve all the issues including CPU2017 runtime errors. With this solution, all SSA_NAMEs will be expanded through “expand_assignment” even though they are in stack. the generated code will be correct. But the performance might be a little bit different than through memset. Qing > > My question here, for a complicate structure SSA_NAME, will expanding through > memset better than expand_asssignment? > > Qing >>