> On Aug 17, 2021, at 9:45 AM, Richard Biener <rguent...@suse.de> wrote: > > On Tue, 17 Aug 2021, Qing Zhao wrote: > >> >> >>> On Aug 17, 2021, at 3:43 AM, Richard Biener <rguent...@suse.de> wrote: >>> >>> On Mon, 16 Aug 2021, Qing Zhao wrote: >>> >>>> >>>> >>>>> On Aug 16, 2021, at 2:40 AM, Richard Biener <rguent...@suse.de> wrote: >>>>> >>>>> On Thu, 12 Aug 2021, Qing Zhao wrote: >>>>> >>>>>> Hi, Richard, >>>>>> >>>>>> For RTL expansion of call to .DEFERRED_INIT, I changed my code per your >>>>>> suggestions like following: >>>>>> >>>>>> ====================== >>>>>> #define INIT_PATTERN_VALUE 0xFE >>>>>> static void >>>>>> expand_DEFERRED_INIT (internal_fn, gcall *stmt) >>>>>> { >>>>>> tree lhs = gimple_call_lhs (stmt); >>>>>> tree var_size = gimple_call_arg (stmt, 0); >>>>>> enum auto_init_type init_type >>>>>> = (enum auto_init_type) TREE_INT_CST_LOW (gimple_call_arg (stmt, 1)); >>>>>> bool is_vla = (bool) TREE_INT_CST_LOW (gimple_call_arg (stmt, 2)); >>>>>> >>>>>> tree var_type = TREE_TYPE (lhs); >>>>>> gcc_assert (init_type > AUTO_INIT_UNINITIALIZED); >>>>>> >>>>>> if (is_vla || (!can_native_interpret_type_p (var_type))) >>>>>> { >>>>>> /* If this is a VLA or the type of the variable cannot be natively >>>>>> interpreted, expand to a memset to initialize it. */ >>>>>> if (TREE_CODE (lhs) == SSA_NAME) >>>>>> lhs = SSA_NAME_VAR (lhs); >>>>>> tree var_addr = NULL_TREE; >>>>>> if (is_vla) >>>>>> var_addr = TREE_OPERAND (lhs, 0); >>>>>> else >>>>>> { >>>>>> TREE_ADDRESSABLE (lhs) = 1; >>>>>> var_addr = build_fold_addr_expr (lhs); >>>>>> } >>>>>> tree value = (init_type == AUTO_INIT_PATTERN) ? >>>>>> build_int_cst (unsigned_char_type_node, >>>>>> INIT_PATTERN_VALUE) : >>>>>> build_zero_cst (unsigned_char_type_node); >>>>>> tree m_call = build_call_expr (builtin_decl_implicit >>>>>> (BUILT_IN_MEMSET), >>>>>> 3, var_addr, value, var_size); >>>>>> /* Expand this memset call. */ >>>>>> expand_builtin_memset (m_call, NULL_RTX, TYPE_MODE (var_type)); >>>>>> } >>>>>> else >>>>>> { >>>>>> /* If this is not a VLA and the type of the variable can be natively >>>>>> interpreted, expand to assignment to generate better code. */ >>>>>> tree pattern = NULL_TREE; >>>>>> unsigned HOST_WIDE_INT total_bytes >>>>>> = tree_to_uhwi (TYPE_SIZE_UNIT (var_type)); >>>>>> >>>>>> if (init_type == AUTO_INIT_PATTERN) >>>>>> { >>>>>> unsigned char *buf = (unsigned char *) xmalloc (total_bytes); >>>>>> memset (buf, INIT_PATTERN_VALUE, total_bytes); >>>>>> pattern = native_interpret_expr (var_type, buf, total_bytes); >>>>>> gcc_assert (pattern); >>>>>> } >>>>>> >>>>>> tree init = (init_type == AUTO_INIT_PATTERN) ? >>>>>> pattern : >>>>>> build_zero_cst (var_type); >>>>>> expand_assignment (lhs, init, false); >>>>>> } >>>>>> } >>>>>> =========================== >>>>>> >>>>>> Now, I used “can_native_interpret_type_p (var_type)” instead of >>>>>> “use_register_for_decl (lhs)” to decide >>>>>> whether to use “memset” or use “assign” to expand this function. >>>>>> >>>>>> However, this exposed an bug that is very hard to be addressed: >>>>>> >>>>>> *******For the testing case: test suite/gcc.dg/uninit-I.c: >>>>>> >>>>>> /* { dg-do compile } */ >>>>>> /* { dg-options "-O2 -Wuninitialized" } */ >>>>>> >>>>>> int sys_msgctl (void) >>>>>> { >>>>>> struct { int mode; } setbuf; >>>>>> return setbuf.mode; /* { dg-warning "'setbuf\.mode' is used" } */ >>>>>> == >>>>>> >>>>>> ******the above auto var “setbuf” has “struct” type, which >>>>>> “can_native_interpret_type_p(var_type)” is false, therefore, >>>>>> Expanding this .DEFERRED_INIT call went down the “memset” expansion >>>>>> route. >>>>>> >>>>>> However, this structure type can be fitted into a register, therefore >>>>>> cannot be taken address anymore at this stage, even though I tried: >>>>>> >>>>>> TREE_ADDRESSABLE (lhs) = 1; >>>>>> var_addr = build_fold_addr_expr (lhs); >>>>>> >>>>>> To create an address variable for it, the expansion still failed at >>>>>> expr.c: line 8412: >>>>>> during RTL pass: expand >>>>>> /home/opc/Work/GCC/latest-gcc/gcc/testsuite/gcc.dg/auto-init-uninit-I.c:6:24: >>>>>> internal compiler error: in expand_expr_addr_expr_1, at expr.c:8412 >>>>>> 0xd04104 expand_expr_addr_expr_1 >>>>>> ../../latest-gcc/gcc/expr.c:8412 >>>>>> 0xd04a95 expand_expr_addr_expr >>>>>> ../../latest-gcc/gcc/expr.c:8525 >>>>>> 0xd13592 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, >>>>>> expand_modifier, rtx_def**, bool) >>>>>> ../../latest-gcc/gcc/expr.c:11741 >>>>>> 0xd05142 expand_expr_real(tree_node*, rtx_def*, machine_mode, >>>>>> expand_modifier, rtx_def**, bool) >>>>>> ../../latest-gcc/gcc/expr.c:8713 >>>>>> 0xaed1d3 expand_expr >>>>>> ../../latest-gcc/gcc/expr.h:301 >>>>>> 0xaf0d89 get_memory_rtx >>>>>> ../../latest-gcc/gcc/builtins.c:1370 >>>>>> 0xafb4fb expand_builtin_memset_args >>>>>> ../../latest-gcc/gcc/builtins.c:4102 >>>>>> 0xafacde expand_builtin_memset(tree_node*, rtx_def*, machine_mode) >>>>>> ../../latest-gcc/gcc/builtins.c:3886 >>>>>> 0xe97fb3 expand_DEFERRED_INIT >>>>>> >>>>>> ******That’s the major reason why I chose “use_register_for_decl(lhs)” >>>>>> to decide “memset” expansion or “assign” expansion, “memset” expansion >>>>>> needs to take address of the variable, if the variable has been decided >>>>>> to fit into a register, then its address cannot taken anymore at this >>>>>> stage. >>>>>> >>>>>> ******using “can_native_interpret_type_p” did make the “pattern” >>>>>> generation part much cleaner and simpler, however, looks like it didn’t >>>>>> work correctly. >>>>>> >>>>>> Based on this, I’d like to keep my previous implementation by using >>>>>> “use_register_for_decl” to decide whether to take “memset” expansion or >>>>>> “assign” expansion. >>>>>> Therefore, I might still need to keep the “UGLY” implementation of >>>>>> generatting “pattern” constant for different types? >>>>>> >>>>>> Let me know your opinion on this. >>>>> >>>>> Hmm, I think you can use use_register_for_decl(lhs) to decide to use an >>>>> alternate type to generate the pattern (and feed to >>>>> can_native_interpret_type_p) by using >>>>> lang_hooks.type_for_mode (TYPE_MODE (TREE_TYPE (lhs))). >>>> >>>> Do you mean that the TYPE returned by “lang_hooks.type_for_mode(TYPE_MODE >>>> (TREE_TYPE (lhs))” will always satisfy “can_native_interpret_type_p”? >>>> Even for big structure types? >>> >>> I meant that for use_register_for_decl (lhs) the structures will be >>> always small and the structure type will have a mode that is not BLKmode >>> (but for example DImode for struct { int i; int j; }). >> >> Oh, I see. >> >>> >>>> i.e, >>>> >>>> tree var_type = TREE_TYPE(lhs); >>>> tree pattern = NULL_TREE; >>>> unsigned HOST_WIDE_INT total_bytes >>>> = tree_to_uhwi (TYPE_SIZE_UNIT (var_type)); >>>> >>>> If (use_register_for_decl(lhs)==false) >>>> { >>>> tree alt_type = lang_hooks.type_for_mode(TYPE_MODE(var_type), >>>> TYPE_UNSIGNED(var_type); >>>> If (can_native_interpret_type_p (alt_type)) >>>> { >>>> unsigned char *buf = (unsigned char *) xmalloc (total_bytes); >>>> memset (buf, INIT_PATTERN_VALUE, total_bytes); >>>> pattern = native_interpret_expr (alt_type, buf, total_bytes); >>>> gcc_assert (pattern); >>>> } >>>> else >>>> gcc_unreachable (); >>>> } >>>> >>>> ? >>>> Don’t quite understand here. Please clarify. >>> >>> For !use_register_for_decl you use memset already, but for >>> use_register_for_decl not all types satisfy can_native_interpret_type_p >>> (in particular all struct and union types). But when we use a >>> register for the decl then we can of course directly initialize the >>> register. >> >> So, you mean the following: >> >> If (is_vla || (!use_register_for_decl (lhs))) >> { >> /* expand as memset that is done currently. */ >> } >> else >> { >> tree alt_type = lang_hooks.type_for_mode(TYPE_MODE(var_type), >> TYPE_UNSIGNED(var_type); >> If (can_native_interpret_type_p (alt_type)) > > you can use var_type when it satisfies can_native_interpret_type_p
Yes, then under that situation, no VIEW_CONVERT is needed. > >> { >> unsigned char *buf = (unsigned char *) xmalloc (total_bytes); >> memset (buf, INIT_PATTERN_VALUE, total_bytes); >> pattern = native_interpret_expr (alt_type, buf, total_bytes); >> gcc_assert (pattern); >> } >> else >> gcc_unreachable (); >> >> tree init = (init_type == AUTO_INIT_PATTERN) ? >> pattern : >> build_zero_cst (alt_type); >> >> /* here build VIEW_CONVERT <alt_type> (lhs) = init; >> And then expand it. */ >> } >> >> ? >> >>> As said, it would be much cleaner (and maybe also easier) >>> to then simply expand the RTL directly rather than going through >>> expand_assignment. >> >> Dump questions here: >> >> 1. When building VIEW_CONVERT <alt_type> (lhs) = init and expand it: >> >> Is the following correct: >> >> lhs = build1 (VIEW_CONVERT_EXPR, alt_type, lhs); >> expand_assignment (lhs, init); > > yes, I think so. Okay. > >> >> which utility routines should be used to building the assignment and then >> expand it? >> >> Are the above utility routines correct? > > yes. > >> >>> It's also easier to directly see whether the >>> LHS is a MEM_P or a REG_P but then for the MEM_P case it's a bit >>> more "awkward" to use the easy way of the generic expand code >>> (esp. if we eventually want to emit actual calls to memset - do we?) >> >> If all these are guarded by “use_register_for_decl” already, the lhs should >> be fit into registers, >> Under such situation, I don’t think that we want to emit actual calls to >> memset. That’s too expensive. >> >>> >>>>> You can then >>>>> build the assignment from the pattern as >>>>> >>>>> VIEW_CONVERT <reg-type> (lhs) = pattern_cst; >>>> >>>> What’s the <reg_type> in the above? The type “alt_type” returned by >>>> “lang_hooks.type_for_mode(TREE_TYPE(lhs))? >>>> Do I need to build “MODIFY_EXPR” for the above? >>> >>> Yes, the lang_hook.type_for_mode result and no, you could go through >>> expand_assignment. >> >> Okay. So, still use “expand_assigment” to expand it? > > yes. Thanks a lot. Now, I am clean on this part -:) Qing > >>> >>>> lhs = build1 (VIEW_CONVERT_EXPR, alt_type, lhs); >>>> tree final = build2 (MODIFY_EXPR, TREE_TYPE (alt_type), lhs, pattern); >>>> >>>> Then how to expand this”final"? >>>>> >>>>> note that more RTL-expand-ish would be to simply expand 'lhs' and >>>>> see whether it's a REG_P or a MEM_P and decide based on that. >>>> >>>> You mean that the current RTL expansion will automatically expand LHS to >>>> memset route or assignment route based on whether >>>> LHS is a REG_P or MEM_P? I don’t need to explicitly code for >>>> “expand_builtin_memset” or “expand_assign”? >>> >>> No. But you are inside the expander for the internal function call >>> and this is expected to generate RTL. You can simply generate >>> RTL directly without "faking" new GENERIC calls or assignments and >>> expanding those. >>> >>> But lets not go there for now. >> >> Okay. I see now. >> >> Qing >>> >>> Richard. >> >> > > -- > Richard Biener <rguent...@suse.de> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)