Hi,

PR 92486 shows that DSE, when seeing a "normal" gimple aggregate assignment coming from a C struct assignment and one representing a folded memcpy, can kill the latter and keep in place only the former, which does not copy padding - at least when SRA decides to totally scalarize at least one of the aggregates (when not doing total scalarization, SRA cares about padding).
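To illustrate (a minimal sketch of the kind of code involved, not the exact testcase from the PR), consider:

struct s
{
  char c;	/* on common targets followed by 3 bytes of padding */
  int i;
};

void
f (struct s *p, struct s *q)
{
  /* With a constant size, the memcpy is folded into an aggregate
     assignment which also copies the padding bytes.  */
  __builtin_memcpy (p, q, sizeof (struct s));
  /* The struct assignment makes the previous store dead, but it is not
     required to copy the padding.  */
  *p = *q;
}

DSE removes the memcpy-derived assignment as dead, and when SRA then totally scalarizes the remaining copy, only the fields c and i are transferred and the padding bytes in *p keep whatever they held before.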
SRA would not totally scalarize an aggregate if it saw that the aggregate takes part in a gimple assignment which is a folded memcpy (see how type_changing_p is set in contains_vce_or_bfcref_p), but after the DSE decision described above, no such assignment is left for it to see. I was asked to modify SRA to take padding into account - and to copy it around - when totally scalarizing, which is what the patch below does. I am not very happy about this; I am afraid it will lead to performance regressions, but this has all been discussed (see https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01185.html and https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00218.html).

I tried to alleviate the problem by not only inserting accesses for padding but also by enlarging existing accesses, whenever possible, so that they extend over the padding - the extended access would get copied whenever an aggregate copy in the original IL is replaced with SRA copies, and a BIT_FIELD_REF would be generated to replace a scalar access to a part of the aggregate. I made it work in the sense that the patch passed bootstrap and testing (see the git branch refs/users/jamborm/heads/sra-later_total-bfr-20200127 or look at https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/users/jamborm/heads/sra-later_total-bfr-20200127 if you are interested). However, this approach meant that each such extended replacement which was written to (so all of them) could potentially become only partially assigned to, and therefore had to be marked as addressable and could not become a gimple register, meaning that total scalarization would be creating addressable variables. Detecting such cases is not easy; it would mean introducing yet another type of write flag (written to exactly this access) and propagating that flag across assignment sub-accesses.

So I decided that was not the way to go and instead extend only integer accesses, and that is what the patch below does. Like in the previous attempt, whatever padding cannot be covered by extending an access is covered by extra artificial accesses. As you can see, it adds a little complexity to various places of the pass which are already not trivial, but hopefully it is manageable.
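To make the integer extension concrete (a hand-written C analogy of the intended effect, with made-up names; this is not actual pass output), consider again a struct with padding after its first field. The replacement for the char field is widened to a naturally aligned 32-bit integer that also spans the padding, and the original field type is remembered in the new reg_acc_type field:

struct s { char c; int i; };	/* assume 3 bytes of padding after c */

/* Rough analogy of the replacements for two totally scalarized
   variables src and dst (hypothetical names):  */
static unsigned int src_c_pad, dst_c_pad;   /* c widened over the padding */
static int src_i, dst_i;		    /* i, unchanged */

static void
aggregate_copy_analogy (void)
{
  /* 'dst = src' becomes copies of whole replacements, so the padding
     bytes travel inside the widened integer.  */
  dst_c_pad = src_c_pad;
  dst_i = src_i;
}

static char
scalar_read_analogy (void)
{
  /* A use of src.c itself goes through the remembered narrow type,
     sketched here as a simple conversion from the widened replacement.  */
  return (char) src_c_pad;
}

Padding that cannot be covered by widening a neighbouring access in this way is filled with the extra artificial accesses mentioned above.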
Bootstrapped and tested on x86_64-linux, I'll be curious about the feedback.

Thanks,

Martin


2020-01-27  Martin Jambor  <mjam...@suse.cz>

	PR tree-optimization/92486
	* tree-sra.c: Include langhooks.h.
	(struct access): New fields reg_size and reg_acc_type.
	(dump_access): Print new fields.
	(acc_size): New function.
	(find_access_in_subtree): Use it, new parameter reg.
	(get_var_base_offset_size_access): Pass true to
	find_access_in_subtree.
	(create_access_1): Initialize reg_size.
	(create_artificial_child_access): Likewise.
	(create_total_scalarization_access): Likewise.
	(build_ref_for_model): Do not use model expr if reg_acc_type is
	non-NULL.
	(get_reg_access_replacement): New function.
	(verify_sra_access_forest): Adjust verification for presence of
	extended accesses covering padding.
	(analyze_access_subtree): Undo extension over padding if total
	scalarization failed, set grp_partial_lhs if we are going to introduce
	a partial store to the new replacement, do not ignore holes when
	totally scalarizing.
	(sra_type_for_size): New function.
	(total_scalarization_fill_padding): Likewise.
	(total_should_skip_creating_access): Use it.
	(totally_scalarize_subtree): Likewise.
	(sra_modify_expr): Use get_reg_access_replacement instead of
	get_access_replacement, adjust for reg_acc_type.
	(sra_modify_assign): Likewise.
	(load_assign_lhs_subreplacements): Pass false to
	find_access_in_subtree.

testsuite/
	* gcc.dg/tree-ssa/pr92486.c: New test.
---
 gcc/ChangeLog                           |  32 +++
 gcc/testsuite/ChangeLog                 |   5 +
 gcc/testsuite/gcc.dg/tree-ssa/pr92486.c |  38 +++
 gcc/tree-sra.c                          | 368 +++++++++++++++++++++---
 4 files changed, 396 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr92486.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3541c6638f9..34c60e6f2a3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,35 @@
+2020-01-26  Martin Jambor  <mjam...@suse.cz>
+
+	PR tree-optimization/92486
+	* tree-sra.c: Include langhooks.h.
+	(struct access): New fields reg_size and reg_acc_type.
+	(dump_access): Print new fields.
+	(acc_size): New function.
+	(find_access_in_subtree): Use it, new parameter reg.
+	(get_var_base_offset_size_access): Pass true to
+	find_access_in_subtree.
+	(create_access_1): Initialize reg_size.
+	(create_artificial_child_access): Likewise.
+	(create_total_scalarization_access): Likewise.
+	(build_ref_for_model): Do not use model expr if reg_acc_type is
+	non-NULL.
+	(get_reg_access_replacement): New function.
+	(verify_sra_access_forest): Adjust verification for presence of
+	extended accesses covering padding.
+	(analyze_access_subtree): Undo extension over padding if total
+	scalarization failed, set grp_partial_lhs if we are going to introduce
+	a partial store to the new replacement, do not ignore holes when
+	totally scalarizing.
+	(sra_type_for_size): New function.
+	(total_scalarization_fill_padding): Likewise.
+	(total_should_skip_creating_access): Use it.
+	(totally_scalarize_subtree): Likewise.
+	(sra_modify_expr): Use get_reg_access_replacement instead of
+	get_access_replacement, adjust for reg_acc_type.
+	(sra_modify_assign): Likewise.
+	(load_assign_lhs_subreplacements): Pass false to
+	find_access_in_subtree.
+
 2020-01-27  Martin Liska  <mli...@suse.cz>
 
 	PR driver/91220
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 22a37dd1ab2..7ccb5098224 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2020-01-24  Martin Jambor  <mjam...@suse.cz>
+
+	PR tree-optimization/92486
+	* gcc.dg/tree-ssa/pr92486.c: New test.
+
 2020-01-27  Martin Liska  <mli...@suse.cz>
 
 	PR target/93274
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr92486.c b/gcc/testsuite/gcc.dg/tree-ssa/pr92486.c
new file mode 100644
index 00000000000..77e84241eff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr92486.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+struct s {
+  char c;
+  int i;
+};
+
+__attribute__((noipa))
+void f(struct s *p, struct s *q)
+{
+  struct s w;
+
+  __builtin_memset(&w, 0, sizeof(struct s));
+  w = *q;
+
+  __builtin_memset(p, 0, sizeof(struct s));
+  *p = w;
+}
+
+int main()
+{
+  struct s x;
+  __builtin_memset(&x, 1, sizeof(struct s));
+
+  struct s y;
+  __builtin_memset(&y, 2, sizeof(struct s));
+
+  f(&y, &x);
+
+  for (unsigned char *p = (unsigned char *)&y;
+       p < (unsigned char *)&y + sizeof(struct s);
+       p++)
+    if (*p != 1)
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index ea8594db193..bda342ffdf9 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -99,6 +99,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dbgcnt.h"
 #include "builtins.h"
 #include "tree-sra.h"
+#include "langhooks.h"
 
 
 /* Enumeration of all aggregate reductions we can do.  */
@@ -130,11 +131,16 @@ struct assign_link;
 
 struct access
 {
-  /* Values returned by `get_ref_base_and_extent' for each component reference
-     If EXPR isn't a component reference just set `BASE = EXPR', `OFFSET = 0',
-     `SIZE = TREE_SIZE (TREE_TYPE (expr))'.  */
+  /* Offset, size and base are values returned by `get_ref_base_and_extent' for
+     each component reference.  If EXPR isn't a component reference just set
+     `BASE = EXPR', `OFFSET = 0', `SIZE = TREE_SIZE (TREE_TYPE (expr))'.  */
   HOST_WIDE_INT offset;
   HOST_WIDE_INT size;
+
+  /* If reg_acc_type is non-NULL, this is the size of the actual register type
+     the access represented before it was extended to cover padding.  Otherwise
+     it must be equal to size.  */
+  HOST_WIDE_INT reg_size;
   tree base;
 
   /* Expression.  It is context dependent so do not use it to create new
@@ -144,6 +150,12 @@ struct access
   /* Type.  */
   tree type;
 
+  /* If non-NULL, this is the type that should be actually extracted or
+     inserted from/to the replacement decl when replacing accesses to the
+     individual field itself (as opposed to accesses created as part of
+     replacing aggregate copies which should use TYPE).  */
+  tree reg_acc_type;
+
   /* The statement this access belongs to.  */
   gimple *stmt;
 
@@ -391,10 +403,17 @@ dump_access (FILE *f, struct access *access, bool grp)
   print_generic_expr (f, access->base);
   fprintf (f, "', offset = " HOST_WIDE_INT_PRINT_DEC, access->offset);
   fprintf (f, ", size = " HOST_WIDE_INT_PRINT_DEC, access->size);
+  if (access->reg_size != access->size)
+    fprintf (f, ", reg_size = " HOST_WIDE_INT_PRINT_DEC, access->reg_size);
   fprintf (f, ", expr = ");
   print_generic_expr (f, access->expr);
   fprintf (f, ", type = ");
   print_generic_expr (f, access->type);
+  if (access->reg_acc_type)
+    {
+      fprintf (f, ", reg_acc_type = ");
+      print_generic_expr (f, access->reg_acc_type);
+    }
   fprintf (f, ", reverse = %d", access->reverse);
   if (grp)
     fprintf (f, ", grp_read = %d, grp_write = %d, grp_assignment_read = %d, "
@@ -481,18 +500,27 @@ get_base_access_vector (tree base)
   return base_access_vec->get (base);
 }
 
+/* Return ACCESS's size or reg_size depending on REG.  */
+
+static HOST_WIDE_INT
+acc_size (struct access *access, bool reg)
+{
+  return reg ? access->reg_size : access->size;
+}
+
 /* Find an access with required OFFSET and SIZE in a subtree of accesses rooted
    in ACCESS.  Return NULL if it cannot be found.  */
 
 static struct access *
 find_access_in_subtree (struct access *access, HOST_WIDE_INT offset,
-			HOST_WIDE_INT size)
+			HOST_WIDE_INT size, bool reg)
 {
-  while (access && (access->offset != offset || access->size != size))
+  while (access && (access->offset != offset
+		    || acc_size (access, reg) != size))
     {
       struct access *child = access->first_child;
 
-      while (child && (child->offset + child->size <= offset))
+      while (child && (child->offset + acc_size (child, reg) <= offset))
	child = child->next_sibling;
       access = child;
     }
@@ -503,7 +531,7 @@ find_access_in_subtree (struct access *access, HOST_WIDE_INT offset,
   if (access)
     while (access->first_child
	   && access->first_child->offset == offset
-	   && access->first_child->size == size)
+	   && acc_size (access->first_child, reg) == size)
       access = access->first_child;
 
   return access;
@@ -539,7 +567,7 @@ get_var_base_offset_size_access (tree base, HOST_WIDE_INT offset,
   if (!access)
     return NULL;
 
-  return find_access_in_subtree (access, offset, size);
+  return find_access_in_subtree (access, offset, size, true);
 }
 
 /* Add LINK to the linked list of assign links of RACC.  */
@@ -868,6 +896,7 @@ create_access_1 (tree base, HOST_WIDE_INT offset, HOST_WIDE_INT size)
   access->base = base;
   access->offset = offset;
   access->size = size;
+  access->reg_size = size;
 
   base_access_vec->get_or_insert (base).safe_push (access);
 
@@ -1667,6 +1696,7 @@ build_ref_for_model (location_t loc, tree base, HOST_WIDE_INT offset,
     {
       tree res;
       if (model->grp_same_access_path
+	  && !model->reg_acc_type
	  && !TREE_THIS_VOLATILE (base)
	  && (TYPE_ADDR_SPACE (TREE_TYPE (base))
	      == TYPE_ADDR_SPACE (TREE_TYPE (model->expr)))
@@ -2256,6 +2286,30 @@ get_access_replacement (struct access *access)
   return access->replacement_decl;
 }
 
+/* Like above, except when ACCESS has a non-NULL reg_acc_type: then store a new
+   statement with location LOC converting to or from the replacement into
+   *CONVERSION (otherwise set it to NULL) and return a new SSA name of
+   reg_acc_type.  */
+
+static tree
+get_reg_access_replacement (location_t loc, struct access *access, bool write,
+			    gassign **conversion)
+{
+  tree repl = get_access_replacement (access);
+  if (!access->reg_acc_type)
+    {
+      *conversion = NULL;
+      return repl;
+    }
+
+  tree tmp = make_ssa_name (access->reg_acc_type);
+  if (write)
+    *conversion = gimple_build_assign (repl, NOP_EXPR, tmp);
+  else
+    *conversion = gimple_build_assign (tmp, NOP_EXPR, repl);
+  gimple_set_location (*conversion, loc);
+  return tmp;
+}
+
 /* Build a subtree of accesses rooted in *ACCESS, and move the pointer in the
    linked list along the way.  Stop when *ACCESS is NULL or the access pointed
@@ -2339,7 +2393,12 @@ verify_sra_access_forest (struct access *root)
       gcc_assert (offset == access->offset);
       gcc_assert (access->grp_unscalarizable_region
		  || size == max_size);
-      gcc_assert (max_size == access->size);
+      if (access->reg_acc_type)
+	gcc_assert (max_size == access->reg_size
+		    && max_size < access->size);
+      else
+	gcc_assert (max_size == access->size
+		    && max_size == access->reg_size);
       gcc_assert (reverse == access->reverse);
 
       if (access->first_child)
@@ -2405,6 +2464,39 @@ expr_with_var_bounded_array_refs_p (tree expr)
   return false;
 }
 
+/* If ACCESS is a scalar integral access whose type precision does not match
+   its size, and which is not a bit-field, change its type to a suitably sized
+   integer and rebuild its expression.  */
+
+static void
+upgrade_integral_size_to_prec (struct access *access)
+{
+  /* Always create access replacements that cover the whole access.
+     For integral types this means the precision has to match.
+     Avoid assumptions based on the integral type kind, too.  */
+  if (INTEGRAL_TYPE_P (access->type)
+      && (TREE_CODE (access->type) != INTEGER_TYPE
+	  || TYPE_PRECISION (access->type) != access->size)
+      /* But leave bitfield accesses alone.  */
+      && (TREE_CODE (access->expr) != COMPONENT_REF
+	  || !DECL_BIT_FIELD (TREE_OPERAND (access->expr, 1))))
+    {
+      tree rt = access->type;
+      gcc_assert ((access->offset % BITS_PER_UNIT) == 0
+		  && (access->size % BITS_PER_UNIT) == 0);
+      access->type = build_nonstandard_integer_type (access->size,
+						     TYPE_UNSIGNED (rt));
+      access->expr = build_ref_for_offset (UNKNOWN_LOCATION, access->base,
+					   access->offset, access->reverse,
+					   access->type, NULL, false);
+
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Changing the type of a replacement for ");
+	  print_generic_expr (dump_file, access->base);
+	  fprintf (dump_file, " offset: %u, size: %u ",
+		   (unsigned) access->offset, (unsigned) access->size);
+	  fprintf (dump_file, " to an integer.\n");
+	}
+    }
+}
+
 /* Analyze the subtree of accesses rooted in ROOT, scheduling replacements when
    both seeming beneficial and when ALLOW_REPLACEMENTS allows it.  If TOTALLY
    is set, we are totally scalarizing the aggregate.  Also set all sorts of
@@ -2488,6 +2580,15 @@ analyze_access_subtree (struct access *root, struct access *parent,
	hole = true;
     }
 
+  if (!totally && root->reg_acc_type)
+    {
+      /* If total scalarization did not eventually succeed, let's undo any
+	 futile attempts to cover padding.  */
+      root->type = root->reg_acc_type;
+      root->size = root->reg_size;
+      root->reg_acc_type = NULL_TREE;
+    }
+
   if (allow_replacements && scalar && !root->first_child
       && (totally || !root->grp_total_scalarization)
       && (totally
@@ -2495,34 +2596,7 @@ analyze_access_subtree (struct access *root, struct access *parent,
	  || root->grp_hint
	  || ((root->grp_scalar_read || root->grp_assignment_read)
	      && (root->grp_scalar_write || root->grp_assignment_write))))
     {
-      /* Always create access replacements that cover the whole access.
-	 For integral types this means the precision has to match.
-	 Avoid assumptions based on the integral type kind, too.  */
-      if (INTEGRAL_TYPE_P (root->type)
-	  && (TREE_CODE (root->type) != INTEGER_TYPE
-	      || TYPE_PRECISION (root->type) != root->size)
-	  /* But leave bitfield accesses alone.  */
-	  && (TREE_CODE (root->expr) != COMPONENT_REF
-	      || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1))))
-	{
-	  tree rt = root->type;
-	  gcc_assert ((root->offset % BITS_PER_UNIT) == 0
-		      && (root->size % BITS_PER_UNIT) == 0);
-	  root->type = build_nonstandard_integer_type (root->size,
-						       TYPE_UNSIGNED (rt));
-	  root->expr = build_ref_for_offset (UNKNOWN_LOCATION, root->base,
-					     root->offset, root->reverse,
-					     root->type, NULL, false);
-
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    {
-	      fprintf (dump_file, "Changing the type of a replacement for ");
-	      print_generic_expr (dump_file, root->base);
-	      fprintf (dump_file, " offset: %u, size: %u ",
-		       (unsigned) root->offset, (unsigned) root->size);
-	      fprintf (dump_file, " to an integer.\n");
-	    }
-	}
+      upgrade_integral_size_to_prec (root);
 
       root->grp_to_be_replaced = 1;
       root->replacement_decl = create_access_replacement (root);
@@ -2554,7 +2628,7 @@ analyze_access_subtree (struct access *root, struct access *parent,
	root->grp_total_scalarization = 0;
     }
 
-  if (!hole || totally)
+  if (!hole)
     root->grp_covered = 1;
   else if (root->grp_write || comes_initialized_p (root->base))
     root->grp_unscalarized_data = 1; /* not covered and written to */
@@ -2636,6 +2710,8 @@ create_artificial_child_access (struct access *parent, struct access *model,
   access->expr = expr;
   access->offset = new_offset;
   access->size = model->size;
+  gcc_assert (model->size == model->reg_size);
+  access->reg_size = model->reg_size;
   access->type = model->type;
   access->parent = parent;
   access->grp_read = set_grp_read;
@@ -2966,6 +3042,7 @@ create_total_scalarization_access (struct access *parent, HOST_WIDE_INT pos,
   access->base = parent->base;
   access->offset = pos;
   access->size = size;
+  access->reg_size = size;
   access->expr = expr;
   access->type = type;
   access->parent = parent;
@@ -3015,6 +3092,138 @@ create_total_access_and_reshape (struct access *parent, HOST_WIDE_INT pos,
   return new_acc;
 }
 
+/* Given SIZE return the integer type to represent that many bits or NULL if
+   there is no suitable one.  */
+
+static tree
+sra_type_for_size (HOST_WIDE_INT size, int unsignedp = 1)
+{
+  tree type = lang_hooks.types.type_for_size (size, unsignedp);
+  scalar_int_mode mode;
+  if (type
+      && is_a <scalar_int_mode> (TYPE_MODE (type), &mode)
+      && GET_MODE_BITSIZE (mode) == size)
+    return type;
+  else
+    return NULL_TREE;
+}
+
+/* Create a series of accesses to cover padding from FROM to TO and chain them
+   between LAST_PTR and NEXT_CHILD.  Return the pointer to the last created
+   one.  */
+
+static struct access *
+fill_padding_with_accesses (struct access *parent, HOST_WIDE_INT from,
+			    HOST_WIDE_INT to, struct access **last_ptr,
+			    struct access *next_child)
+{
+  struct access *access = NULL;
+
+  gcc_assert (from < to);
+  do
+    {
+      HOST_WIDE_INT diff = to - from;
+      gcc_assert (diff >= BITS_PER_UNIT);
+      HOST_WIDE_INT stsz = 1 << floor_log2 (diff);
+      tree type;
+
+      while (true)
+	{
+	  type = sra_type_for_size (stsz);
+	  if (type)
+	    break;
+	  stsz /= 2;
+	  gcc_checking_assert (stsz >= BITS_PER_UNIT);
+	}
+
+      do {
+	  tree expr = build_ref_for_offset (UNKNOWN_LOCATION, parent->base,
+					    from, parent->reverse, type, NULL,
+					    false);
+	  access = create_total_scalarization_access (parent, from, stsz, type,
+						      expr, last_ptr,
+						      next_child);
+	  access->grp_no_warning = 1;
+	  from += stsz;
+	}
+      while (to - from >= stsz);
+      gcc_assert (from <= to);
+    }
+  while (from < to);
+  return access;
+}
+
+/* Check whether any padding between FROM and TO must be filled during total
+   scalarization and if so, extend *LAST_SIBLING, create additional artificial
+   accesses under PARENT, or do both to fill it.  Set *LAST_SIBLING to the
+   last created access and set its next_sibling to NEXT_CHILD.  */
+
+static void
+total_scalarization_fill_padding (struct access *parent,
+				  struct access **last_sibling,
+				  HOST_WIDE_INT from, HOST_WIDE_INT to,
+				  struct access *next_child)
+{
+  if (constant_decl_p (parent->base))
+    return;
+
+  gcc_assert (from <= to);
+  if (from == to)
+    return;
+
+  if (struct access *ls = *last_sibling)
+    {
+      /* First, perform any upgrades to full integers we would have done
+	 anyway.  */
+      upgrade_integral_size_to_prec (ls);
+
+      HOST_WIDE_INT lastsize = ls->size;
+      if (INTEGRAL_TYPE_P (ls->type)
+	  && pow2_or_zerop (lastsize))
+	{
+	  /* The last field was a reasonably sized integer, let's try to
+	     enlarge it as much as we can so that it is still naturally
+	     aligned.  */
+	  HOST_WIDE_INT lastpos = ls->offset;
+	  HOST_WIDE_INT blocksize = lastsize;
+	  tree type = NULL_TREE;
+	  int unsignedp = TYPE_UNSIGNED (ls->type);
+
+	  while (true)
+	    {
+	      HOST_WIDE_INT b2 = blocksize * 2;
+	      if (lastpos + b2 > to
+		  || (lastpos % b2) != 0)
+		break;
+	      tree t2 = sra_type_for_size (b2, unsignedp);
+	      if (!t2)
+		break;
+
+	      blocksize = b2;
+	      type = t2;
+	    }
+
+	  if (blocksize != lastsize)
+	    {
+	      ls->reg_acc_type = ls->type;
+	      ls->type = type;
+	      ls->size = blocksize;
+	      from = lastpos + blocksize;
+	    }
+
+	  if (from == to)
+	    return;
+	}
+    }
+
+  struct access **last_ptr;
+  if (*last_sibling)
+    last_ptr = &(*last_sibling)->next_sibling;
+  else
+    last_ptr = &parent->first_child;
+  *last_sibling = fill_padding_with_accesses (parent, from, to, last_ptr,
+					      next_child);
+}
+
 static bool totally_scalarize_subtree (struct access *root);
 
 /* Return true if INNER is either the same type as OUTER or if it is the type
@@ -3073,10 +3282,17 @@ total_should_skip_creating_access (struct access *parent,
				   HOST_WIDE_INT size)
 {
   struct access *next_child;
+  HOST_WIDE_INT covered;
   if (!*last_seen_sibling)
-    next_child = parent->first_child;
+    {
+      next_child = parent->first_child;
+      covered = parent->offset;
+    }
   else
-    next_child = (*last_seen_sibling)->next_sibling;
+    {
+      next_child = (*last_seen_sibling)->next_sibling;
+      covered = (*last_seen_sibling)->offset + (*last_seen_sibling)->size;
+    }
 
   /* First, traverse the chain of siblings until it points to an access with
      offset at least equal to POS.  Check all skipped accesses whether they
@@ -3085,10 +3301,16 @@ total_should_skip_creating_access (struct access *parent,
     {
       if (next_child->offset + next_child->size > pos)
	return TOTAL_FLD_FAILED;
+      total_scalarization_fill_padding (parent, last_seen_sibling, covered,
+					next_child->offset, next_child);
+      covered = next_child->offset + next_child->size;
       *last_seen_sibling = next_child;
       next_child = next_child->next_sibling;
     }
 
+  total_scalarization_fill_padding (parent, last_seen_sibling, covered, pos,
+				    next_child);
+
   /* Now check whether next_child has exactly the right POS and SIZE and if so,
      whether it can represent what we need and can be totally scalarized
      itself.  */
@@ -3273,6 +3495,34 @@ totally_scalarize_subtree (struct access *root)
     }
 
 out:
+  /* Even though there is nothing in the type, there can still be accesses here
+     in the IL.  */
+
+  HOST_WIDE_INT covered;
+  struct access *next_child;
+
+  if (last_seen_sibling)
+    {
+      covered = last_seen_sibling->offset + last_seen_sibling->size;
+      next_child = last_seen_sibling->next_sibling;
+    }
+  else
+    {
+      covered = root->offset;
+      next_child = root->first_child;
+    }
+
+  while (next_child)
+    {
+      total_scalarization_fill_padding (root, &last_seen_sibling, covered,
+					next_child->offset, next_child);
+      covered = next_child->offset + next_child->size;
+      last_seen_sibling = next_child;
+      next_child = next_child->next_sibling;
+    }
+
+  total_scalarization_fill_padding (root, &last_seen_sibling, covered,
+				    root->offset + root->size, NULL);
   return true;
 }
 
@@ -3655,7 +3905,6 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
 
   if (access->grp_to_be_replaced)
     {
-      tree repl = get_access_replacement (access);
       /* If we replace a non-register typed access simply use the original
	 access expression to extract the scalar component afterwards.
	 This happens if scalarizing a function return value or parameter
@@ -3666,10 +3915,14 @@
	 be accessed as a different type too, potentially creating a need for
	 type conversion (see PR42196) and when scalarized unions are involved
	 in assembler statements (see PR42398).  */
-      if (!useless_type_conversion_p (type, access->type))
+      tree actual_type
+	= access->reg_acc_type ? access->reg_acc_type : access->type;
+
+      if (!useless_type_conversion_p (type, actual_type))
	{
	  tree ref;
+	  tree repl = get_access_replacement (access);
 
	  ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
 
	  if (write)
@@ -3696,7 +3949,21 @@
	    }
	}
       else
-	*expr = repl;
+	{
+	  gassign *conversion;
+	  tree repl = get_reg_access_replacement (loc, access, write,
+						  &conversion);
+	  *expr = repl;
+
+	  if (conversion)
+	    {
+	      if (write)
+		gsi_insert_after (gsi, conversion, GSI_NEW_STMT);
+	      else
+		gsi_insert_before (gsi, conversion, GSI_SAME_STMT);
+	    }
+	}
+
       sra_stats.exprs++;
     }
   else if (write && access->grp_to_be_debug_replaced)
@@ -3806,7 +4073,8 @@ load_assign_lhs_subreplacements (struct access *lacc,
	      gassign *stmt;
	      tree rhs;
 
-	      racc = find_access_in_subtree (sad->top_racc, offset, lacc->size);
+	      racc = find_access_in_subtree (sad->top_racc, offset, lacc->size,
+					     false);
	      if (racc && racc->grp_to_be_replaced)
		{
		  rhs = get_access_replacement (racc);
@@ -3857,7 +4125,7 @@ load_assign_lhs_subreplacements (struct access *lacc,
	      tree drhs;
	      struct access *racc = find_access_in_subtree (sad->top_racc,
							    offset,
-							    lacc->size);
+							    lacc->size, false);
 
	      if (racc && racc->grp_to_be_replaced)
		{
@@ -4010,8 +4278,11 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
   loc = gimple_location (stmt);
   if (lacc && lacc->grp_to_be_replaced)
     {
-      lhs = get_access_replacement (lacc);
+      gassign *lhs_conversion = NULL;
+      lhs = get_reg_access_replacement (loc, lacc, true, &lhs_conversion);
       gimple_assign_set_lhs (stmt, lhs);
+      if (lhs_conversion)
+	gsi_insert_after (gsi, lhs_conversion, GSI_NEW_STMT);
       modify_this_stmt = true;
       if (lacc->grp_partial_lhs)
	force_gimple_rhs = true;
@@ -4020,7 +4291,10 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
 
   if (racc && racc->grp_to_be_replaced)
     {
-      rhs = get_access_replacement (racc);
+      gassign *rhs_conversion = NULL;
+      rhs = get_reg_access_replacement (loc, racc, false, &rhs_conversion);
+      if (rhs_conversion)
+	gsi_insert_before (&orig_gsi, rhs_conversion, GSI_SAME_STMT);
       modify_this_stmt = true;
       if (racc->grp_partial_lhs)
	force_gimple_rhs = true;
-- 
2.24.1