On Wed, Nov 26, 2014 at 5:46 AM, Andrew Pinski <pins...@gmail.com> wrote: > On Wed, Oct 29, 2014 at 6:34 AM, Zoran Jovanovic > <zoran.jovano...@imgtec.com> wrote: >> Hello, >> This is a new patch version in which the reported issue is fixed. >> Also, the patch is rebased to revision 216452 and some minor code clean-up >> is done. > > FYI. This causes gfc_add_interface_mapping in fortran/trans-expr.c to > be miscompiled for aarch64-linux-gnu. I am still debugging it and > trying to get a smaller testcase.
Hello, is there any update on this? Thanks, Daniel. > > Thanks, > Andrew > >> >> -------------------------------------------------------------------------------------------------------------------------- >> >> Lowering is applied only to bit-field copy sequences that are merged. >> The data structure representing bit-field copy sequences is renamed and reduced >> in size. >> The optimization is turned on by default at -O2 and higher. >> Some comments are fixed. >> >> Benchmarking was performed on WebKit for Android. >> Code size reductions were observed in several files; the best examples are: >> >> core/rendering/style/StyleMultiColData (632->520 bytes) >> core/platform/graphics/FontDescription (1715->1475 bytes) >> core/rendering/style/FillLayer (5069->4513 bytes) >> core/rendering/style/StyleRareInheritedData (5618->5346 bytes) >> core/css/CSSSelectorList (4047->3887 bytes) >> core/platform/animation/CSSAnimationData (3844->3440 bytes) >> core/css/resolver/FontBuilder (13818->13350 bytes) >> core/platform/graphics/Font (16447->15975 bytes) >> >> >> Example: >> >> One of the motivating examples for this work was the copy constructor of a >> class that contains bit-fields. 
>> >> C++ code: >> class A >> { >> public: >> A(const A &x); >> unsigned a : 1; >> unsigned b : 2; >> unsigned c : 4; >> }; >> >> A::A(const A&x) >> { >> a = x.a; >> b = x.b; >> c = x.c; >> } >> >> GIMPLE code without optimization: >> >> <bb 2>: >> _3 = x_2(D)->a; >> this_4(D)->a = _3; >> _6 = x_2(D)->b; >> this_4(D)->b = _6; >> _8 = x_2(D)->c; >> this_4(D)->c = _8; >> return; >> >> Optimized GIMPLE code: >> <bb 2>: >> _10 = x_2(D)->D.1867; >> _11 = BIT_FIELD_REF <_10, 7, 0>; >> _12 = this_4(D)->D.1867; >> _13 = _12 & 128; >> _14 = (unsigned char) _11; >> _15 = _13 | _14; >> this_4(D)->D.1867 = _15; >> return; >> >> Generated MIPS32r2 assembly code without optimization: >> lw $3,0($5) >> lbu $2,0($4) >> andi $3,$3,0x1 >> andi $2,$2,0xfe >> or $2,$2,$3 >> sb $2,0($4) >> lw $3,0($5) >> andi $2,$2,0xf9 >> andi $3,$3,0x6 >> or $2,$2,$3 >> sb $2,0($4) >> lw $3,0($5) >> andi $2,$2,0x87 >> andi $3,$3,0x78 >> or $2,$2,$3 >> j $31 >> sb $2,0($4) >> >> Optimized MIPS32r2 assembly code: >> lw $3,0($5) >> lbu $2,0($4) >> andi $3,$3,0x7f >> andi $2,$2,0x80 >> or $2,$3,$2 >> j $31 >> sb $2,0($4) >> >> >> The algorithm works at the basic-block level and consists of the following three major steps: >> 1. Go through the basic block's statement list. If there are statement pairs that >> implement a copy of bit-field content from one memory location to another, >> record the statement pointers and other necessary data in a corresponding data >> structure. >> 2. Identify records that represent adjacent bit-field accesses and mark them >> as merged. >> 3. Lower bit-field accesses by using the new field size for those that can be >> merged. >> >> >> A new command-line option "-fmerge-bitfields" is introduced. >> >> >> Tested: passed GCC regression tests for MIPS32r2. >> >> >> Changelog - >> >> gcc/ChangeLog: >> 2014-04-22 Zoran Jovanovic (zoran.jovano...@imgtec.com) >> * common.opt (fmerge-bitfields): New option. >> * doc/invoke.texi: Add reference to "-fmerge-bitfields". 
>> * doc/invoke.texi: Add "-fmerge-bitfields" to the list of optimization >> flags turned on at -O2. >> * tree-sra.c (lower_bitfields): New function. >> Entry for (-fmerge-bitfields). >> (part_of_union_p): New function. >> (bf_access_candidate_p): New function. >> (lower_bitfield_read): New function. >> (lower_bitfield_write): New function. >> (bitfield_stmt_bfcopy_pair::hash): New function. >> (bitfield_stmt_bfcopy_pair::equal): New function. >> (bitfield_stmt_bfcopy_pair::remove): New function. >> (create_and_insert_bfcopy): New function. >> (get_bit_offset): New function. >> (add_stmt_bfcopy_pair): New function. >> (cmp_bfcopies): New function. >> (get_merged_bit_field_size): New function. >> * dwarf2out.c (simple_type_size_in_bits): Move to tree.c. >> (field_byte_offset): Move declaration to tree.h and make it extern. >> * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test. >> * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test. >> * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c. >> * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h. >> * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c. >> (simple_type_size_in_bits): Move from dwarf2out.c. >> * tree.h (expressions_equal_p): Add declaration. >> (field_byte_offset): Add declaration. 
>> >> Patch - >> >> >> diff --git a/gcc/common.opt b/gcc/common.opt >> index 5db5e1e..cec145c 100644 >> --- a/gcc/common.opt >> +++ b/gcc/common.opt >> @@ -2270,6 +2270,10 @@ ftree-sra >> Common Report Var(flag_tree_sra) Optimization >> Perform scalar replacement of aggregates >> >> +fmerge-bitfields >> +Common Report Var(flag_tree_bitfield_merge) Optimization >> +Merge loads and stores of consecutive bitfields >> + >> ftree-ter >> Common Report Var(flag_tree_ter) Optimization >> Replace temporary expressions in the SSA->normal pass >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> index 23f272f..d4205e1 100644 >> --- a/gcc/doc/invoke.texi >> +++ b/gcc/doc/invoke.texi >> @@ -421,7 +421,7 @@ Objective-C and Objective-C++ Dialects}. >> -fsplit-ivs-in-unroller -fsplit-wide-types -fssa-phiopt -fstack-protector >> @gol >> -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol >> -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol >> --ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol >> +-fmerge-bitfields -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol >> -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol >> -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol >> -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol >> @@ -7152,6 +7152,7 @@ also turns on the following optimization flags: >> -fipa-sra @gol >> -fipa-icf @gol >> -fisolate-erroneous-paths-dereference @gol >> +-fmerge-bitfields @gol >> -foptimize-sibling-calls @gol >> -foptimize-strlen @gol >> -fpartial-inlining @gol >> @@ -8133,6 +8134,12 @@ pointer alignment information. >> This pass only operates on local scalar variables and is enabled by default >> at @option{-O} and higher. It requires that @option{-ftree-ccp} is enabled. >> >> +@item -fmerge-bitfields >> +@opindex fmerge-bitfields >> +Combines several adjacent bit-field accesses that copy values >> +from one memory location to another into one single bit-field access. 
>> +This is enabled by default at @option{-O2} and higher. >> + >> @item -ftree-ccp >> @opindex ftree-ccp >> Perform sparse conditional constant propagation (CCP) on trees. This >> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c >> index 8c65176..88c29b1 100644 >> --- a/gcc/dwarf2out.c >> +++ b/gcc/dwarf2out.c >> @@ -3205,8 +3205,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned >> int); >> static tree field_type (const_tree); >> static unsigned int simple_type_align_in_bits (const_tree); >> static unsigned int simple_decl_align_in_bits (const_tree); >> -static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree); >> -static HOST_WIDE_INT field_byte_offset (const_tree); >> static void add_AT_location_description (dw_die_ref, enum >> dwarf_attribute, >> dw_loc_list_ref); >> static void add_data_member_location_attribute (dw_die_ref, tree); >> @@ -10413,25 +10411,6 @@ is_base_type (tree type) >> return 0; >> } >> >> -/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE >> - node, return the size in bits for the type if it is a constant, or else >> - return the alignment for the type if the type's size is not constant, or >> - else return BITS_PER_WORD if the type actually turns out to be an >> - ERROR_MARK node. */ >> - >> -static inline unsigned HOST_WIDE_INT >> -simple_type_size_in_bits (const_tree type) >> -{ >> - if (TREE_CODE (type) == ERROR_MARK) >> - return BITS_PER_WORD; >> - else if (TYPE_SIZE (type) == NULL_TREE) >> - return 0; >> - else if (tree_fits_uhwi_p (TYPE_SIZE (type))) >> - return tree_to_uhwi (TYPE_SIZE (type)); >> - else >> - return TYPE_ALIGN (type); >> -} >> - >> /* Similarly, but return an offset_int instead of UHWI. */ >> >> static inline offset_int >> @@ -14906,7 +14885,7 @@ round_up_to_align (const offset_int &t, unsigned int >> align) >> because the offset is actually variable. (We can't handle the latter >> case >> just yet). 
*/ >> >> -static HOST_WIDE_INT >> +HOST_WIDE_INT >> field_byte_offset (const_tree decl) >> { >> offset_int object_offset_in_bits; >> diff --git a/gcc/opts.c b/gcc/opts.c >> index 3054196..ef3007b 100644 >> --- a/gcc/opts.c >> +++ b/gcc/opts.c >> @@ -500,6 +500,7 @@ static const struct default_options >> default_options_table[] = >> { OPT_LEVELS_2_PLUS, OPT_fipa_icf, NULL, 1 }, >> { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 >> }, >> { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 }, >> + { OPT_LEVELS_2_PLUS, OPT_fmerge_bitfields, NULL, 1 }, >> >> /* -O3 optimizations. */ >> { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 }, >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c >> b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c >> new file mode 100644 >> index 0000000..638db58 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c >> @@ -0,0 +1,30 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2 -fmerge-bitfields -fdump-tree-esra" } */ >> + >> +struct S >> +{ >> + unsigned f1:7; >> + unsigned f2:9; >> + unsigned f3:3; >> + unsigned f4:5; >> + unsigned f5:1; >> + unsigned f6:2; >> +}; >> + >> +unsigned >> +foo (struct S *p1, struct S *p2, int *ptr) >> +{ >> + p2->f1 = p1->f1; >> + p2->f2 = p1->f2; >> + p2->f3 = p1->f3; >> + *ptr = 7; >> + p2->f4 = p1->f4; >> + p2->f5 = p1->f5; >> + p2->f6 = p1->f6; >> + return 0; >> +} >> + >> +/* Check if bit-field access is lowered. */ >> +/* { dg-final { scan-tree-dump "BIT_FIELD_REF" "esra" } } */ >> +/* { dg-final { cleanup-tree-dump "esra" } } */ >> + >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c >> b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c >> new file mode 100644 >> index 0000000..a7b906f >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c >> @@ -0,0 +1,89 @@ >> +/* Check whether use of -fmerge-bitfields results in presence of >> + overlapping unions results in incorrect code. 
*/ >> +/* { dg-options "-O2 -fmerge-bitfields" } */ >> +/* { dg-do run } */ >> +extern void abort (void); >> + >> +struct s1 >> +{ >> + unsigned f1:4; >> + unsigned f2:4; >> + unsigned f3:4; >> +}; >> + >> +struct s2 >> +{ >> + unsigned char c; >> + unsigned f1:4; >> + unsigned f2:4; >> + unsigned f3:4; >> +}; >> + >> +struct s3 >> +{ >> + unsigned f1:3; >> + unsigned f2:3; >> + unsigned f3:3; >> +}; >> + >> +struct s4 >> +{ >> + unsigned f0:3; >> + unsigned f1:3; >> + unsigned f2:3; >> + unsigned f3:3; >> +}; >> + >> +union un_1 >> +{ >> + struct s1 a; >> + struct s2 b; >> +}; >> + >> +union un_2 >> +{ >> + struct s3 a; >> + struct s4 b; >> +}; >> + >> +void f1 (union un_1 *p1, union un_1 *p2) >> +{ >> + p2->a.f3 = p1->b.f3; >> + p2->a.f2 = p1->b.f2; >> + p2->a.f1 = p1->b.f1; >> + >> + if (p1->a.f1 != 3) >> + abort (); >> +} >> + >> +void f2 (union un_2 *p1, union un_2 *p2) >> +{ >> + p2->b.f1 = p1->a.f1; >> + p2->b.f2 = p1->a.f2; >> + p2->b.f3 = p1->a.f3; >> + >> + if (p2->b.f1 != 0 || p2->b.f2 != 0 || p2->b.f3 != 0) >> + abort (); >> +} >> + >> +int main () >> +{ >> + union un_1 u1; >> + union un_2 u2; >> + >> + u1.b.f1 = 1; >> + u1.b.f2 = 2; >> + u1.b.f3 = 3; >> + u1.b.c = 0; >> + >> + f1 (&u1, &u1); >> + >> + u2.b.f0 = 0; >> + u2.b.f1 = 1; >> + u2.b.f2 = 2; >> + u2.b.f3 = 3; >> + >> + f2 (&u2, &u2); >> + >> + return 0; >> +} >> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c >> index fb24114..60dd095 100644 >> --- a/gcc/tree-sra.c >> +++ b/gcc/tree-sra.c >> @@ -2243,7 +2243,7 @@ analyze_access_subtree (struct access *root, struct >> access *parent, >> if (INTEGRAL_TYPE_P (root->type) >> && (TREE_CODE (root->type) != INTEGER_TYPE >> || TYPE_PRECISION (root->type) != root->size) >> - /* But leave bitfield accesses alone. */ >> + /* But leave bit-field accesses alone. 
*/ >> && (TREE_CODE (root->expr) != COMPONENT_REF >> || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1)))) >> { >> @@ -3519,12 +3519,632 @@ perform_intra_sra (void) >> return ret; >> } >> >> +/* Bit-field copy sequences. */ >> + >> +struct bfcopy >> +{ >> + bfcopy ():merged (false), modified (false), is_barrier (false), next (0), >> + head_copy (0) >> + { >> + } >> + >> + gimple load_stmt; /* Bit-field load statement. */ >> + gimple store_stmt; /* Bit-field store statement. */ >> + unsigned src_offset_words; /* Bit-field offset at src in words. */ >> + unsigned src_bit_offset; /* Bit-field offset inside source word. */ >> + unsigned src_bit_size; /* Size of bit-field in source word. */ >> + unsigned dst_offset_words; /* Bit-field offset at dst in words. */ >> + unsigned dst_bit_offset; /* Bit-field offset inside destination >> + word. */ >> + unsigned src_field_offset; /* Source field offset. */ >> + unsigned dst_bit_size; /* Size of bit-field in destination word. */ >> + tree src_addr; /* Address of source memory access. */ >> + tree dst_addr; /* Address of destination memory access. */ >> + unsigned merged:1; /* 1 if copy is merged with another one. */ >> + unsigned modified:1; /* 1 if bit-field size is modified. */ >> + unsigned is_barrier:1; /* 1 if copy is barrier (call or mem >> + access). */ >> + struct bfcopy *next; /* Copy with which this one is merged. */ >> + tree bitfield_representative; /* Bit field representative of >> original >> + declaration. */ >> + struct bfcopy *head_copy; /* Head of copies list where this one is >> + merged. */ >> +}; >> + >> +typedef struct bfcopy *bfcopy_p; >> + >> +/* Returns true if given COMPONENT_REF is part of an union. 
*/ >> + >> +static bool part_of_union_p (tree component) >> +{ >> + tree tmp = component; >> + bool res = false; >> + while (TREE_CODE (tmp) == COMPONENT_REF) >> + { >> + if (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE) >> + { >> + res = true; >> + break; >> + } >> + tmp = TREE_OPERAND (tmp, 0); >> + } >> + if (tmp && (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE)) >> + res = true; >> + return res; >> +} >> + >> +/* Return TRUE if REF is a bit-field access. If *OFF is not NULL it will >> + contain offset of the bit-field within the representative in bits. */ >> + >> +static bool >> +bf_access_candidate_p (tree ref, unsigned HOST_WIDE_INT * off) >> +{ >> + if (TREE_CODE (ref) != COMPONENT_REF) >> + return false; >> + if (part_of_union_p (ref)) >> + return false; >> + tree field = TREE_OPERAND (ref, 1); >> + if (!DECL_BIT_FIELD_TYPE (field)) >> + return false; >> + >> + tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field); >> + if (!rep) >> + return false; >> + /* Do not lower if representative is bigger than one word. */ >> + if (TREE_CODE (TREE_TYPE (rep)) == ARRAY_TYPE) >> + return false; >> + >> + if (!off) >> + return true; >> + >> + if (tree_fits_uhwi_p (DECL_FIELD_OFFSET (field)) >> + && tree_fits_uhwi_p (DECL_FIELD_OFFSET (rep))) >> + *off = (tree_to_uhwi (DECL_FIELD_OFFSET (field)) >> + - tree_to_uhwi (DECL_FIELD_OFFSET (rep))) * BITS_PER_UNIT; >> + else >> + *off = 0; >> + *off += (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)) >> + - tree_to_uhwi (DECL_FIELD_BIT_OFFSET (rep))); >> + >> + return true; >> +} >> + >> + >> +/* Lower the bit-field read at *GSI. 
*/ >> + >> +static void >> +lower_bitfield_read (gimple_stmt_iterator * gsi, unsigned HOST_WIDE_INT off, >> + tree size, tree type) >> +{ >> + gimple stmt = gsi_stmt (*gsi); >> + tree ref = gimple_assign_rhs1 (stmt); >> + tree field = TREE_OPERAND (ref, 1); >> + tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field); >> + tree loadres = make_ssa_name (TREE_TYPE (rep), NULL); >> + gimple load = gimple_build_assign (loadres, >> + build3 (COMPONENT_REF, TREE_TYPE (rep), >> + TREE_OPERAND (ref, 0), rep, >> + NULL_TREE)); >> + gimple_set_vuse (load, gimple_vuse (stmt)); >> + gsi_insert_before (gsi, load, GSI_SAME_STMT); >> + if (!type) >> + type = TREE_TYPE (ref); >> + gimple_assign_set_rhs1 (stmt, >> + build3 (BIT_FIELD_REF, type, >> + loadres, size, bitsize_int (off))); >> + update_stmt (stmt); >> +} >> + >> +/* Lower the bit-field write at *GSI. */ >> + >> +static void >> +lower_bitfield_write (gimple_stmt_iterator * gsi, unsigned HOST_WIDE_INT >> off, >> + tree size) >> +{ >> + gimple stmt = gsi_stmt (*gsi); >> + tree ref = gimple_assign_lhs (stmt); >> + tree field = TREE_OPERAND (ref, 1); >> + tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field); >> + tree loadres = make_ssa_name (TREE_TYPE (rep), NULL); >> + gimple load = gimple_build_assign (loadres, >> + build3 (COMPONENT_REF, TREE_TYPE (rep), >> + unshare_expr >> + (TREE_OPERAND (ref, 0)), >> + rep, >> + NULL_TREE)); >> + gimple_set_vuse (load, gimple_vuse (stmt)); >> + gsi_insert_before (gsi, load, GSI_SAME_STMT); >> + /* Mask out bits. */ >> + tree masked = make_ssa_name (TREE_TYPE (rep), NULL); >> + tree mask = double_int_to_tree (TREE_TYPE (rep), >> + ~double_int::mask >> + (TREE_INT_CST_LOW (size)).lshift (off)); >> + gimple tems = gimple_build_assign_with_ops (BIT_AND_EXPR, >> + masked, loadres, mask); >> + gsi_insert_before (gsi, tems, GSI_SAME_STMT); >> + /* Zero-extend the value to representative size. 
*/ >> + tree tem2; >> + if (!TYPE_UNSIGNED (TREE_TYPE (field))) >> + { >> + tem2 = make_ssa_name (unsigned_type_for (TREE_TYPE (field)), NULL); >> + tems = gimple_build_assign_with_ops (NOP_EXPR, tem2, >> + gimple_assign_rhs1 (stmt), >> + NULL_TREE); >> + gsi_insert_before (gsi, tems, GSI_SAME_STMT); >> + } >> + else >> + tem2 = gimple_assign_rhs1 (stmt); >> + tree tem = make_ssa_name (TREE_TYPE (rep), NULL); >> + tems = gimple_build_assign_with_ops (NOP_EXPR, tem, tem2, NULL_TREE); >> + gsi_insert_before (gsi, tems, GSI_SAME_STMT); >> + /* Shift the value into place. */ >> + if (off != 0) >> + { >> + tem2 = make_ssa_name (TREE_TYPE (rep), NULL); >> + tems = gimple_build_assign_with_ops (LSHIFT_EXPR, tem2, tem, >> + size_int (off)); >> + gsi_insert_before (gsi, tems, GSI_SAME_STMT); >> + } >> + else >> + tem2 = tem; >> + /* Merge masked loaded value and value. */ >> + tree modres = make_ssa_name (TREE_TYPE (rep), NULL); >> + gimple mod = gimple_build_assign_with_ops (BIT_IOR_EXPR, modres, >> + masked, tem2); >> + gsi_insert_before (gsi, mod, GSI_SAME_STMT); >> + /* Finally adjust the store. */ >> + gimple_assign_set_rhs1 (stmt, modres); >> + gimple_assign_set_lhs (stmt, >> + build3 (COMPONENT_REF, TREE_TYPE (rep), >> + TREE_OPERAND (ref, 0), rep, NULL_TREE)); >> + update_stmt (stmt); >> +} >> + >> +/* Connects statement with bit-field copy sequence to which that statement >> + belong. */ >> + >> +struct bitfield_stmt_bfcopy_pair >> +{ >> + gimple stmt; >> + bfcopy *copy; >> + bitfield_stmt_bfcopy_pair (gimple s, bfcopy * c):stmt (s), copy (c) >> + { >> + }; >> + /* hash_table support. 
*/ >> + typedef bitfield_stmt_bfcopy_pair value_type; >> + typedef bitfield_stmt_bfcopy_pair compare_type; >> + static inline hashval_t hash (const bitfield_stmt_bfcopy_pair * const); >> + static inline int equal (const bitfield_stmt_bfcopy_pair * const, >> + const bitfield_stmt_bfcopy_pair * const); >> + static inline void remove (bitfield_stmt_bfcopy_pair *); >> +}; >> + >> +hashval_t >> +bitfield_stmt_bfcopy_pair::hash (const bitfield_stmt_bfcopy_pair * const a) >> +{ >> + return hashval_t (gimple_uid (a->stmt)); >> +} >> + >> +int >> +bitfield_stmt_bfcopy_pair::equal (const bitfield_stmt_bfcopy_pair * const a, >> + const bitfield_stmt_bfcopy_pair * const b) >> +{ >> + return a->stmt == b->stmt; >> +} >> + >> +void >> +bitfield_stmt_bfcopy_pair::remove (bitfield_stmt_bfcopy_pair * a) >> +{ >> + delete a; >> +} >> + >> +/* Create new bfcopy structure and add it to given bitfield_copies >> + htab. */ >> + >> +static bfcopy_p >> +create_and_insert_bfcopy (vec <bfcopy_p>*bitfield_copies) >> +{ >> + bfcopy_p copy = new bfcopy; >> + bitfield_copies->safe_push (copy); >> + return copy; >> +} >> + >> +/* Get offset of the field in bits from the start of the record. */ >> + >> +static inline HOST_WIDE_INT >> +get_bit_offset (tree decl) >> +{ >> + tree type = DECL_BIT_FIELD_TYPE (decl); >> + HOST_WIDE_INT bitpos_int; >> + >> + /* Must be a field and a bit-field. */ >> + gcc_assert (type && TREE_CODE (decl) == FIELD_DECL); >> + /* Bit position and decl size should be integer constants that can be >> + represented in a single HOST_WIDE_INT. */ >> + if (!tree_fits_uhwi_p (bit_position (decl)) >> + || !tree_fits_uhwi_p (DECL_SIZE (decl))) >> + return -1; >> + >> + bitpos_int = int_bit_position (decl); >> + return bitpos_int; >> +} >> + >> +/* Creates new bitfield_stmt_bfcopy_pair structure and adds it to given >> + htab. 
*/ >> + >> +static bool >> +add_stmt_bfcopy_pair (hash_table < bitfield_stmt_bfcopy_pair > >> &bf_stmnt_cpy, >> + bfcopy * copy, gimple stmt) >> +{ >> + bitfield_stmt_bfcopy_pair p (stmt, copy); >> + bitfield_stmt_bfcopy_pair **slot = bf_stmnt_cpy.find_slot (&p, INSERT); >> + if (!*slot) >> + { >> + *slot = new bitfield_stmt_bfcopy_pair (stmt, copy); >> + return true; >> + } >> + return false; >> +} >> + >> +/* Passed to qsort. Compares two bfcopy records. */ >> + >> +static int >> +cmp_bfcopies (const void *p1, const void *p2) >> +{ >> + const bfcopy_p *ap1 = (const bfcopy_p*) p1; >> + const bfcopy_p *ap2 = (const bfcopy_p*) p2; >> + const bfcopy_p a1 = *ap1; >> + const bfcopy_p a2 = *ap2; >> + >> + if (DECL_UID (a1->bitfield_representative) - >> + DECL_UID (a2->bitfield_representative)) >> + return DECL_UID (a1->bitfield_representative) - >> + DECL_UID (a2->bitfield_representative); >> + >> + if (!expressions_equal_p (a1->src_addr, a2->src_addr)) >> + return a1->src_addr - a2->src_addr; >> + >> + if (!expressions_equal_p (a1->dst_addr, a2->dst_addr)) >> + return a1->dst_addr - a2->dst_addr; >> + >> + if (a1->src_offset_words - a2->src_offset_words) >> + return a1->src_offset_words - a2->src_offset_words; >> + >> + return a1->src_bit_offset - a2->src_bit_offset; >> +} >> + >> +/* Returns size of combined bitfields. Size cannot be larger than size >> + of largest directly accessible memory unit. */ >> + >> +static int >> +get_merged_bit_field_size (bfcopy * copy) >> +{ >> + bfcopy *tmp_copy = copy; >> + int size = 0; >> + >> + while (tmp_copy) >> + { >> + size += tmp_copy->src_bit_size; >> + tmp_copy = tmp_copy->next; >> + } >> + return size; >> +} >> + >> +/* Lower bit-field access to accesses to its >> + DECL_BIT_FIELD_REPRESENTATIVE. 
*/ >> + >> +static void >> +lower_bitfields (void) >> +{ >> + basic_block bb; >> + hash_table<bitfield_stmt_bfcopy_pair> *bf_stmnt_cpy; >> + vec <bfcopy_p>bitfield_copies; >> + bfcopy_p copy; >> + bf_stmnt_cpy = new hash_table<bitfield_stmt_bfcopy_pair> (1); >> + >> + FOR_EACH_BB_FN (bb, cfun) >> + { >> + bf_stmnt_cpy->empty (); >> + tree prev_representative = NULL_TREE; >> + bitfield_copies.create (0); >> + >> + /* There are two passes, the first one identifies interesting >> + bit-field accesses and the second one actually lowers them. */ >> + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); >> + !gsi_end_p (gsi); gsi_next (&gsi)) >> + { >> + gimple stmt = gsi_stmt (gsi); >> + >> + if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt)) >> + continue; >> + >> + tree ref = gimple_assign_rhs1 (stmt); >> + if (bf_access_candidate_p (ref, NULL)) >> + { >> + gimple use_stmt; >> + use_operand_p use; >> + tree op0 = TREE_OPERAND (ref, 0); >> + tree op1 = TREE_OPERAND (ref, 1); >> + >> + if (TREE_CODE (DECL_CONTEXT (op1)) == UNION_TYPE >> + || TREE_CODE (DECL_CONTEXT (op1)) == QUAL_UNION_TYPE) >> + continue; >> + >> + if (TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME >> + && single_imm_use (gimple_assign_lhs (stmt), &use, >> + &use_stmt) && is_gimple_assign (use_stmt)) >> + { >> + tree uses_stmt_lhs = gimple_assign_lhs (use_stmt); >> + >> + if (bf_access_candidate_p (uses_stmt_lhs, NULL)) >> + { >> + tree use_op0 = TREE_OPERAND (uses_stmt_lhs, 0); >> + tree use_op1 = TREE_OPERAND (uses_stmt_lhs, 1); >> + tree use_repr = DECL_BIT_FIELD_REPRESENTATIVE (use_op1); >> + if (prev_representative >> + && (prev_representative != use_repr)) >> + { >> + /* If previous bfcopy has different >> + representative then barrier is needed >> + between it and new access. */ >> + copy = create_and_insert_bfcopy (&bitfield_copies); >> + copy->is_barrier = true; >> + } >> + prev_representative = use_repr; >> + /* Create new bit-field bfcopy structure. 
*/ >> + copy = create_and_insert_bfcopy (&bitfield_copies); >> + /* Collect bfcopy data - load instruction. */ >> + copy->src_bit_size = tree_to_uhwi (DECL_SIZE (op1)); >> + copy->src_bit_offset = get_bit_offset (op1); >> + copy->src_offset_words = >> + field_byte_offset (op1) / UNITS_PER_WORD; >> + copy->src_field_offset = >> + tree_to_uhwi (DECL_FIELD_OFFSET (op1)); >> + copy->src_addr = op0; >> + copy->load_stmt = gsi_stmt (gsi); >> + /* Collect bfcopy data - store instruction. */ >> + copy->dst_bit_size = tree_to_uhwi (DECL_SIZE (use_op1)); >> + copy->dst_bit_offset = get_bit_offset (use_op1); >> + copy->dst_offset_words = >> + field_byte_offset (use_op1) / UNITS_PER_WORD; >> + copy->dst_addr = use_op0; >> + copy->store_stmt = use_stmt; >> + add_stmt_bfcopy_pair (*bf_stmnt_cpy, copy, stmt); >> + add_stmt_bfcopy_pair (*bf_stmnt_cpy, copy, use_stmt); >> + copy->bitfield_representative = use_repr; >> + } >> + } >> + } >> + >> + /* Insert barrier for merging if statement is function call or memory >> + access. */ >> + bitfield_stmt_bfcopy_pair csdata (stmt, NULL); >> + if (!bf_stmnt_cpy->find (&csdata) >> + && ((gimple_code (stmt) == GIMPLE_CALL) >> + || (gimple_has_mem_ops (stmt)))) >> + { >> + /* Create new bfcopy access structure. */ >> + copy = create_and_insert_bfcopy (&bitfield_copies); >> + /* Mark it as barrier. */ >> + copy->is_barrier = true; >> + } >> + } >> + >> + /* If there are no at least two sequences go to the next basic block. >> */ >> + if (bitfield_copies.length () <= 1) >> + { >> + bitfield_copies.release (); >> + continue; >> + } >> + vec <bfcopy_p>bitfield_copies_merge = vNULL; >> + /* Try to merge different bfcopy sequences. 
*/ >> + for (int ix = 0; bitfield_copies.iterate (ix, ©); ix++) >> + { >> + bfcopy_p head_copy; >> + bfcopy_p mrg_copy; >> + bfcopy_p prev_copy; >> + >> + if (!bitfield_copies_merge.exists ()) >> + bitfield_copies_merge.create (0); >> + >> + if (!copy->is_barrier) >> + bitfield_copies_merge.safe_push (copy); >> + >> + if (!copy->is_barrier >> + && !(copy == bitfield_copies.last () >> + && !bitfield_copies_merge.is_empty ())) >> + continue; >> + >> + bitfield_copies_merge.qsort (cmp_bfcopies); >> + >> + head_copy = prev_copy = NULL; >> + int iy; >> + for (iy = 0; bitfield_copies_merge.iterate (iy, &mrg_copy); iy++) >> + { >> + if (head_copy >> + && expressions_equal_p (head_copy->src_addr, >> + mrg_copy->src_addr) >> + && expressions_equal_p (head_copy->dst_addr, >> + mrg_copy->dst_addr) >> + && prev_copy->src_offset_words >> + == mrg_copy->src_offset_words >> + && prev_copy->dst_offset_words >> + == mrg_copy->dst_offset_words >> + && prev_copy->src_bit_offset + prev_copy->src_bit_size >> + == mrg_copy->src_bit_offset >> + && prev_copy->dst_bit_offset + prev_copy->dst_bit_size >> + == mrg_copy->dst_bit_offset >> + && prev_copy->bitfield_representative >> + == mrg_copy->bitfield_representative) >> + { >> + /* Merge conditions are satisfied - merge copies. */ >> + mrg_copy->merged = true; >> + prev_copy->next = mrg_copy; >> + head_copy->modified = true; >> + prev_copy = mrg_copy; >> + mrg_copy->head_copy = head_copy; >> + } >> + else >> + head_copy = prev_copy = mrg_copy; >> + } >> + bitfield_copies_merge.release (); >> + bitfield_copies_merge = vNULL; >> + } >> + >> + tree reaching_vuse = NULL_TREE; >> + >> + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);) >> + { >> + gimple stmt = gsi_stmt (gsi); >> + >> + /* Fix vuse info if necessary. 
*/
>> +          if (gimple_has_mem_ops (stmt) && reaching_vuse != NULL_TREE)
>> +            {
>> +              gimple_set_vuse (stmt, reaching_vuse);
>> +              update_stmt (stmt);
>> +            }
>> +
>> +          if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt))
>> +            {
>> +              gsi_next (&gsi);
>> +              if (gimple_vdef (stmt))
>> +                reaching_vuse = gimple_vdef (stmt);
>> +              continue;
>> +            }
>> +
>> +          tree ref;
>> +          unsigned HOST_WIDE_INT off;
>> +          tree size;
>> +          bool deleted = false;
>> +
>> +          /* Lower a bit-field read.  */
>> +          ref = gimple_assign_rhs1 (stmt);
>> +          if (bf_access_candidate_p (ref, &off))
>> +            {
>> +              bfcopy *cc = NULL;
>> +              bitfield_stmt_bfcopy_pair st_cpy (stmt, NULL);
>> +              bitfield_stmt_bfcopy_pair *p_st_cpy;
>> +              p_st_cpy = bf_stmnt_cpy->find (&st_cpy);
>> +              unsigned HOST_WIDE_INT mrg_off;
>> +              if (p_st_cpy)
>> +                cc = p_st_cpy->copy;
>> +
>> +              if (cc && (cc->merged || cc->modified))
>> +                size =
>> +                  build_int_cst (unsigned_type_node,
>> +                                 get_merged_bit_field_size (cc->head_copy ?
>> +                                                            cc->head_copy :
>> +                                                            cc));
>> +              else
>> +                size = DECL_SIZE (TREE_OPERAND (ref, 1));
>> +              if (cc && cc->merged
>> +                  && bf_access_candidate_p (gimple_assign_rhs1
>> +                                            (cc->head_copy->load_stmt),
>> +                                            &mrg_off))
>> +                off = mrg_off;
>> +              if (cc && cc->merged)
>> +                {
>> +                  tree head_rhs = gimple_assign_rhs1 (cc->head_copy->load_stmt);
>> +                  switch (TREE_CODE (head_rhs))
>> +                    {
>> +                    case COMPONENT_REF:
>> +                      if (bf_access_candidate_p (head_rhs, &mrg_off))
>> +                        off = mrg_off;
>> +                      break;
>> +                    case BIT_FIELD_REF:
>> +                      off = tree_to_uhwi (TREE_OPERAND (head_rhs, 2));
>> +                      break;
>> +                    default:
>> +                      break;
>> +                    }
>> +                }
>> +
>> +              if (cc && (cc->modified))
>> +                {
>> +                  tree tmp_ssa;
>> +                  tree itype = make_node (INTEGER_TYPE);
>> +                  TYPE_PRECISION (itype) = TREE_INT_CST_LOW (size);
>> +                  fixup_unsigned_type (itype);
>> +                  lower_bitfield_read (&gsi, off, size, itype);
>> +                  tmp_ssa =
>> +                    make_ssa_name (create_tmp_var (itype, NULL), cc->load_stmt);
>> +                  gimple_assign_set_lhs (cc->load_stmt, tmp_ssa);
>> +                  update_stmt (cc->load_stmt);
>> +                  gimple_assign_set_rhs1 (cc->store_stmt, tmp_ssa);
>> +                  update_stmt (cc->store_stmt);
>> +                }
>> +              else if (cc && cc->merged)
>> +                {
>> +                  gsi_remove (&gsi, true);
>> +                  deleted = true;
>> +                }
>> +            }
>> +          /* Lower a bit-field write.  */
>> +          ref = gimple_assign_lhs (stmt);
>> +          if (bf_access_candidate_p (ref, &off))
>> +            {
>> +              bfcopy *cc = NULL;
>> +              bitfield_stmt_bfcopy_pair st_cpy (stmt, NULL);
>> +              bitfield_stmt_bfcopy_pair *p_st_cpy;
>> +              unsigned HOST_WIDE_INT mrg_off;
>> +              p_st_cpy = bf_stmnt_cpy->find (&st_cpy);
>> +              if (p_st_cpy)
>> +                cc = p_st_cpy->copy;
>> +
>> +              if (cc && (cc->merged || cc->modified))
>> +                size =
>> +                  build_int_cst (unsigned_type_node,
>> +                                 get_merged_bit_field_size (cc->head_copy ?
>> +                                                            cc->head_copy :
>> +                                                            cc));
>> +              else
>> +                size = DECL_SIZE (TREE_OPERAND (ref, 1));
>> +              if (cc && cc->merged
>> +                  &&
>> +                  bf_access_candidate_p (gimple_assign_lhs
>> +                                         (cc->head_copy->store_stmt), &mrg_off))
>> +                off = mrg_off;
>> +
>> +              if (cc && (cc->modified) && !(cc && cc->merged))
>> +                lower_bitfield_write (&gsi, off, size);
>> +              else if (cc && cc->merged)
>> +                {
>> +                  if (gimple_vdef (stmt))
>> +                    {
>> +                      imm_use_iterator imm_iter;
>> +                      gimple use_stmt;
>> +                      use_operand_p use_p;
>> +                      FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter,
>> +                                             gimple_vdef (stmt))
>> +                        {
>> +                          FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
>> +                            SET_USE (use_p, reaching_vuse);
>> +                          update_stmt (use_stmt);
>> +                        }
>> +                    }
>> +
>> +                  gsi_remove (&gsi, true);
>> +                  deleted = true;
>> +                }
>> +            }
>> +          if (gimple_vdef (stmt) && !deleted)
>> +            reaching_vuse = gimple_vdef (stmt);
>> +          if (!deleted)
>> +            gsi_next (&gsi);
>> +        }
>> +    }
>> +  delete bf_stmnt_cpy;
>> +}
>> +
>>  /* Perform early intraprocedural SRA.  */
>>  static unsigned int
>>  early_intra_sra (void)
>>  {
>>    sra_mode = SRA_MODE_EARLY_INTRA;
>> -  return perform_intra_sra ();
>> +  unsigned int res = perform_intra_sra ();
>> +  if (flag_tree_bitfield_merge)
>> +    lower_bitfields ();
>> +  return res;
>>  }
>>
>>  /* Perform "late" intraprocedural SRA.  */
>> diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
>> index 44656ea..cdb3850 100644
>> --- a/gcc/tree-ssa-sccvn.c
>> +++ b/gcc/tree-ssa-sccvn.c
>> @@ -4410,29 +4410,6 @@ get_next_value_id (void)
>>    return next_value_id++;
>>  }
>>
>> -
>> -/* Compare two expressions E1 and E2 and return true if they are equal.  */
>> -
>> -bool
>> -expressions_equal_p (tree e1, tree e2)
>> -{
>> -  /* The obvious case.  */
>> -  if (e1 == e2)
>> -    return true;
>> -
>> -  /* If only one of them is null, they cannot be equal.  */
>> -  if (!e1 || !e2)
>> -    return false;
>> -
>> -  /* Now perform the actual comparison.  */
>> -  if (TREE_CODE (e1) == TREE_CODE (e2)
>> -      && operand_equal_p (e1, e2, OEP_PURE_SAME))
>> -    return true;
>> -
>> -  return false;
>> -}
>> -
>> -
>>  /* Return true if the nary operation NARY may trap.  This is a copy
>>     of stmt_could_throw_1_p adjusted to the SCCVN IL.  */
>>
>> diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
>> index ad99604..d5963b6 100644
>> --- a/gcc/tree-ssa-sccvn.h
>> +++ b/gcc/tree-ssa-sccvn.h
>> @@ -21,10 +21,6 @@
>>  #ifndef TREE_SSA_SCCVN_H
>>  #define TREE_SSA_SCCVN_H
>>
>> -/* In tree-ssa-sccvn.c */
>> -bool expressions_equal_p (tree, tree);
>> -
>> -
>>  /* TOP of the VN lattice.  */
>>  extern tree VN_TOP;
>>
>> diff --git a/gcc/tree.c b/gcc/tree.c
>> index 365e89c..8df9812 100644
>> --- a/gcc/tree.c
>> +++ b/gcc/tree.c
>> @@ -12286,4 +12286,44 @@ get_base_address (tree t)
>>    return t;
>>  }
>>
>> +/* Compare two expressions E1 and E2 and return true if they are equal.  */
>> +
>> +bool
>> +expressions_equal_p (const_tree e1, const_tree e2)
>> +{
>> +  /* The obvious case.  */
>> +  if (e1 == e2)
>> +    return true;
>> +
>> +  /* If only one of them is null, they cannot be equal.  */
>> +  if (!e1 || !e2)
>> +    return false;
>> +
>> +  /* Now perform the actual comparison.  */
>> +  if (TREE_CODE (e1) == TREE_CODE (e2)
>> +      && operand_equal_p (e1, e2, OEP_PURE_SAME))
>> +    return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
>> +   node, return the size in bits for the type if it is a constant, or else
>> +   return the alignment for the type if the type's size is not constant, or
>> +   else return BITS_PER_WORD if the type actually turns out to be an
>> +   ERROR_MARK node.  */
>> +
>> +unsigned HOST_WIDE_INT
>> +simple_type_size_in_bits (const_tree type)
>> +{
>> +  if (TREE_CODE (type) == ERROR_MARK)
>> +    return BITS_PER_WORD;
>> +  else if (TYPE_SIZE (type) == NULL_TREE)
>> +    return 0;
>> +  else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
>> +    return tree_to_uhwi (TYPE_SIZE (type));
>> +  else
>> +    return TYPE_ALIGN (type);
>> +}
>> +
>>  #include "gt-tree.h"
>> diff --git a/gcc/tree.h b/gcc/tree.h
>> index 45f127f..b903089 100644
>> --- a/gcc/tree.h
>> +++ b/gcc/tree.h
>> @@ -4068,6 +4068,7 @@ extern tree substitute_placeholder_in_expr (tree, tree);
>>  ((EXP) == 0 || TREE_CONSTANT (EXP) ? (EXP)	\
>>   : substitute_placeholder_in_expr (EXP, OBJ))
>>
>> +extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
>>
>>  /* stabilize_reference (EXP) returns a reference equivalent to EXP
>>     but it can be used multiple times
>> @@ -4184,6 +4185,11 @@ inlined_function_outer_scope_p (const_tree block)
>>         (TREE = function_args_iter_cond (&(ITER))) != NULL_TREE;	\
>>         function_args_iter_next (&(ITER)))
>>
>> +
>> +/* In dwarf2out.c.  */
>> +HOST_WIDE_INT
>> +field_byte_offset (const_tree decl);
>> +
>>  /* In tree.c */
>>  extern unsigned crc32_string (unsigned, const char *);
>>  extern unsigned crc32_byte (unsigned, char);
>> @@ -4340,6 +4346,7 @@ extern tree obj_type_ref_class (tree ref);
>>  extern bool types_same_for_odr (const_tree type1, const_tree type2);
>>  extern bool contains_bitfld_component_ref_p (const_tree);
>>  extern bool type_in_anonymous_namespace_p (const_tree);
>> +extern bool expressions_equal_p (const_tree e1, const_tree e2);
>>  extern bool block_may_fallthru (const_tree);
>>  extern void using_eh_for_cleanups (void);
>>  extern bool using_eh_for_cleanups_p (void);
>>
>>
>> Regards,
>> Zoran
>>
>>
>>
>>> From: Andrew Pinski [pins...@gmail.com]
>>> Sent: Thursday, October 16, 2014 4:09 AM
>>> To: Zoran Jovanovic
>>> Cc: gcc-patches@gcc.gnu.org; Richard Biener; Bernhard Reutner-Fischer
>>> Subject: Re: [PATCH] Add a new option "-fmerge-bitfields" (patch / doc inside)
>>>
>>> On Tue, Apr 22, 2014 at 3:28 AM, Zoran Jovanovic
>>> <zoran.jovano...@imgtec.com> wrote:
>>> > Hello,
>>> > Updated doc/invoke.texi by stating that new option is enabled by default
>>> > at -O2 and higher.
>>> > Also, -fmerge-bitfields added to the list of optimization flags enabled
>>> > by default at -O2 and higher.
>>>
>>>
>>> With this patch applied gcc from SPEC 2006 ICEs on aarch64-linux-gnu.
>>> Here is a reduced testcase:
>>> typedef struct rtx_def *rtx;
>>> struct rtx_def
>>> {
>>>   unsigned int unchanging : 1;
>>>   unsigned int volatil : 1;
>>>   unsigned int in_struct : 1;
>>>   unsigned integrated : 1;
>>>   unsigned frame_related : 1;
>>> };
>>> ix86_set_move_mem_attrs_1 (rtx x, rtx dstref)
>>> {
>>>   ((x)->volatil) = ((dstref)->volatil);
>>>   ((x)->in_struct) = ((dstref)->in_struct);
>>>   ((x)->frame_related) = ((dstref)->frame_related);
>>>   ((x)->unchanging) = ((dstref)->unchanging);
>>> }
>>> --- CUT ---
>>>
>>> Thanks,
>>> Andrew Pinski

--
Daniel F. Gutson
Chief Engineering Officer, SPD

San Lorenzo 47, 3rd Floor, Office 5
Córdoba, Argentina

Phone:    +54 351 4217888 / +54 351 4218211
Skype:    dgutson
LinkedIn: http://ar.linkedin.com/in/danielgutson