On Wed, 18 May 2016, Richard Biener wrote:

> On 17 May 2016 at 18:36, Richard Biener <rguent...@suse.de> wrote:
> > On Wed, 11 May 2016, Prathamesh Kulkarni wrote:
> >
> >> On 6 May 2016 at 17:20, Richard Biener <rguent...@suse.de> wrote:
> >> >
> >> > You can't simply use
> >> >
> >> > +  offset = int_byte_position (field);
> >> >
> >> > as it can be placed at a variable offset which will make
> >> > int_byte_position ICE.  Note it also returns a truncated byte
> >> > position (bit position stripped) which may be undesirable here.
> >> > I think you want to use bit_position, and if and only if
> >> > DECL_FIELD_OFFSET and DECL_FIELD_BIT_OFFSET are INTEGER_CST.
> >> oops, I didn't realize offsets could be variable.
> >> Will that be the case only for a VLA member inside a struct ?
> >
> > And for non-VLA members after such a member.
> >
> >> > Your observation about the expensiveness of the walk still stands
> >> > I guess, and eventually you should at least cache the
> >> > get_vec_alignment_for_record_decl cases.  Please make those workers
> >> > _type rather than _decl helpers.
> >> Done
> >> >
> >> > You seem to simply get at the maximum vectorized field/array element
> >> > alignment possible for all arrays - you could restrict that to
> >> > arrays with at least vector size (as a followup).
> >> Um sorry, I didn't understand this part.
> >
> > It doesn't make sense to align
> >
> >   struct { int a; int b; int c; int d; float f[3]; int e; };
> >
> > because we have a float[3] member.  There is no vector size that
> > would cover the float[3] array.
> Thanks for the explanation.
> So we do not want to align the struct if sizeof (array_field) <
> sizeof (vector_type).
> This issue is also present without the patch for global arrays, so I
> modified get_vec_alignment_for_array_type to return 0 if
> sizeof (array_type) < sizeof (vectype).
> >
> >> >
> >> > +      /* Skip artificial decls like typeinfo decls or if
> >> > +         record is packed.  */
> >> > +      if (DECL_ARTIFICIAL (record_decl) || TYPE_PACKED (type))
> >> > +        return 0;
> >> >
> >> > I think we should honor DECL_USER_ALIGN as well and not mess with
> >> > those decls.
> >> Done
> >> >
> >> > Given the patch now does quite some extra work it might make sense
> >> > to split the symtab part out of the vect_can_force_dr_alignment_p
> >> > predicate and call that early.
> >> In the patch I call symtab_node::can_increase_alignment_p early.  I
> >> tried moving that to its callers - vect_compute_data_ref_alignment
> >> and increase_alignment::execute - however that failed some tests in
> >> vect, and hence I didn't add the following hunk in the patch.  Did I
> >> miss some check ?
> >
> > Not sure.
> >
> >> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> >> index 7652e21..2c1acee 100644
> >> --- a/gcc/tree-vect-data-refs.c
> >> +++ b/gcc/tree-vect-data-refs.c
> >> @@ -795,7 +795,10 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> >>         && TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
> >>       base = TREE_OPERAND (TREE_OPERAND (base, 0), 0);
> >>
> >> -  if (!vect_can_force_dr_alignment_p (base, TYPE_ALIGN (vectype)))
> >> +  if (!(TREE_CODE (base) == VAR_DECL
> >> +        && decl_in_symtab_p (base)
> >> +        && symtab_node::get (base)->can_increase_alignment_p ()
> >> +        && vect_can_force_dr_alignment_p (base, TYPE_ALIGN (vectype))))
> >>      {
> >>        if (dump_enabled_p ())
> >>          {
> >
> > +  for (tree field = first_field (type);
> > +       field != NULL_TREE;
> > +       field = DECL_CHAIN (field))
> > +    {
> > +      /* Skip if not FIELD_DECL or has variable offset.  */
> > +      if (TREE_CODE (field) != FIELD_DECL
> > +          || TREE_CODE (DECL_FIELD_OFFSET (field)) != INTEGER_CST
> > +          || TREE_CODE (DECL_FIELD_BIT_OFFSET (field)) != INTEGER_CST
> > +          || DECL_USER_ALIGN (field)
> > +          || DECL_ARTIFICIAL (field))
> > +        continue;
> >
> > you can stop processing the type and return 0 here if the offset
> > is not an INTEGER_CST.
> > All following fields will have the same issue.
> >
> > +      /* FIXME: is this check necessary since we check for variable
> > +         offset above ?  */
> > +      if (TREE_CODE (offset_tree) != INTEGER_CST)
> > +        continue;
> >
> > the check should not be necessary.
> >
> >   offset = tree_to_uhwi (offset_tree);
> >
> > but in theory offset_tree might not fit an unsigned HOST_WIDE_INT, so
> > instead of checking for INTEGER_CST please check
> > !tree_fits_uhwi_p (offset_tree).  As above, all following fields will
> > also fail the check if this one fails, so you can return 0 early.
> >
> > +static unsigned
> > +get_vec_alignment_for_type (tree type)
> > +{
> > +  if (type == NULL_TREE)
> > +    return 0;
> > +
> > +  gcc_assert (TYPE_P (type));
> > +
> > +  unsigned *slot = type_align_map->get (type);
> > +  if (slot)
> > +    return *slot;
> >
> > I suggest to apply the caching only for the RECORD_TYPE case to keep
> > the size of the map low.
> >
> > Otherwise the patch looks ok now.
> I have done the suggested changes in the attached patch.
> Is this version OK to commit after bootstrap+test ?
> The patch survives cross-testing on arm*-*-* and aarch64*-*-*.
+  tree type_size_tree = TYPE_SIZE (type);
+  if (!type_size_tree)
+    return TYPE_ALIGN (vectype);
+
+  if (TREE_CODE (type_size_tree) != INTEGER_CST)
+    return TYPE_ALIGN (vectype);
+
+  /* If array has constant size, ensure that it's at least equal to
+     size of vector type.  */
+  if (!tree_fits_uhwi_p (type_size_tree))
+    return TYPE_ALIGN (vectype);
+  unsigned HOST_WIDE_INT type_size = tree_to_uhwi (type_size_tree);
+
+  tree vectype_size_tree = TYPE_SIZE (vectype);
+  if (!tree_fits_uhwi_p (vectype_size_tree))
+    return TYPE_ALIGN (vectype);
+
+  unsigned HOST_WIDE_INT vectype_size = tree_to_uhwi (vectype_size_tree);
+  return (type_size > vectype_size) ? TYPE_ALIGN (vectype) : 0;

please change this to

  if (! TYPE_SIZE (type)
      || TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST
      || tree_int_cst_lt (TYPE_SIZE (type), TYPE_SIZE (vectype)))
    return 0;
  return TYPE_ALIGN (vectype);

consistent with the handling for "VLA" records.

+  unsigned *slot = type_align_map->get (type);
+  if (slot)
+    return *slot;
+
+  unsigned max_align = 0, alignment;
+  HOST_WIDE_INT offset;
+  tree offset_tree;
+
+  if (TYPE_PACKED (type))
+    return 0;
+

move the type_align_map query after the TYPE_PACKED check.

+      /* We don't need to process the type further if offset is variable,
+         since the offsets of remaining members will also be variable.  */
+      if (TREE_CODE (DECL_FIELD_OFFSET (field)) != INTEGER_CST
+          || TREE_CODE (DECL_FIELD_BIT_OFFSET (field)) != INTEGER_CST)
+        return 0;

just break; here (and in the other case returning zero).  If a previous
record field said we'd want to align, we still should align.  Also we
want to store 0 into the type_align_map as well.

Ok with those changes.

Thanks,
Richard.