On Wed, Aug 21, 2019 at 10:10 PM Martin Sebor <mse...@gmail.com> wrote:
>
> On 8/20/19 1:26 AM, Richard Biener wrote:
> > On Tue, Aug 20, 2019 at 4:32 AM Martin Sebor <mse...@gmail.com> wrote:
> >>
> >> On 8/19/19 8:10 AM, Richard Biener wrote:
> >>> On Sat, Aug 17, 2019 at 12:43 AM Martin Sebor <mse...@gmail.com> wrote:
> >>>>
> >>>> With the recent enhancement to the strlen handling of multibyte
> >>>> stores the g++.dg/warn/Warray-bounds-4.C for zero-length arrays
> >>>> started failing on hppa (and probably elsewhere as well).  This
> >>>> is partly the result of the added detection of past-the-end
> >>>> writes into the strlen pass which detects more instances of
> >>>> the problem than -Warray-bounds.  Since the IL each warning
> >>>> works with varies between targets, the same invalid code can
> >>>> be diagnosed by one warning one target and different warning
> >>>> on another.
> >>>>
> >>>> The attached patch does three things:
> >>>>
> >>>> 1) It enhances compute_objsize to also determine the size of
> >>>> a flexible array member (and its various variants), including
> >>>> from its initializer if necessary.  (This resolves 91457 but
> >>>> introduces another warning where was previously just one.)
> >>>> 2) It guards the new instance of -Wstringop-overflow with
> >>>> the no-warning bit on the assignment to avoid warning on code
> >>>> that's already been diagnosed.
> >>>> 3) It arranges for -Warray-bounds to set the no-warning bit on
> >>>> the enclosing expression to keep -Wstringop-overflow from issuing
> >>>> another warning for the same problem.
> >>>>
> >>>> Testing the compute_objsize enhancement to bring it up to par
> >>>> with -Warray-bounds in turn exposed a weakness in the latter
> >>>> warning for flexible array members.  Rather than snowballing
> >>>> additional improvements into this one I decided to put that
> >>>> off until later, so the new -Warray-bounds test has a bunch
> >>>> of XFAILs.  I'll see if I can find the time to deal with those
> >>>> either still in stage 1 or in stage 3 (one of them is actually
> >>>> an ancient regression).
> >>>
> >>> +static tree
> >>> +get_initializer_for (tree init, tree decl)
> >>> +{
> >>>
> >>> can't you use fold_ctor_reference here?
> >>
> >> Yes, but only with an additional enhancement.  Char initializers
> >> for flexible array members aren't transformed to STRING_CSTs yet,
> >> so without the size of the initializer specified, the function
> >> returns the initializer for the smallest subobject, or char in
> >> this case.  I've enhanced the function to handle them.
> >
> > So at the moment it returns an empty array constructor, correct?
> > Isn't that the more canonical representation?  The STRING_CST
> > index type doesn't really handle "empty" strings and I expect code
> > more confused about that than about an empty CTOR?
>
> Yes.  Returning an empty CTOR is more general than an empty
> string and enables more optimizations.  It requires changing
> the caller(s) that look for a string but I think that's fine.
> Thanks for the hint!
>
> >>> +/* Determine the size of the flexible array FLD from the initializer
> >>> +   expression for the struct object DECL in which the meber is declared
> >>> +   (possibly recursively).  Return the size or zero constant if it isn't
> >>> +   initialized.  */
> >>> +
> >>> +static tree
> >>> +get_flexarray_size (tree decl, tree fld)
> >>> +{
> >>> +  if (tree init = DECL_INITIAL (decl))
> >>> +    {
> >>> +      init = get_initializer_for (init, fld);
> >>> +      if (init)
> >>> +       return TYPE_SIZE_UNIT (TREE_TYPE (init));
> >>> +    }
> >>> +
> >>> +  return integer_zero_node;
> >>>
> >>> so you're hoping that the (sub-)CONSTRUCTOR get_initializer_for
> >>> returns has a complete type but the initialized object didn't get it
> >>> completed.  Isnt that wishful thinking?
> >>
> >> I don't know what you mean.  When might a CONSTRUCTOR not have
> >> a complete type, and if/when it doesn't, why would that be
> >> a problem here?  TYPE_SIZE_UNIT will evaluate to null meaning
> >> "don't know" and that's fine.  Could you try to be more specific
> >> about the problem you're pointing out?
> >>
> >>> And why return integer_zero_node
> >>> rather than NULL_TREE here?
> >>
> >> Because the size of a flexible array member with no initializer
> >> is zero.
> >>
> >>>
> >>> +  if (TREE_CODE (dest) == COMPONENT_REF)
> >>> +    {
> >>> +      *pdecl = TREE_OPERAND (dest, 1);
> >>> +
> >>> +      /* If the member has a size return it.  Otherwise it's a flexible
> >>> +        array member.  */
> >>> +      if (tree size = DECL_SIZE_UNIT (*pdecl))
> >>> +       return size;
> >>>
> >>> because here you do.
> >>
> >> Not sure what you mean here either.  (This code was also a bit
> >
> > You return NULL_TREE.
> >
> >> out of date WRT to the patch I had tested.  Not sure how that
> >> happened.  The attached patch is up to date.)
> >>
> >>>
> >>> Also once you have an underlying VAR_DECL you can compute
> >>> the flexarray size by DECL_SIZE (var) - offset-of flexarray member.
> >>> Isn't that way cheaper than walking the initializer (possibly many
> >>> times?)
> >>
> >> It would be nice if it were this easy.  Is the value of DECL_SIZE
> >> (var) supposed to include the size of the flexible array member?
> >
> > Yes, DECL_SIZE of the VAR_DECL is the size we use for assembling.
> > It is usually only the types that remain incomplete (or too small) since
> > the FE doesn't create many variants of a struct S { int n; char x[]; }
> > when "instantiating" it via struct S s = { 5, "abcde" };  that then holds
> > true for the DECL_SIZE of the FIELD_DECL of x as well.
> >
> >> I don't see it mentioned in the comments in tree.h and in my tests
> >> it only does in C but not in C++.  Is that a bug that in C++ it
> >> doesn't?
> >
> > tree.h dcoumentation isn't complete.  And for me the C++ FE for
> > the above simple example has
> >
> >   <var_decl 0x7ffff7fedb40 s
> >      type <record_type 0x7ffff6d94f18 S cxx-odr-p type_5 type_6 BLK
> >          size <integer_cst 0x7ffff6c5e0a8 constant 32>
> >          unit-size <integer_cst 0x7ffff6c5e0c0 constant 4>
> >          align:32 warn_if_not_align:0 symtab:0 alias-set -1
> > canonical-type 0x7ffff6d94f18
> >          fields <function_decl 0x7ffff6d98b00 __dt  type <method_type
> > 0x7ffff6da8150>
> >              public abstract external autoinline decl_3 QI t.ii:1:8
> > align:16 warn_if_not_align:0 context <record_type 0x7ffff6d94f18 S>
> >              full-name "S::~S() noexcept (<uninstantiated>)"
> >              not-really-extern chain <function_decl 0x7ffff6d98d00
> > __dt_base >> context <translation_unit_decl 0x7ffff6c4a168 t.ii>
> >          full-name "struct S"
> >          X() X(constX&) this=(X&) n_parents=0 use_template=0 
> > interface-unknown
> >          pointer_to_this <pointer_type 0x7ffff6da85e8>
> > reference_to_this <reference_type 0x7ffff6da8a80> chain <type_decl
> > 0x7ffff6c6b850 S>>
> >      public static tree_1 tree_2 tree_3 BLK t.ii:3:3 size <integer_cst
> > 0x7ffff6c5e0a8 32> unit-size <integer_cst 0x7ffff6c5e0c0 4>
> >      align:32 warn_if_not_align:0 context <translation_unit_decl
> > 0x7ffff6c4a168 t.ii> initial <constructor 0x7ffff6da52a0> chain
> > <type_decl 0x7ffff6c6b850 S>>
> >
> > and we assemble it like
> >
> >          .globl  s
> >          .data
> >          .align 4
> >          .type   s, @object
> >          .size   s, 4
> > s:
> >          .long   5
> >          .string "abcde"
> >
> > note how we have .size s, 4 here...
> >
> > of course in the end nobody cares about a symbols size ...(?)
> >
> > I expected layout_decl to fixup DECL_SIZE according to the iniitalizer
> > but it looks like the initializer isn't present when it is layouted.
> >
> > I think this is a middle-end flaw we should try to fix that either in
> > the FEs or in the middle-end.  The C FE calls 
> > add_flexible_array_elts_to_size
> > and complete_flexible_array_elts in its finish_decl routine (layouting 
> > happens
> > at build_decl time, too early, but if we want to move this functionality 
> > then
> > re-layouting might be posible).  I'm not sure about other FEs but I guess
> > the easiest point we could do sanity checking is when assemble_variable
> > outputs the constructor.
>
> I don't know if I followed all this but it sounds like you're
> saying this is a separate issue.

Yeah, it looks like a separate issue to me but one that would be
nice to have fixed so we can compute the length of the trailing
array without needing to parse the initializer.  I'd even say we should
delay that effort if the above is confirmed as a bug and a fix is
possible.  Jason, is the above intended behavior?  I realize
"trailing" arrays is a GNU extension in C++ but as seen above
it doesn't match GNU C behavior here.

As said, I'm not sure if a wrong symbol size has an impact on
any target binary format we support.

> >
> >> Attached is an updated patch that uses fold_ctor_reference as you
> >> suggested.  I've also made a few other minor changes to diagnose
> >> a few more invalid strlen calls with out-of-bounds offsets.
> >> (More still remain.)
> >
> > Thanks, but the patch was already large ... it would be nice to strip it
> > down to the bare bones again and do followup adjustments instead.
>
> I'm not sure what you view as the bare bones bits but I took
> a guess and split off the changes for PR 91490 (missing folding
> of constant flexible array members) into a separate patch and
> posted that.  Once that's approved I'll update this one and post
> what's left of it.

Thanks,
Richard.

> Martin
>
> >
> > Richard.
> >
> >> Martin
> >>
> >>>
> >>> Richard.
> >>>
> >>>>
> >>>> Martin
> >>>>
> >>>> PS I imagine the new get_flexarray_size function will probably
> >>>> need to move somewhere more appropriate so that other warnings
> >>>> (like -Warray-bounds to remove the XFAILs) and optimizations
> >>>> can make use of it.
> >>
>

Reply via email to