On 10/20/2015 09:48 AM, Bernd Schmidt wrote:


On 10/20/2015 05:31 PM, Martin Sebor wrote:
On 10/20/2015 07:20 AM, Bernd Schmidt wrote:
On 10/16/2015 09:34 PM, Martin Sebor wrote:
Thank you for the review. Attached is an updated patch that hopefully
addresses all your comments. I ran the check_GNU_style.sh script on
it to make sure I didn't miss something.  I've also added replies to
a few of your comments below.

Ok, thanks. However - the logic around the context struct in the patch
still seems fairly impenetrable to me, and I've been wondering whether a
simpler approach wouldn't work just as well. I came up with the patch
below, which just passes a tree_code context to recursive invocations
and increasing the upper bound by one only if not in an ARRAY_REF or
COMPONENT_REF context.

As far as I can tell, this produces the same warnings on the testcase
(modulo differences in type printout), except for the final few FA5_7
testcases, which I had doubts about anyway:

typedef struct FA5_7 {
   int i;
   char a5_7 [5][7];
} FA5_7;

     __builtin_offsetof (FA5_7, a5_7 [0][7]),         // { dg-warning
"index" }
     __builtin_offsetof (FA5_7, a5_7 [1][7]),         // { dg-warning
"index" }
     __builtin_offsetof (FA5_7, a5_7 [5][0]),         // { dg-warning
"index" }
     __builtin_offsetof (FA5_7, a5_7 [5][7]),         // { dg-warning
"index" }

Why wouldn't at least the first two be convered by the
one-past-the-end-is-ok rule?

Thanks for taking the time to try to simplify the patch!

Diagnosing the cases above is at the core of the issue (and can
be seen by substituting a5_7 into the oversimplified test case
in the bug).

In a 5 by 7 array, the offset computed for the out-of-bounds
a_5_7[0][7] is the same as the offset computed for the in-bounds
a_5_7[1][0].  The result might be unexpected because it suggests
that two "distinct" elements of an array share the same offset.

Yeah, but Joseph wrote:

   Given such a flexible array member, [anything][0] and
   [anything][1] are logically valid if [anything] is
   nonnegative (of course, the array might or might not
   have enough element at runtime).  [anything][2] is a
   reference to just past the end of an element of the
   flexible array member; [anything][3] is clearly invalid.

   Regarding the actual testcase, accepting a1_1[0][1] seems
   fine (it's referencing the just-past-end element of the
   valid array a1_1[0]).

So how's that different from a5_7[0][7]? In case it matters, a1_1 in the
earlier test that was discussed could not be a flexible array because it
was followed by another field in the same struct.

Let me try to explain using a couple of examples. Given

  struct S {
      a1_1 [1][1];
      int x;
  } s;

where a1_1 is an ordinary array of a single element

  offseof (struct S, a1_1[0][1])

is accepted as a GCC extension (and, IMO, should be made valid
in C) because a) there is no way to mistake the offset for the
offset of an element of the array at a valid index, and because
b) it yields to the same value as the following valid expression:

  (char*)&s.a1_1[0][1] - (char*)&s.a1_1;

However, given

  struct S {
      a5_7 [5][7];
      int x;
  } s;

the invalid offsetof expression

  offseof (struct S, a5_7[0][7])

should be diagnosed because a) it yields the same offset of the
valid

  offseof (struct S, a5_7[1][0])

and b) because s.a5_7[0][7] is not a valid reference in any context
(i.e., (char*)&s.a5_7[0][7] - (char*)&s.a5_7 is not a valid expression).


Handing over review duties to Joseph because he'll have to decide what
to warn about and what not to.

    struct S {
        int A [5][7];
        int x;
    } s;

these should both be diagnosed:

    int i = offsetof (struct S, A [0][7]);

    int *p = &s.A [0][7];

because they are both undefined and both can lead to surprising
results when used.

Is that really undefined? I thought it was explicitly allowed.
Dereferencing the pointer is what's undefined.

Both are undefined. The most succinct statements to this effect are
in the Unde3fined Behavior section of Annex J:

  An array subscript is out of range, even if an object is apparently
  accessible with the given subscript (as in the lvalue expression
  a[1][7] given the declaration int a[4][5]) (6.5.6).

  The member designator parameter of an offsetof macro is an
  invalid right operand of the . operator for the type parameter,
  or designates a bit-field (7.19).

Martin

Reply via email to