On Sat, Sep 1, 2012 at 8:35 PM, John Nagle <[email protected]> wrote:
> On 9/1/2012 9:59 AM, James Dennett wrote:
>> On Fri, Aug 31, 2012 at 2:55 PM, John Nagle <[email protected]>
>> wrote:
>>> We have proposed an extension to C (primarily) and C++ (possibly)
>>> to address buffer overflow prevention. Buffer overflows are still
>>> a huge practical problem in C, and much important code is still
>>> written in C. This is a new approach that may actually work.
> ...
>> Could you say a little more of why it appears necessary to introduce
>> references into C for this? The reason I'm puzzled is that C already
>> has the ability to pass arrays in a way that preserves their size
>> (just pass the address of the array) -- what is it that references
>> change in this picture that justifies such a radical change? Could
>> we just permit pointer-to-array-of-n elements to convert to
>> pointer-to-array-of-(n-1) elements, and/or provide some way to slice
>> explicitly?
>
> That's an important point. C99 already has variable-length
> array parameters:
>
> int fn(size_t n, float vec[n]);
>
> Unfortunately, when the parameter is received in the function body,
> per N1570 §6.7.6.3p7: 'A declaration of a parameter as "array of _type_"
> shall be adjusted to "qualified pointer to _type_", where the type
> qualifiers (if any) are those specified within the [ and ] of
> the array type derivation.'
>
> What this means is that, in the body of the function,
> "vec" has type "float *", and "sizeof vec" is the size of
> a pointer. The standard currently requires losing the size
> of the array.
>
> While C99 variable-length array parameters aren't used much
> (searches of open-source code have failed to find any use
> cases, Microsoft refuses to implement them, and N1570 makes
> them optional), these semantics also apply to passing
> fixed-length arrays:
>
> int fn(float vec[4]);
>
> As before, "vec" is delivered as "float* vec". The constant
> case is widely used, and changing the semantics there might silently
> break existing code that uses "sizeof". We had a go-round on
> this on comp.std.c, and the conclusion was that changing the
> semantics of C array passing would break too much.
>
> The real reason for using references is that size information
> is needed in other places than parameters. It's needed in
> return types, on the left side of assignments, in casts, and
> in structures. References to arrays have associated information;
> pointers don't.
That's my point/question: pointers to arrays have exactly the same
information associated with them as references to arrays, and exist in
C today (and have existed in C for decades; they're just not used in
conventional idioms). You seem to be asserting something about
pointers that is not true. Granted, the pointer-to-first-element that
an array implicitly decays to in most contexts loses that information,
but that is not a pointer to the array (though for multi-dimensional
arrays, i.e., arrays of arrays, it is a pointer to a subarray and its
type information records the size of that subarray).
A concrete example: Given the declaration
void fn(int (*pointer_to_array)[5]);
inside fn, pointer_to_array has type pointer to array of 5 ints, and
sizeof(*pointer_to_array) is 5*sizeof(int). Callers can't pass in an
array of the wrong size without a cast. Unfortunately right now they
can't even pass in a _larger_ array without a cast, though I think
that would be relatively minor surgery to the languages (C and C++).
A call to fn looks like fn(&array), e.g.,

  int main(void) {
      int array[5];
      fn(&array);
  }
Moving to references instead of pointers seems to give little except
avoiding the need to write "&", which C programmers are already
comfortable with.
Maybe what you need is an extension of VLA-like syntax to something more like
void fn(int (*pointer_to_array)[n], size_t n)
though while C has gone down the path of allowing sizeof(variable) to
be evaluated at run time, C++ has not and would, I expect, be likely to
resist such a change.
-- James