On Sat, Sep 1, 2012 at 8:35 PM, John Nagle <[email protected]> wrote:
> On 9/1/2012 9:59 AM, James Dennett wrote:
>> On Fri, Aug 31, 2012 at 2:55 PM, John Nagle <[email protected]>
>> wrote:
>>> We have proposed an extension to C (primarily) and C++ (possibly)
>>> to address buffer overflow prevention. Buffer overflows are still
>>> a huge practical problem in C, and much important code is still
>>> written in C. This is a new approach that may actually work.
> ...
>> Could you say a little more of why it appears necessary to introduce
>> references into C for this? The reason I'm puzzled is that C already
>> has the ability to pass arrays in a way that preserves their size
>> (just pass the address of the array) -- what is it that references
>> change in this picture that justifies such a radical change? Could
>> we just permit pointer-to-array-of-n elements to convert to
>> pointer-to-array-of-(n-1) elements, and/or provide some way to slice
>> explicitly?
>
> That's an important point. C99 already has variable-length
> array parameters:
>
> int fn(size_t n, float vec[n]);
>
> Unfortunately, when the parameter is received in the function body,
> per N1570 §6.7.6.3p7: 'A declaration of a parameter as "array of _type_"
> shall be adjusted to "qualified pointer to _type_", where the type
> qualifiers (if any) are those specified within the [ and ] of
> the array type derivation.'
>
> What this means is that, in the body of the function,
> "vec" has type "float *", and "sizeof vec" is the size of
> a pointer. The standard currently requires losing the size
> of the array.
>
> While C99 variable-length array parameters aren't used much
> (searches of open-source code have failed to find any use
> cases, Microsoft refuses to implement them, and N1570 makes
> them optional), these semantics also apply to passing
> fixed-length arrays:
>
> int fn(float vec[4]);
>
> As before, "vec" is delivered as "float* vec". The constant
> case is widely used, and changing the semantics there might silently
> break existing code that uses "sizeof". We had a go-round on
> this on comp.std.c, and the conclusion was that changing the
> semantics of C array passing would break too much.
>
> The real reason for using references is that size information
> is needed in other places than parameters. It's needed in
> return types, on the left side of assignments, in casts, and
> in structures. References to arrays have associated information;
> pointers don't.
That's my point/question: pointers to arrays have exactly the same
information associated with them as references to arrays, and exist in
C today (and have existed in C for decades; they're just not used in
conventional idioms). You seem to be asserting something about
pointers that is not true. Granted, the pointer-to-first-element that
an array implicitly decays to in most contexts loses that information,
but that is not a pointer to the array (though for multi-dimensional
arrays, i.e., arrays of arrays, it is a pointer to a subarray and its
type information records the size of that subarray).
A concrete example: Given the declaration
void fn(int (*pointer_to_array)[5]);
inside fn, pointer_to_array has type pointer to array of 5 ints, and
sizeof(*pointer_to_array) is 5*sizeof(int). Callers can't pass in an
array of the wrong size without a cast. Unfortunately right now they
can't even pass in a _larger_ array without a cast, though I think
that would be relatively minor surgery to the languages (C and C++).
A call to fn looks like fn(&array), e.g.,

  int main(void) {
      int array[5];
      fn(&array);
  }
Moving to references instead of pointers seems to give little except
avoiding the need to write "&", which C programmers are already
comfortable with.
Maybe what you need is an extension of VLA-like syntax to something more like
void fn(int (*pointer_to_array)[n], size_t n)
though while C has gone down the path of allowing sizeof(variable) to
be evaluated at run time, C++ has not and would, I expect, be likely to
resist such a change.
-- James