On 9/1/2012 9:59 AM, James Dennett wrote:
> On Fri, Aug 31, 2012 at 2:55 PM, John Nagle <na...@animats.com> wrote:
>> We have proposed an extension to C (primarily) and C++ (possibly)
>> to address buffer overflow prevention.  Buffer overflows are still
>> a huge practical problem in C, and much important code is still
>> written in C.  This is a new approach that may actually work.
...
> Could you say a little more of why it appears necessary to introduce
> references into C for this?  The reason I'm puzzled is that C already
> has the ability to pass arrays in a way that preserves their size
> (just pass the address of the array) -- what is it that references
> change in this picture that justifies such a radical change?  Could
> we just permit pointer-to-array-of-n elements to convert to
> pointer-to-array-of-(n-1) elements, and/or provide some way to slice
> explicitly?
That's an important point.  C99 already has variable-length array
parameters:

    int fn(size_t n, float vec[n]);

Unfortunately, when the parameter is received in the function body,
per N1570 §6.7.6.3p7:

    'A declaration of a parameter as "array of _type_" shall be
    adjusted to "qualified pointer to _type_", where the type
    qualifiers (if any) are those specified within the [ and ] of
    the array type derivation.'

What this means is that, in the body of the function, "vec" has type
"float *", and "sizeof vec" is the size of a pointer.  The standard
currently requires losing the size of the array.

While C99 variable-length array parameters aren't used much (searches
of open-source code have failed to find any use cases, Microsoft
refuses to implement them, and N1570 makes them optional), the same
semantics also apply to passing fixed-length arrays:

    int fn(float vec[4]);

As before, "vec" is delivered as "float *vec".  The constant case is
widely used, and changing the semantics there might silently break
existing code that uses "sizeof".  We had a go-round on this on
comp.std.c, and the conclusion was that changing the semantics of C
array passing would break too much.

The real reason for using references is that size information is
needed in places other than parameters.  It's needed in return types,
on the left side of assignments, in casts, and in structures.
References to arrays carry their size as part of the type; pointers
don't.

As for slicing, see "array_slice" in the paper.  It's not a built-in;
it's a macro that uses "decltype" and a cast to generate the
appropriate result type.  Personally, I'd like to have a Python-like
slicing notation:

    arr[start:endplus1]

but that's not essential to the proposal, so I'm not suggesting it.

> Of course to make this succeed you'll need buy-in from implementors
> and of the standards committee(s), who will need to trust that the
> other (and therefore that users) will find this worth the cost.  It
> generally takes a lot of work (in terms of robust specification and
> possibly implementation in a fork of an open source compiler or two)
> to generate the consensus necessary for a proposal to succeed.
> Something that might ultimately seek to change or even disallow much
> existing C code has an even higher bar -- getting an ISO committee to
> remove existing support is no small achievement (e.g., look at how
> long gets() persisted).  I'd love to see a reduction in the number of
> buffer overruns that are present in code, but it's an uphill
> struggle.

Of course.  Support may come from the security community.  CERT still
reports buffer overflows, usually in C/C++ code, as the single biggest
source of vulnerabilities.

Vulnerabilities in software are now a public-policy issue.  In the
last week, software attacks have taken down Saudi Aramco and RasGas,
two of the world's largest energy producers.  This issue is growing in
importance as "info-war" moves from a potential threat to reality.
It's now something that has to be fixed.

                John Nagle
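
To make the distinction above concrete, here is a minimal C++ sketch
(C++, because references to arrays already exist there; the proposal
would give C an equivalent).  It shows the parameter-adjustment rule
losing the array size, and a reference-to-array parameter keeping it.
The function names are illustrative only, not from the proposal.

    #include <cstdio>

    // Array parameter: adjusted to "float *vec", so inside the
    // function sizeof vec is the size of a pointer, not
    // 4 * sizeof(float).
    void by_pointer(float vec[4]) {
        std::printf("by_pointer:   sizeof vec = %zu\n", sizeof vec);
    }

    // Reference to array of 4 floats: the length is part of the type,
    // so sizeof vec is 4 * sizeof(float), and passing an array of any
    // other length is rejected at compile time.
    void by_reference(float (&vec)[4]) {
        std::printf("by_reference: sizeof vec = %zu\n", sizeof vec);
    }

    int main() {
        float a[4] = {1, 2, 3, 4};
        by_pointer(a);      // size information lost inside the callee
        by_reference(a);    // size information preserved in the type
        // float b[3] = {1, 2, 3};
        // by_reference(b); // would not compile: wrong array length
        return 0;
    }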