C runtime checking for assignment of VM types, v3

Martin Uecker Mon, 15 Jul 2024 00:20:07 -0700


This is the third revision for my patch series to check bounds
consistency at run-time when assigning VM types.  Relative
to the last version, mostly the tests were simplified and some
coding style issues fixed.



It adds a new code instrumentation option that inserts
run-time checks to ensure that bounds match.  This helps
with bounds-safe programming and also finds problems in
numerical code, e.g. when bounds are swapped in
multi-dimensional arrays:

void foo(int x, int y, double m[x][y]);

double m[10][20];
foo(20, 10, m);

where we currently do not get a warning or a check.
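
To illustrate what gets compared in this example: the parameter
m is adjusted to 'double (*)[y]', so only the inner bound y can
be compared with the inner bound 20 of the argument.  The
following is a hand-written sketch of that comparison, not the
code the patch emits; the helper names inner_bound_matches and
call_would_pass are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the patch series: returns 1 when
   the inner bound y declared by the callee equals the inner bound of
   the array that is actually passed. */
static int inner_bound_matches(int y, size_t actual_inner)
{
    return y >= 0 && (size_t)y == actual_inner;
}

/* Returns 1 if a call foo(x, y, m) with m declared as double m[10][20]
   would pass the comparison, and 0 if it would not. */
static int call_would_pass(int x, int y)
{
    double m[10][20];

    (void)x;  /* outermost bound: not compared in this sketch */
    return inner_bound_matches(y, sizeof m[0] / sizeof m[0][0]);
}
```

With these definitions, call_would_pass(10, 20) yields 1, while
call_would_pass(20, 10) yields 0, which corresponds to the call
with swapped bounds shown above.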

(After updating this patch series and testing it, it
found a bug in new code added recently to my BART
project - a numerical toolbox for image reconstruction and
machine learning for magnetic resonance imaging.)



The patches in the series do:

1. Checks simple assignments.

2. In addition checks function calls under the assumption
that size expressions are declared as in the definition.
This assumption goes beyond what ISO C guarantees (and is
one reason this is not part of the UB sanitizer, the other
being that there is no library support for this use case), but
any inconsistency would usually indicate a bug anyway and
we already warn by default with -Wvla-parameter
(which now becomes really useful).

3. Checks function calls via pointers when possible.

4. Adds a warning if a function call cannot be
instrumented, e.g. because size expressions are not simply
references to other parameters or variables.  Also adds
documentation for the warning and about instrumentation
of function calls.
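
For the simple assignment case (patch 1), a minimal sketch of the
kind of code that is affected could look as follows.  The explicit
comparison is a hand-written stand-in for the inserted check (the
instrumented code would trap instead of returning 0), and the
function name assign_with_check is hypothetical.

```c
/* Assigns the address of a VLA with n elements to a pointer to an
   array with k elements.  Both n and k are assumed to be positive. */
static int assign_with_check(int n, int k)
{
    int a[n];
    int (*p)[k];

    a[0] = 0;

    /* Stand-in for the run-time check: the types int[n] and int[k]
       are accepted as compatible at compile time, but the assignment
       below is only valid if both bounds evaluate to the same value. */
    if (n != k)
        return 0;   /* the instrumented code would trap here */

    p = &a;
    (*p)[0] = 1;
    return a[0];
}
```

Here assign_with_check(3, 3) performs the assignment and returns 1,
while assign_with_check(3, 4) stops at the comparison and returns 0.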


The code is fairly simple and front end (FE) only, as during
recursive type checking in the comptypes family of functions we can
simply collect all pairs of size expressions that are supposed to
match.  This list is then further processed to emit simple checking
code.
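
A toy model of this collection step is sketched below.  It assumes
a made-up type representation: struct type, struct size_pair,
collect_size_pairs and demo_pair_count are all hypothetical, and the
real front end works on GCC's own tree nodes, which look different.

```c
#include <stddef.h>

/* Toy type representation, for illustration only. */
enum kind { T_BASE, T_POINTER, T_ARRAY };

struct type {
    enum kind kind;
    const struct type *inner;  /* pointee or element type; NULL for T_BASE */
    const char *size_expr;     /* spelling of the bound; T_ARRAY only */
};

struct size_pair {
    const char *lhs;
    const char *rhs;
};

/* Compare two types recursively.  For every pair of array types met on
   the way, record the two size expressions that are supposed to match.
   Returns the number of pairs stored so far, or -1 if the types differ
   structurally or out[] is full. */
static int collect_size_pairs(const struct type *a, const struct type *b,
                              struct size_pair *out, int max, int n)
{
    if (a->kind != b->kind)
        return -1;
    if (a->kind == T_BASE)
        return n;
    if (a->kind == T_ARRAY) {
        if (n == max)
            return -1;
        out[n].lhs = a->size_expr;
        out[n].rhs = b->size_expr;
        n++;
    }
    return collect_size_pairs(a->inner, b->inner, out, max, n);
}

/* Models passing an argument of type double (*)[20] for a parameter
   of type double (*)[y]; returns the number of pairs collected. */
static int demo_pair_count(struct size_pair *out, int max)
{
    static const struct type dbl = { T_BASE, NULL, NULL };
    static const struct type param_arr = { T_ARRAY, &dbl, "y" };
    static const struct type param_ptr = { T_POINTER, &param_arr, NULL };
    static const struct type arg_arr = { T_ARRAY, &dbl, "20" };
    static const struct type arg_ptr = { T_POINTER, &arg_arr, NULL };

    return collect_size_pairs(&param_ptr, &arg_ptr, out, max, 0);
}
```

In this model the result is a single pair ("y", "20"), which is the
kind of entry that would then be turned into a run-time comparison.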

For functions, the main complication is that we need to evaluate
size expressions in the caller.  We therefore only add
checking if all size expressions directly refer to other
declarations, and simply give up for anything more complex.
This already covers the most important use cases though.
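
What evaluating a size expression in the caller means can be
sketched by hand as follows.  This is an illustration under
assumptions, not the emitted code: scale and call_scale_checked are
hypothetical names, and the explicit comparison stands in for the
inserted check.

```c
/* Callee whose size expression refers directly to the parameter n. */
static void scale(int n, double (*a)[n])
{
    for (int i = 0; i < n; i++)
        (*a)[i] *= 2.0;
}

/* Caller with a buffer of len elements that passes n_arg for n.
   Both values are assumed to be positive. */
static int call_scale_checked(int n_arg, int len)
{
    double buf[len];

    for (int i = 0; i < len; i++)
        buf[i] = 1.0;

    /* Stand-in for the check: because the bound of *a is just the
       parameter n, the caller can evaluate it by using the value it
       passes for n and compare it with the bound of buf. */
    if (n_arg != len)
        return 0;   /* the instrumented code would trap here */

    scale(n_arg, &buf);
    return buf[0] == 2.0;
}
```

A bound that is not a plain reference to a declaration would have to
be re-evaluated from the arguments in the caller, which is the case
the series gives up on.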


The code is also useful infrastructure for future compile-time
warnings, e.g. the simple example above should clearly be
diagnosed at compile time already.

I haven't implemented this yet, but this should be simple to add
by detecting obvious cases during processing of the list of size
expressions.

Other current limitations are:

The outermost bounds for function parameters are not checked because
they are lost when the type is adjusted to a pointer. The right semantics
for checking those are also less obvious.

As mentioned above, for functions we only check very simple size
expressions that directly refer to a parameter or argument. It would
be useful to extend this to more complex expressions without side
effects, such as 'n + 1' or maybe even 'n + m'. This would then cover
most use cases in numerical code.

A bounds violation just causes a run-time trap without error message.
This is sufficient for safety and debugging with a debugger, but one
could consider adding a short error message. (This would have to go
into libgcc, I guess.)



The instrumentation is guarded by a new instrumentation flag -fvla-bounds,
but runtime overhead should generally be very low as most checks are
removed by the optimizer, e.g.

void foo(int x, char (*buf)[x])
{
 bar(x, buf);
}

does not have any overhead with -O1 (we also might want to filter out
some obvious cases already in the FE). So I think this flag could be
a good addition to -fhardened after some testing.  Maybe it could even
be activated by default.
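
Why the check can disappear in this example is easiest to see when
it is written out by hand.  The sketch below is an assumption about
the shape of the comparison, not the emitted code; abort() stands in
for the run-time trap, and bar is given a trivial body only so that
the example is complete.

```c
#include <stdlib.h>

static void bar(int x, char (*buf)[x])
{
    (*buf)[0] = 'a';
}

static void foo(int x, char (*buf)[x])
{
    /* Hand-written stand-in for the inserted check: the bound bar
       expects is its first argument, and the bound of *buf is the
       parameter x, so both sides are the same value. */
    int expected = x;
    int actual = x;

    if (expected != actual)
        abort();    /* stands in for the run-time trap */

    bar(x, buf);
}
```

Since both sides of the comparison are the same value, an optimizer
can fold the condition to false and drop the branch entirely.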


Finally, I am unsure about the use of signals in the tests.  Maybe this
is not supported everywhere?  Any recommendation is much appreciated.


Each patch was bootstrapped and regression tested on x86_64.


Martin
