Hopefully this will all go right this time. If mods can delete the
other 3 posts I'd be much obliged. I just needed to put all these
files in the same post, and gmail is being a nuisance.

RES is my project to test (at compile time) whether C++ code can throw
unhandled exceptions.

I tidied up my source, made a patch, wrote and tested some testcases,
and wrote some documentation.

There should be three attachments here:
- res.diff
- res.txt
- test.tar.gz

I've never made patch files before, but hopefully this one is OK.
I used [diff -Naur original/gcc-3.4.2/gcc res/gcc-3.4.2/gcc > res.diff]
I tested it on a fresh gcc-3.4.2 extract and it worked with [patch -p1
< res.diff]

For a quick test, just call gcc with -Wres (or -Wres-debug to add the
debug info).
I haven't tested it on much real-world code, so there's a chance it may
still have bugs and segfault or something. It's still pre-alpha,
though.

I tried to use GCC coding style but I didn't see an 80 character/line
limit anywhere in the document, so I didn't use any limit.
Perhaps someone will bite me for this or perhaps you'll all celebrate
the beginning of a new era of sensible unlimited lines :) anyway it's
pre-alpha so it hardly matters at present - it's just far easier for
me to read.

Testcases:
Unzip test.tar.gz and read test/tests.txt for instructions.
All the testcases except the last work as they should.

Attachment: res.diff.tar.gz
Description: GNU Zip compressed data

RES: Restrictive Exception Specification.
An extension to G++, the C++ compiler from GCC.
By Simon Hill.
Under the same license as GCC.



========
CONTENTS
========
Overview
RES Principles
Usage
Whitelisting
XES
TT
CNR
MES
Implementation
Comparison with EDoc++
C++ extensions to aid RES
Design Flaws
Todo



========
OVERVIEW
========
RES is a mechanism to provide a warning whenever code is compiled that may lead 
to an unhandled exception.
RES is currently pre-alpha, i.e. under construction.

eg:
~~~~~~~~~~~~~~~~~~~~~
foo.cpp:
~~~~~~~~~~~~~~~~~~~~~
void foo() throw(int)
{
  throw 1.5f;
}
~~~~~~~~~~~~~~~~~~~~~
> gcc -c foo.cpp -Wres
foo.cpp:1: warning: RES: ‘void foo() throw (int)’ may terminate due to the 
following uncaught and unspecified exceptions:
foo.cpp:3: note: RES: ‘float’ from here.
~~~~~~~~~~~~~~~~~~~~~

RES is invoked by -Wres. Whitelisting rules are passed with -Wres-l=<rules>. See [Whitelisting] for info on <rules>.

RES considers exception propagation through try/catch blocks, and analyses the 
exception specifications of called functions.

Associated mechanisms:
RES comes with four complementary but probably-less-useful mechanisms:
- XES (excessive exception specification), invoked by -Wres-xes. See [XES].
- TT (throw-terminate), invoked by -Wres-tt. See [TT].
- CNR (catch(...) no rethrow), invoked by -Wres-cnr. See [CNR].
- MES (missing exception specification), invoked by -Wres-mes. See [MES].
Note: If any of RES, XES or TT is used, most exception checking routines are 
enabled. (FIXME:Check) I doubt these affect compilation time significantly.

Note: RES by default ignores calls to functions declared in system includes 
(eg: <vector>). However, this can be changed by using -Wres-l=<rules>. See 
[Whitelisting].



==============
RES PRINCIPLES
==============
The RES project was designed to uphold design principles that I and others seem 
to have developed independently, which I shall here call RES Principles.

I believe it should be fully possible for a C++ compiler to check exception 
propagation at compile time.
There has been much discussion on whether this should be so, including lots on 
the gcc@gcc.gnu.org mailing list.
The main issue is template code. Unfortunately, with the current C++ 
specification, template code cannot work properly with RES. However I suggest 
some minor changes to the spec in [C++ extensions to aid RES].


My RES Principles are:
1) Every thrown type that may propagate to the function boundary should be in 
the function specification.
   - I.e., no unhandled thrown exceptions.
2) Every called function (whether explicit, implicit, function pointer or 
otherwise) should be considered able to throw all types in its exception 
specification, regardless of the callee's implementation.
   - I.e., calls should be treated the same as a sequence of throws of the types 
in the callee's specification.
3) Reachability of throws and calls is not considered. Non-reachable throws of 
types that would otherwise lead to unhandled exceptions are not allowed.
4) There should be no throw-terminates. (Re-throws outside of catch blocks).
5) Every thrown type should be caught explicitly. All catch(...) blocks should 
re-throw.
6) Every function should have an explicit exception specification (which may be 
a no-throw specification).
7) main() and other entrypoints should be defined as throw();

The RES project currently warns on a subset of principle violations, as many 
people may not agree with #5, #6 & #7. #5 & #6 can be enabled by -Wres-cnr and 
-Wres-mes respectively.

RES deliberately misinterprets the intent of exception specifications as 
indications of what exception types should be allowed to propagate to the 
function boundary, instead of simply instructions to the compiler that alter 
compiled output.
With fully RES code, exception specifications would not need to influence 
compiler output to remain C++ compliant.

Developing with RES may require you to change your coding style.



=====
USAGE
=====
-Wres enables RES.
-Wres-l=<rules> Whitelisting for RES warnings. For information on <rules>, see 
[Whitelisting].
-Wres-xes enables XES. See [XES]
-Wres-tt enables TT. See [TT]
-Wres-cnr checks that all catch(...) blocks have rethrows. See [CNR]
-Wres-mes checks that all functions have exception specifications. See [MES]
-Wres-bl-unk blacklists declarations not inside files. See [Whitelisting].
-Wres-debug emits debugging information as additional warnings.



============
WHITELISTING
============
Whitelisting has been added to allow #including libraries (eg std) that don't 
use the restrictive exception specification concept without generating unwanted 
warnings.
Whitelisting prevents checking of calls to functions declared in whitelisted 
files. It does not prevent checking of calls originating from within those 
files. (Internal: Uses DECL_SOURCE_FILE(function_decl))
By default, all system-path files are whitelisted.
Files (eg system-path files) can also be blacklisted by this mechanism. Calls to 
functions declared in blacklisted files will be checked.

Usage:
Whitelisting is invoked by: -Wres-l=<rules>
<rules> is a set of filenames/paths separated by commas. eg: -Wres-l=foo.h,bar.h
If more than one rule matches the file, only the first matching rule is used.
Note: -Wres-l can be used more than once, adding additional rules.

Special Characters:
- Blacklist: prefix minus (-).
  eg: [-foo.h]
  Indicates that all matches to this rule should emit warnings.
  If omitted, rule is a whitelist and matches are prevented from emitting 
warnings.

- System: prefix caret (^).
  eg: [^vector]
  Indicates a system-path include, ie #included with angle brackets (eg: <foo>) 
as opposed to quotes (eg: "foo").
  Note: Files residing inside the system path are considered system files even 
if they are #included using quotes. Eg: [#include "/usr/include/vector"].

- Partial: suffix plus (+).
  eg: [dir1/f+].
  Indicates that this rule matches files beginning with the string, but may 
have further characters.
  If not used, the rule must be an exact match.

- Separator: (,).

- Escape character: percent (%).
  Treats the next character as a literal even if it is one of the above 
characters.
  Does not work for space or newline, use \ to escape these.

Note: When adding rules blacklisting system files, use [-^] not [^-].

A full example may be:
gcc foo.cpp -Wres-l=foo.h,^std+,-^+,-f+,fip.h,%^%,%+ -Wres-l=bar.h 
-Wres-l=zug.h
calls to functions declared in:
["foo.h"] are ignored. [foo.h]
["fip.h"] are checked. [-f+]
[<stdlib.h>] are ignored. [^std+]
[<vector>] are checked. [-^+]
["bar.h"] are ignored. [bar.h]
["zug.h"] are ignored. [zug.h]
["^,+"] are ignored. [%^%,%+]
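
The rule syntax above can be modelled outside GCC. The following is only a rough sketch and not the code in gcc/c-opts.c: the names Rule, add_rules and is_checked are made up for illustration, files are identified by the name written in the #include rather than by a full path, and the default for user files (checked) is my reading of the text above.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one -Wres-l rule.
struct Rule {
  bool blacklist = false;  // prefix '-'
  bool system = false;     // prefix '^'
  bool partial = false;    // suffix '+'
  std::string text;        // file name, or leading part of it if partial
};

// Appends the rules of one -Wres-l=<rules> argument to `rules`.
void add_rules(std::vector<Rule>& rules, const std::string& arg)
{
  std::vector<std::pair<char, bool>> cur;  // (character, was it escaped?)
  auto flush = [&]() {
    if (cur.empty()) return;
    auto is = [&](std::size_t i, char c) {
      return !cur[i].second && cur[i].first == c;
    };
    Rule r;
    std::size_t b = 0, e = cur.size();
    if (b < e && is(b, '-')) { r.blacklist = true; ++b; }
    if (b < e && is(b, '^')) { r.system = true; ++b; }
    if (b < e && is(e - 1, '+')) { r.partial = true; --e; }
    for (; b < e; ++b) r.text += cur[b].first;
    rules.push_back(r);
    cur.clear();
  };
  for (std::size_t i = 0; i < arg.size(); ++i) {
    char c = arg[i];
    if (c == '%' && i + 1 < arg.size())
      cur.push_back({arg[++i], true});  // escaped: always a literal
    else if (c == ',')
      flush();                          // unescaped comma ends a rule
    else
      cur.push_back({c, false});
  }
  flush();
}

// True if calls to functions declared in `file` would be checked.
bool is_checked(const std::vector<Rule>& rules, const std::string& file,
                bool is_system)
{
  for (const Rule& r : rules) {
    if (r.system != is_system) continue;
    bool match = r.partial ? file.compare(0, r.text.size(), r.text) == 0
                           : file == r.text;
    if (match) return r.blacklist;      // only the first match is used
  }
  return !is_system;                    // assumed default: system files ignored
}
```

Feeding it the rules of the full example reproduces the table above: "fip.h" is checked because [-f+] comes before [fip.h], and <vector> is checked because [^std+] does not match it but [-^+] does.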

Some useful rules:
[-^+]: Blacklists all system files.
[+]: Whitelists all user files.

Declarations that don't exist in files, such as builtins, are whitelisted by 
default.
-Wres-bl-unk blacklists these declarations.

Ultimately, if fully RES compliant system libraries are made, -Wres-l=-^+ 
should be OK to use.



===
XES
===
XES is an associated mechanism, for warning about Excessive Exception 
Specification. (X is used instead of E because -Wres-xes is better than the 
alternative.)

eg:
~~~~~~~~~~~~~~~~~~~~~
foo.cpp:
~~~~~~~~~~~~~~~~~~~~~
void foo() throw(int)
{
  throw 1.5f;
}
~~~~~~~~~~~~~~~~~~~~~
> gcc -c foo.cpp -Wres-xes
foo.cpp:1: warning: XES: ‘void foo() throw (int)’s exception specification 
includes ‘int’, yet this type can't be thrown past it.
~~~~~~~~~~~~~~~~~~~~~

XES is invoked by -Wres-xes. See [Whitelisting] for info on <rules>.
XES also warns when catch blocks won't catch anything.

Due to lists not storing a type and more-derived versions of the same type at 
the same time, XES is over-optimistic (produces fewer warnings than it should).
Since XES isn't particularly useful, this is ignored.



==
TT
==
TT is another associated mechanism, for warning about throw-terminates 
(FIXME:Use correct jargon).
Throw-terminates are throw; statements (without an explicit thrown type) that 
are not catch rethrows.
Throw-terminates cause the program to terminate immediately.
Technically, they're either outside all catch blocks or the deepest try block 
they reside in is deeper than the deepest catch block. (FIXME:Check C++ spec).

eg:
~~~~~~~~~~~~~~~~~~~~~
foo.cpp:
~~~~~~~~~~~~~~~~~~~~~
void foo() throw(int)
{
  throw;
}
~~~~~~~~~~~~~~~~~~~~~
> gcc -c foo.cpp -Wres-tt
foo.cpp:3: warning: TT: Throw-terminate here.
~~~~~~~~~~~~~~~~~~~~~

TT is invoked by -Wres-tt.
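
As a runtime illustration of why throw-terminates matter (this is not part of the patch): a bare throw; with no exception being handled calls std::terminate(). The sketch below guards the rethrow with std::current_exception(), which is null when nothing is being handled. The helper names are hypothetical.

```cpp
#include <exception>

// Hypothetical helper: rethrows the exception currently being handled.
// Returns false instead of reaching `throw;` when nothing is being handled,
// which is the situation a throw-terminate would hit.
bool rethrow_if_active()
{
  if (!std::current_exception())
    return false;  // a bare `throw;` here would call std::terminate()
  throw;           // rethrows the exception currently being handled
}

// Classifies the exception being handled, if any, by rethrowing it.
// 0: nothing active, 1: int, 2: anything else.
int classify_active()
{
  try {
    rethrow_if_active();
    return 0;
  } catch (int) {
    return 1;
  } catch (...) {
    return 2;
  }
}

int demo_int()
{
  int r = 0;
  try { throw 42; } catch (...) { r = classify_active(); }
  return r;
}

int demo_float()
{
  int r = 0;
  try { throw 1.5f; } catch (...) { r = classify_active(); }
  return r;
}
```

Called outside any handler, classify_active() returns 0 rather than terminating. Called from a catch block, the same throw; statement rethrows normally, so whether a rethrow terminates depends on where it runs and not only on where it is written.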



===
CNR
===
CNR is a further associated mechanism, for warning about catch(...) blocks 
without rethrows.

eg:
~~~~~~~~~~~~~~~~~~~~~
foo.cpp:
~~~~~~~~~~~~~~~~~~~~~
void foo() throw(int)
{
  try {}
  catch(...) {}
}
~~~~~~~~~~~~~~~~~~~~~
> gcc -c foo.cpp -Wres-cnr
foo.cpp:3: warning: CNR: <FIX>
~~~~~~~~~~~~~~~~~~~~~

CNR is invoked by -Wres-cnr.
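
For contrast, here is a small sketch of a catch(...) block that follows RES principle #5: it only cleans up and then rethrows, so nothing is swallowed. The names g_locked, risky and guarded are invented for this illustration.

```cpp
#include <stdexcept>

// Hypothetical resource flag, used only for this illustration.
static bool g_locked = false;

// Stand-in for any operation that may throw.
void risky(bool fail)
{
  if (fail) throw std::runtime_error("failed");
}

// catch(...) is used only for cleanup and always rethrows.
void guarded(bool fail)
{
  g_locked = true;
  try {
    risky(fail);
  } catch (...) {
    g_locked = false;  // cleanup on the error path
    throw;             // rethrow inside the catch block
  }
  g_locked = false;    // cleanup on the normal path
}
```

A caller of guarded(true) still sees the std::runtime_error, and the flag is reset on both paths.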



===
MES
===
MES is yet another associated mechanism, for warning about function 
declarations without exception specifications.

eg:
~~~~~~~~~~~~~~~~~~~~~
foo.cpp:
~~~~~~~~~~~~~~~~~~~~~
void foo()
{}
~~~~~~~~~~~~~~~~~~~~~
> gcc -c foo.cpp -Wres-mes
foo.cpp:1: warning: MES: <FIX>
~~~~~~~~~~~~~~~~~~~~~

MES is invoked by -Wres-mes.



==============
IMPLEMENTATION
==============
I tried to keep RES as separate as possible from the rest of GCC, to be modular.
Thus the only things I add to existing functions are calls to RES and a few 
variables.
Without RES active, there's virtually no change to G++'s execution - only a few 
predictable if() statements now and then.
All my declarations are inside the newly created [gcc/res.h].

Trees & Lists
~~~~~~~~~~~~~
Three trees follow the state of the parser:
res_throwable is a TREE_LIST to which all thrown types are added, whether from 
throws or function call specifications.
res_catchable is a TREE_LIST of types catchable by catch-blocks in the current 
try-catch segment.
- If outside all catch blocks, or if a try block is deeper, this is 
void_type_node.
- After catch(...) this is void_type_node until a rethrow is seen, when it 
becomes NULL_TREE.
res_caught is the type caught by the deepest catch block.
- If outside all catch blocks, or if a try block is deeper, this is 
void_type_node.
- On catch(...), this is a TREE_LIST of everything that was previously in 
res_catchable. (Possibly ... or nothing).
Each tree node has a type (TREE_TYPE), throwing-object (TREE_PURPOSE) and 
location_t (TREE_VALUE).
NULL_TREE indicates an empty list.
A single-node TREE_LIST whose TREE_TYPE() is NULL_TREE indicates ...
No list can store both a type and a less-derived version of that same type, as 
catching the less-derived version also catches the original type.
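
The last rule can be sketched without GCC's trees. This is a toy model and not the patch's res_add_type (): types are plain names, the class hierarchy is a hypothetical name-to-base map, and I assume the less-derived type is the one that stays in the list.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical hierarchy: maps a type name to its direct base ("" if none).
using Bases = std::map<std::string, std::string>;

// True if `base` is `type` itself or one of its transitive bases.
bool same_or_base(const Bases& bases, const std::string& base, std::string type)
{
  while (!type.empty()) {
    if (type == base) return true;
    auto it = bases.find(type);
    type = (it == bases.end()) ? "" : it->second;
  }
  return false;
}

// Adds `type` while keeping the invariant that no entry is the same as, or
// derived from, another entry.
void add_type(const Bases& bases, std::vector<std::string>& list,
              const std::string& type)
{
  for (const auto& t : list)
    if (same_or_base(bases, t, type)) return;  // already covered by t
  // `type` is less derived than some entries: those entries are dropped.
  list.erase(std::remove_if(list.begin(), list.end(),
                            [&](const std::string& t) {
                              return same_or_base(bases, type, t);
                            }),
             list.end());
  list.push_back(type);
}
```

Adding Derived and then Base leaves only Base in the list, which is also why XES can miss a warning about Derived.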

Method
~~~~~~
On entering a function or try-block, the previous res_throwable, res_catchable 
and res_caught trees are temporarily stored, to be recovered on exit.
On exiting a try-block, what was res_throwable becomes res_catchable. The 
previous res_throwable is recovered.
On entering a catch-block, the caught type is removed from res_catchable and 
stored in res_caught.
- For catch(...), the entire res_catchable is moved to res_caught.
On exiting a try-catch segment, all types remaining in res_catchable are added 
to res_throwable. The previous res_throwable and res_catchable are recovered.
On exiting a function, types in res_throwable are checked vs the function's 
exception specification.
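
The steps above can be simulated with plain sets. The following is a simplified sketch and not the code in gcc/cp/except.c: ResTracker and its member names are invented, it covers a single function, types are matched by exact name only, and the handling of a rethrow (the caught types become throwable again) is my assumption.

```cpp
#include <set>
#include <string>
#include <vector>

using TypeSet = std::set<std::string>;

// Toy stand-in for res_throwable / res_catchable / res_caught.
class ResTracker {
public:
  void on_throw(const std::string& type) { throwable_.insert(type); }

  // A call is treated as a throw of every type in the callee's specification.
  void on_call(const TypeSet& callee_spec) {
    throwable_.insert(callee_spec.begin(), callee_spec.end());
  }

  // Assumption: a rethrow makes the currently caught types throwable again.
  void on_rethrow() { throwable_.insert(caught_.begin(), caught_.end()); }

  void on_begin_try() {
    saved_.push_back({throwable_, catchable_, caught_});
    throwable_.clear();
  }

  void on_end_try() {
    catchable_ = throwable_;               // what the try-block may throw
    throwable_ = saved_.back().throwable;  // recover the outer list
  }

  void on_begin_catch(const std::string& type) {
    caught_.clear();
    if (type == "...") {
      caught_.swap(catchable_);            // catch(...) takes everything
    } else if (catchable_.erase(type) > 0) {
      caught_.insert(type);
    }
  }

  void on_end_handlers() {
    // Whatever no handler caught propagates outwards.
    throwable_.insert(catchable_.begin(), catchable_.end());
    catchable_ = saved_.back().catchable;
    caught_ = saved_.back().caught;
    saved_.pop_back();
  }

  // Types that would reach the function boundary but are not in `spec`.
  TypeSet unhandled(const TypeSet& spec) const {
    TypeSet out;
    for (const auto& t : throwable_)
      if (spec.count(t) == 0) out.insert(t);
    return out;
  }

private:
  struct Saved { TypeSet throwable, catchable, caught; };
  TypeSet throwable_, catchable_, caught_;
  std::vector<Saved> saved_;
};
```

For example, a try-block that throws float and calls a function specified as throw(int, double), followed by catch(float) {} and catch(double) { throw; }, leaves int and double throwable. Checked against a specification of throw(int), only double is reported.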

Internal Functions
~~~~~~~~~~~~~~~~~~
res_check_res () : Checks RES at end of function.
res_check_xes () : Checks XES at end of function.
res_add_type () : Adds a type to a list.
res_add_any () : Makes a list ...
res_merge () : Moves all types on one list to the other.
res_remove () : Removes a type from a list.
res_empty () : Empties a list.
res_warnlist () : Dumps all types as warnings, for debugging.
res_prep_add_type () : Assists res_add_type () and res_merge ().
res_make_node () : Makes a new TREE_LIST node in the format stated above.

Source Files
~~~~~~~~~~~~
Temporarily, all modifications are labelled with /* <[SPH]> */
1 file has been added:
- gcc/res.h
5 files have been modified:
- gcc/c.opt
- gcc/c-opts.c
- gcc/cp/except.c
- gcc/cp/parser.c
- gcc/cp/semantics.c
The bulk of the implementation is in gcc/cp/except.c, at the end.
The whitelist implementation is at the end of gcc/c-opts.c.
gcc/cp/parser.c & gcc/cp/semantics.c just call my new functions.
c.opt just lists -Wres, -Wres-l=, -Wres-tt, -Wres-xes

GCC functions called by RES
~~~~~~~~~~~~~~~~~~~~~~~~~~~
- (FIXME:Some of these may not do exactly what I think they do - it's difficult 
to check but they seem to work.)
1) To get the thrown type of a throw expression, I use lvalue_type(expression).
2a) To get the exception specification from a function declaration, I use 
first_node = TYPE_RAISES_EXCEPTIONS(TREE_TYPE(function)).
2b) To parse an exception specification I use TREE_VALUE(node) to get the node 
type, and TREE_CHAIN(node) to get the next node.
3) To get the caught type of a handler tree, I use HANDLER_TYPE(handler).
4) To check whether one type is the same or derived from another, I use 
same_or_base_type_p(type1, type2).
5) To build TREE_LIST nodes I use tree_cons(purpose, value, chain).
6) To get the source file from a location_t, I use DECL_SOURCE_FILE(location).

RES functions called by GCC
~~~~~~~~~~~~~~~~~~~~~~~~~~~
cp/parser.c:15589 : cp_parser_try_block ()
- res_on_begin_try ()
- res_on_end_try ()
- res_on_end_handlers ()
- Temporarily stores tree lists for try-block nesting.
cp/parser.c:15681 : cp_parser_handler ()
- res_on_begin_catch ()
- res_on_end_catch ()
cp/parser.c:15767 : cp_parser_throw_expression ()
- res_on_throw ()
cp/parser.c:16799 : cp_parser_function_definition_after_declarator ()
- res_on_begin_function ()
- res_on_end_function ()
- Temporarily stores tree lists for function nesting.
cp/semantics.c:1840 : finish_call_expr ()
- res_on_call_function ()



======================
COMPARISON WITH EDOC++
======================
EDoc++ (http://edoc.sourceforge.net/) is a project with a similar purpose.
It aims to check the compiler output to see whether any thrown exceptions can 
propagate past exception specifications that don't handle them, and warn 
accordingly.
It also builds exception propagation documentation.

Note: If you're actually looking for something to check exception 
specifications, use EDoc++, RES is not ready for use.

MAIN DIFFERENCE
~~~~~~~~~~~~~~~
AFAIK, EDoc++ will not warn on the following code:
========================
void foo() throw(int) {}
void bar() throw()
{
  foo();
}
========================
Because foo() does not actually throw anything, even though its exception 
specifier says it _may_ throw int.
This may or may not be what you're after.
On the other hand, RES is only concerned with the declaration of foo(), and 
since it states it _may_ throw int, RES decides to warn.
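
The distinction can be put into a toy model. This is only an illustration of the two viewpoints and says nothing about how EDoc++ or RES are implemented: the Func struct and both functions are hypothetical, and recursive calls are not handled.

```cpp
#include <set>
#include <string>
#include <vector>

using TypeSet = std::set<std::string>;

// Toy description of a function; every field is hypothetical.
struct Func {
  TypeSet spec;                      // types listed in throw(...)
  TypeSet thrown;                    // types thrown directly by the body
  std::vector<const Func*> callees;  // functions called by the body
};

// Specification-based view (RES principle #2): a call counts as a throw of
// every type in the callee's specification.
TypeSet violations_by_spec(const Func& f)
{
  TypeSet reach = f.thrown;
  for (const Func* c : f.callees)
    reach.insert(c->spec.begin(), c->spec.end());
  TypeSet out;
  for (const auto& t : reach)
    if (f.spec.count(t) == 0) out.insert(t);
  return out;
}

// Types that can really leave f: whatever its body or callees produce,
// filtered by f's own specification.
TypeSet escaping_by_body(const Func& f)
{
  TypeSet out;
  for (const auto& t : f.thrown)
    if (f.spec.count(t) != 0) out.insert(t);
  for (const Func* c : f.callees)
    for (const auto& t : escaping_by_body(*c))
      if (f.spec.count(t) != 0) out.insert(t);
  return out;
}

// Implementation-based view: only types that can actually arrive count.
TypeSet violations_by_body(const Func& f)
{
  TypeSet out;
  for (const auto& t : f.thrown)
    if (f.spec.count(t) == 0) out.insert(t);
  for (const Func* c : f.callees)
    for (const auto& t : escaping_by_body(*c))
      if (f.spec.count(t) == 0) out.insert(t);
  return out;
}
```

With foo modelled as spec {int} and an empty body, and bar as spec {} calling foo, the specification-based view reports int for bar while the implementation-based view reports nothing.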

-: EDoc++ is released. RES is pre-alpha.
-: -Wres tries to enforce RES principles (see [RES Principles] above), which 
may not suit many developers. EDoc++ is suitable for all code.
+: RES only checks during compilation. EDoc++ is an external program that 
checks compiler output, although it requires instructing g++ to output special 
information.
-: EDoc++ is likely far more accurate (at least for now).
-: EDoc++ is currently better suited for analysing implicit calls and template 
code.
0: RES is more restrictive.
-: EDoc++ warns on exceptions propagating past main(). Currently, RES does not.
+: RES works per source file. EDoc++ needs to run on the linked output to check 
all exceptions.
+: EDoc++ will never be able to check function pointer usage perfectly. RES 
will, although GCC will need to comply with the C++ spec (15.4 -3- : 
except.spec) before this can happen.
0: Both EDoc++ and RES will not be able to check forced function pointer 
typecasts that reduce specified exceptions. But this is poor programming anyway.
+: RES can prevent checking of specific incompatible libraries via 
whitelisting. EDoc++ AFAIK cannot.

To look at it another way:
RES is the theoretical/conventionist exception specification checker.
EDoc++ is the practical exception specification checker.



=========================
C++ EXTENSIONS TO AID RES
=========================
RES design principles are (currently) probably impractical for heavily generic 
code.
However, I believe RES design principles offer much in the way of improving 
exception handling design.

With the following two additions to C++ exception specifications, RES design 
would no longer be impractical for anything.

1) Allowing exception specifications on function pointer typedefs and 
associated functions (eg in templates).
   This is explicitly denied in the C++ spec. 15.4 -1- "An 
exception-specification shall not appear in a typedef declaration". 
   I cannot see why this is, unless it is to make compiler programming easier.
   Perhaps it's due to 15.4 -12- "An exception-specification is not considered 
part of a function's type."

2) Explicit inheritance of exception specifications.
     Something like:
       void foo() throw(int);
       void bar() throw(float, foo());
     Where bar() may now throw int and float.
   For instance, this would allow std::vector<T>::insert() to inherit the 
specification of T::T(T&&), meaning std::vector<T>::insert() may now safely 
have an exception specification.
   Under RES, a function calling insert on a std::vector<T>, when T::T(T&&) 
throw(int) has been declared, would now have to catch int or have int in its 
exception specification.

Adding #1 & #2 will not invalidate any currently accepted programs.
Usage of #1 requires no explanation; its purpose is self-evident.
Usage of #2 may clutter up function declarations. It should be used only when 
necessary, such as in template code.

For a compiler that is already fully compliant, implementation of #1 and #2 
should be fairly easy.

Without #1 and #2, RES will never work fully.



============
DESIGN FLAWS
============
1) I use trees of TREE_LIST to store most things. This was done for consistency 
and because I couldn't figure out the allocation mechanics (which I would need 
to use my own struct).
2) I typecast TREE_LIST tree members to size_t and const char*. This is due to 
#1 and because it's efficient.
3) I haven't yet figured out how to clean up my TREE_LIST nodes.
4) I assume location_t can be safely cast to/from size_t. This is probably OK.
5) I use TREE_TYPE() for TREE_LIST members. This is probably OK.
6) I use "warning(0, etc)" instead of "warning(OPT_Wres, etc)" because 
originally OPT_Wres isn't always on. Needs a fix, will do later.
7) I add res.h to gcc/, not gcc/cp/, since gcc/c-opts.c needs to include it 
(for whitelist/blacklist option handling). This is probably OK.
8) I put the bulk of my code in gcc/cp/except.c. It could possibly go in 
gcc/cp/res.c to improve modularity. This is probably OK.
9) Types are not sorted in lists. This could possibly speed up some of the 
O(n*m) search algorithms used. This is probably OK since it maintains order and 
I doubt -Wres adds much to compilation time (this should be checked).
10) The symbols ^-+%, are used by -Wres-l='s args. This is due to them not 
needing to be escaped. Someone may have some better suggestions.
11) -Wres, -Wres-xes & -Wres-tt are very short. Perhaps I could use 
-Wres-ex-spec, -Wexc-ex-spec, -Wthrow-terminate.



====
TODO
====
Implement analysis of a variety of implicit function calls:
  operators.
  ctors/dtors of local variables.
  ctors/dtors of base classes.
  implicit conversions.
  overloaded functions.
  others I have forgotten.
Implement analysis of calls to function-pointers. (Needs GCC to be fixed.)
Implement analysis of calls inside template functions.
Get system directory the right way.
Decide how to handle main().
Analyse impact on compilation time. (Should be negligible).
Proper Testcases.
Officially propose above modifications to C++ spec.
Finishing, internationalization, etc.


Attachment: test.tgz
Description: GNU Zip compressed data
