Hopefully this will all go right this time. If mods can delete the other 3 posts I'd be much obliged. I just needed to put all these files in the same post, and gmail is being a nuisance.
RES is my project to test (at compile time) whether C++ code can throw unhandled exceptions. I tidied up my source, made a patch, wrote and tested some testcases, and wrote some documentation.

There should be three attachments here:
- res.diff
- res.txt
- test.tar.gz

I've never made patch files before, but hopefully this one is OK. I used
[diff -Naur original/gcc-3.4.2/gcc res/gcc-3.4.2/gcc > res.diff]
I tested it on a fresh gcc-3.4.2 extract and it worked with
[patch -p1 < res.diff]

For a quick test, just call gcc with -Wres (or -Wres-debug to add the debug info). I haven't tested it on much real-world code, so there's a chance it still has bugs and may segfault or something. It's still pre-alpha though.

I tried to use GCC coding style but I didn't see an 80 character/line limit anywhere in the document, so I didn't use any limit. Perhaps someone will bite me for this or perhaps you'll all celebrate the beginning of a new era of sensible unlimited lines :) Anyway, it's pre-alpha so it hardly matters at present - it's just far easier for me to read.

Testcases: Unzip test.tar.gz and read test/tests.txt for instructions. All the testcases except the last work as they should.
res.diff.tar.gz
Description: GNU Zip compressed data
RES: Restrictive Exception Specification.
An extension to G++, the C++ compiler from GCC.
By Simon Hill.
Under the same license as GCC.

========
CONTENTS
========
Overview
RES Principles
Usage
Tips
Whitelisting
XES
TT
CNR
MES
Implementation
Comparison with EDoc++
C++ extensions to aid RES
Design Flaws
Todo

========
OVERVIEW
========
RES is a mechanism to provide a warning whenever code is compiled that may lead to an unhandled exception.
RES is currently pre-alpha, i.e. under construction.

eg:
~~~~~~~~~~~~~~~~~~~~~
foo.cpp:
~~~~~~~~~~~~~~~~~~~~~
void foo() throw(int)
{
  throw 1.5f;
}
~~~~~~~~~~~~~~~~~~~~~
> gcc -c foo.cpp -Wres
foo.cpp:1: warning: RES: ‘void foo() throw (int)’ may terminate due to the following uncaught and unspecified exceptions:
foo.cpp:3: note: RES: ‘float’ from here.
~~~~~~~~~~~~~~~~~~~~~

RES is invoked by -Wres, optionally with -Wres-l=<rules>. See [Whitelisting] for info on <rules>.
RES considers exception propagation through try/catch blocks, and analyses the exception specifications of called functions.

XES, TT, CNR & MES:
RES comes with four complementary but probably-less-useful mechanisms:
- XES (excessive exception specification), invoked by -Wres-xes. See [XES].
- TT (throw-terminate), invoked by -Wres-tt. See [TT].
- CNR (catch(...) no rethrow), invoked by -Wres-cnr. See [CNR].
- MES (missing exception specification), invoked by -Wres-mes. See [MES].

Note: If any of RES, XES or TT are used, most exception checking routines are enabled. (FIXME:Check) I doubt these affect compilation time significantly.
Note: RES by default ignores calls to functions declared in system includes (eg: <vector>). However, this can be changed by using -Wres-l=<rules>.

==============
RES PRINCIPLES
==============
The RES project was designed to uphold design principles that I and others seem to have developed independently, which I shall here call RES Principles.
I believe it should be fully possible for a C++ compiler to check exception propagation at compile time.
There has been much discussion on whether this should be so, including lots on the gcc@gcc.gnu.org mailing list.
The main issue is template code. Unfortunately, with the current C++ specification, template code cannot work properly with RES. However I suggest some minor changes to the spec in [C++ extensions to aid RES].

My RES Principles are:
1) Every thrown type that may propagate to the function boundary should be in the function specification.
   - IE, no unhandled thrown exceptions.
2) Every called function (whether explicit, implicit, function pointer or otherwise) should be considered able to throw all types in its exception specification, regardless of the callee's implementation.
   - IE, calls should be treated the same as a sequence of throws of the types in the callee's specification.
3) Reachability of throws and calls is not considered. Non-reachable throws of types that would otherwise lead to unhandled exceptions are not allowed.
4) There should be no throw-terminates. (Re-throws outside of catch blocks).
5) Every thrown type should be caught explicitly. All catch(...) blocks should re-throw.
6) Every function should have an explicit exception specification (which may be a no-throw specification).
7) main() and other entrypoints should be defined as throw();

The RES project currently warns on a subset of principle violations, as many people may not agree with #5, #6 & #7. #5 & #6 can be enabled by -Wres-cnr and -Wres-mes respectively.

RES deliberately misinterprets the intent of exception specifications: it treats them as indications of what exception types should be allowed to propagate to the function boundary, instead of simply as instructions to the compiler that alter compiled output.
With fully RES compliant code, exception specifications would not need to influence compiler output to remain C++ compliant.
Developing with RES may require you to change your coding style.

=====
USAGE
=====
-Wres            enables RES.
-Wres-l=<rules>  Whitelisting for RES warnings.
                 For information on <rules>, see [Whitelisting].
-Wres-xes        enables XES. See [XES]
-Wres-tt         enables TT. See [TT]
-Wres-cnr        checks that all catch(...) blocks have rethrows. See [CNR]
-Wres-mes        checks that all functions have exception specifications. See [MES]
-Wres-bl-unk     blacklists declarations not inside files. See [Whitelisting].
-Wres-debug      emits debugging information as additional warnings.

============
WHITELISTING
============
Whitelisting has been added to allow #including libraries (eg std) that don't use the restrictive exception specification concept without generating unwanted warnings.
Whitelisting prevents checking of calls to functions declared in whitelisted files. It does not prevent checking of calls originating from within those files.
(Internal: Uses DECL_SOURCE_FILE(function_decl))
By default, all system-path files are whitelisted.
Files (eg system-path files) can also be blacklisted by this mechanism. Calls to functions declared in blacklisted files will be checked.

Usage:
Whitelisting is invoked by: -Wres-l=<rules>
<rules> is a set of filenames/paths separated by commas. eg: -Wres-l=foo.h,bar.h
If more than one rule matches the file, only the first matching rule is used.
Note: -Wres-l can be used more than once, adding additional rules.

Special Characters:
- Blacklist: prefix minus (-). eg: [-foo.h]
  Indicates that all matches to this rule should emit warnings.
  If omitted, the rule is a whitelist and matches are prevented from emitting warnings.
- System: prefix caret (^). eg: [^vector]
  Indicates a system-path include, ie #included with anglebraces (eg: <foo>) as opposed to quotes (eg: "foo").
  Note: Files residing inside the system path are considered system files even if they are #included using quotes. Eg: [#include "/usr/include/vector"].
- Partial: suffix plus (+). eg: [dir1/f+].
  Indicates that this rule matches files beginning with the string, but may have further characters.
  If not used, the rule must be an exact match.
- Separator: (,).
- Escape character: percent (%).
  Treats the next character as a literal even if it is one of the above characters.
  Does not work for space or newline, use \ to escape these.
Note: When adding rules blacklisting system files, use [-^] not [^-].

A full example may be:
gcc foo.cpp -Wres-l=foo.h,^std+,-^+,-f+,fip.h,%^%,%+ -Wres-l=bar.h -Wres-l=zug.h
calls to functions declared in:
["foo.h"]     are ignored.  [foo.h]
["fip.h"]     are checked.  [-f+]
[<stdlib.h>]  are ignored.  [^std+]
[<vector>]    are checked.  [-^+]
["bar.h"]     are ignored.  [bar.h]
["zug.h"]     are ignored.  [zug.h]
["^,+"]       are ignored.  [%^%,%+]

Some useful rules:
[-^+]: Blacklists all system files.
[+]: Whitelists all user files.

Declarations that don't exist in files, such as builtins, are whitelisted by default. -Wres-bl-unk blacklists these declarations.
Ultimately, if fully RES compliant system libraries are made, -Wres-l=-^+ should be OK to use.

===
XES
===
XES is an associated mechanism, for warning about Excessive Exception Specification.
(X is used instead of E because -Wres-xes is better than the alternative.)

eg:
~~~~~~~~~~~~~~~~~~~~~
foo.cpp:
~~~~~~~~~~~~~~~~~~~~~
void foo() throw(int)
{
  throw 1.5f;
}
~~~~~~~~~~~~~~~~~~~~~
> gcc -c foo.cpp -Wres-xes
foo.cpp:1: warning: XES: ‘void foo() throw (int)’s exception specification includes ‘int’, yet this type can't be thrown past it.
~~~~~~~~~~~~~~~~~~~~~

XES is invoked by -Wres-xes. See [Whitelisting] for info on <rules>.
XES also warns when catch blocks won't catch anything.
Because a list cannot store a type and more-derived versions of the same type at the same time, XES is over-optimistic (it produces fewer warnings than it should). Since XES isn't particularly useful, this is ignored.

==
TT
==
TT is another associated mechanism, for warning about throw-terminates (FIXME:Use correct jargon).
Throw-terminates are throw; statements (without an explicit thrown type) that are not catch rethrows.
Throw-terminates cause the program to terminate immediately.
Technically, they're either outside all catch blocks or the deepest try block they reside in is deeper than the deepest catch block. (FIXME:Check C++ spec).

eg:
~~~~~~~~~~~~~~~~~~~~~
foo.cpp:
~~~~~~~~~~~~~~~~~~~~~
void foo() throw(int)
{
  throw;
}
~~~~~~~~~~~~~~~~~~~~~
> gcc -c foo.cpp -Wres-tt
foo.cpp:3: warning: TT: Throw-terminate here.
~~~~~~~~~~~~~~~~~~~~~

TT is invoked by -Wres-tt.

===
CNR
===
CNR is a further associated mechanism, for warning about catch(...) blocks without rethrows.

eg:
~~~~~~~~~~~~~~~~~~~~~
foo.cpp:
~~~~~~~~~~~~~~~~~~~~~
void foo() throw(int)
{
  try {} catch(...) {}
}
~~~~~~~~~~~~~~~~~~~~~
> gcc -c foo.cpp -Wres-cnr
foo.cpp:3: warning: CNR: <FIX>
~~~~~~~~~~~~~~~~~~~~~

CNR is invoked by -Wres-cnr.

===
MES
===
MES is yet another associated mechanism, for warning about function declarations without exception specifications.

eg:
~~~~~~~~~~~~~~~~~~~~~
foo.cpp:
~~~~~~~~~~~~~~~~~~~~~
void foo() {}
~~~~~~~~~~~~~~~~~~~~~
> gcc -c foo.cpp -Wres-mes
foo.cpp:1: warning: MES: <FIX>
~~~~~~~~~~~~~~~~~~~~~

MES is invoked by -Wres-mes.

==============
IMPLEMENTATION
==============
I tried to keep RES as separate as possible from the rest of GCC, to be modular. Thus the only things I add to existing functions are calls to RES and a few variables.
Without RES active, there's virtually no change to G++'s execution - only a few predictable if() statements now and then.
All my declarations are inside the newly created [gcc/res.h].

Trees & Lists
~~~~~~~~~~~~~
Three trees follow the state of the parser:
res_throwable is a TREE_LIST to which all thrown types are added, whether from throws or function call specifications.
res_catchable is a TREE_LIST of types catchable by catch-blocks in the current try-catch segment.
- If outside all catch blocks, or if a try block is deeper, this is void_type_node.
- After catch(...) this is void_type_node until a rethrow is seen, when it becomes NULL_TREE.
res_caught is the type caught by the deepest catch block.
- If outside all catch blocks, or if a try block is deeper, this is void_type_node.
- On catch(...), this is a TREE_LIST of everything that was previously in res_catchable. (Possibly ... or nothing).

Each tree node has a type (TREE_TYPE), throwing-object (TREE_PURPOSE) and location_t (TREE_VALUE).
NULL_TREE indicates an empty list.
A single-node TREE_LIST whose TREE_TYPE() is NULL_TREE indicates ...
No list can store both a type and a less-derived version of that same type, as catching the less-derived version also catches the original type.

Method
~~~~~~
On entering a function or try-block, the previous res_throwable, res_catchable and res_caught trees are temporarily stored, to be recovered on exit.
On exiting a try-block, what was res_throwable becomes res_catchable. The previous res_throwable is recovered.
On entering a catch-block, the caught type is removed from res_catchable and stored in res_caught.
- For catch(...), the entire res_catchable is moved to res_caught.
On exiting a try-catch segment, all types remaining in res_catchable are added to res_throwable. The previous res_throwable and res_catchable are recovered.
On exiting a function, types in res_throwable are checked vs the function's exception specification.

Internal Functions
~~~~~~~~~~~~~~~~~~
res_check_res () : Checks RES at end of function.
res_check_xes () : Checks XES at end of function.
res_add_type () : Adds a type to a list.
res_add_any () : Makes a list ...
res_merge () : Moves all types on one list to the other.
res_remove () : Removes a type from a list.
res_empty () : Empties a list.
res_warnlist () : Dumps all types as warnings, for debugging.
res_prep_add_type () : Assists res_add_type () and res_merge ().
res_make_node () : Makes a new TREE_LIST node in the format stated above.
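
The bookkeeping described under [Method] can be sketched outside GCC roughly as follows. This is only a minimal model, not the code in gcc/cp/except.c: ResModel, TypeSet and check_example are hypothetical names, types are plain strings in a std::set instead of TREE_LIST nodes, and derived-to-base matching, catch(...) and res_caught (needed for rethrows) are left out. The member names merely echo the res_on_* hooks.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model of the lists described under "Method".
using TypeSet = std::set<std::string>;

struct ResModel {
    TypeSet throwable;  // stands in for res_throwable
    TypeSet catchable;  // stands in for res_catchable
    std::vector<std::pair<TypeSet, TypeSet>> saved;  // state stored on try entry

    // A throw expression, or one type from a callee's specification.
    void on_throw(const std::string& type) { throwable.insert(type); }

    // Entering a try-block: store the outer lists and start with empty ones.
    void on_begin_try() {
        saved.emplace_back(throwable, catchable);
        throwable.clear();
        catchable.clear();
    }

    // Exiting a try-block: what was throwable becomes catchable, and the
    // previous throwable list is recovered.
    void on_end_try() {
        catchable = std::move(throwable);
        throwable = saved.back().first;
    }

    // Entering a catch-block: the caught type is removed from catchable.
    void on_begin_catch(const std::string& type) { catchable.erase(type); }

    // Exiting the try-catch segment: types still catchable were not caught,
    // so they propagate; the previous catchable list is recovered.
    void on_end_handlers() {
        throwable.insert(catchable.begin(), catchable.end());
        catchable = saved.back().second;
        saved.pop_back();
    }

    // Exiting a function: every throwable type missing from the exception
    // specification would be reported.
    TypeSet on_end_function(const TypeSet& spec) const {
        TypeSet unhandled;
        for (const auto& t : throwable)
            if (spec.count(t) == 0) unhandled.insert(t);
        return unhandled;
    }
};

// Models: void f() throw(int) { try { throw 1; throw 1.5f; } catch (int) {} }
TypeSet check_example() {
    ResModel m;
    m.on_begin_try();
    m.on_throw("int");
    m.on_throw("float");
    m.on_end_try();
    m.on_begin_catch("int");
    m.on_end_handlers();
    return m.on_end_function({"int"});
}
```

In this model check_example() leaves only "float" unhandled: the int is caught, while the float escapes the try-catch segment and is not in the specification.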

Source Files
~~~~~~~~~~~~
Temporarily, all modifications are labelled with /* <[SPH]> */
1 file has been added:
- gcc/res.h
5 files have been modified:
- gcc/c.opt
- gcc/c-opts.c
- gcc/cp/except.c
- gcc/cp/parser.c
- gcc/cp/semantics.c
The bulk of the implementation is in gcc/cp/except.c, at the end.
The whitelist implementation is at the end of gcc/c-opts.c.
gcc/cp/parser.c & gcc/cp/semantics.c just call my new functions.
c.opt just lists -Wres, -Wres-l=, -Wres-tt, -Wres-xes

GCC functions called by RES
~~~~~~~~~~~~~~~~~~~~~~~~~~~
- (FIXME:Some of these may not do exactly what I think they do - it's difficult to check but they seem to work.)
1) To get the thrown type of a throw expression, I use lvalue_type(expression).
2a) To get the exception specification from a function declaration, I use first_node = TYPE_RAISES_EXCEPTIONS(TREE_TYPE(function)).
2b) To parse an exception specification I use TREE_VALUE(node) to get the node type, and TREE_CHAIN(node) to get the next node.
3) To get the caught type of a handler tree, I use HANDLER_TYPE(handler).
4) To check whether one type is the same or derived from another, I use same_or_base_type_p(type1, type2).
5) To build TREE_LIST nodes I use tree_cons(purpose, value, chain).
6) To get the source file from a location_t, I use DECL_SOURCE_FILE(location).

RES functions called by GCC
~~~~~~~~~~~~~~~~~~~~~~~~~~~
cp/parser.c:15589 : cp_parser_try_block ()
- res_on_begin_try ()
- res_on_end_try ()
- res_on_end_handlers ()
- Temporarily stores tree lists for try-block nesting.
cp/parser.c:15681 : cp_parser_handler ()
- res_on_begin_catch ()
- res_on_end_catch ()
cp/parser.c:15767 : cp_parser_throw_expression ()
- res_on_throw ()
cp/parser.c:16799 : cp_parser_function_definition_after_declarator ()
- res_on_begin_function ()
- res_on_end_function ()
- Temporarily stores tree lists for function nesting.
cp/semantics.c:1840 : finish_call_expr ()
- res_on_call_function ()

======================
COMPARISON WITH EDOC++
======================
EDoc++ (http://edoc.sourceforge.net/) is a project with a similar purpose.
It aims to check the compiler output to see whether any thrown exceptions can propagate past exception specifications that don't handle them, and warn accordingly.
It also builds exception propagation documentation.
Note: If you're actually looking for something to check exception specifications, use EDoc++. RES is not ready for use.

MAIN DIFFERENCE
~~~~~~~~~~~~~~~
AFAIK, EDoc++ will not warn on the following code:
========================
void foo() throw(int) {}
void bar() throw() { foo(); }
========================
This is because foo() does not actually throw anything, even though its exception specifier says it _may_ throw int. This may or may not be what you're after.
On the other hand, RES is only concerned with the declaration of foo(), and since it states it _may_ throw int, RES decides to warn.

-: EDoc++ is released. RES is pre-alpha.
-: -Wres tries to enforce RES principles (see [RES Principles] above), which may not suit many developers. EDoc++ is suitable for all code.
+: RES only checks during compilation. EDoc++ is an external program that checks compiler output, although it requires instructing g++ to output special information.
-: EDoc++ is likely far more accurate (at least for now).
-: EDoc++ is currently better suited for analysing implicit calls and template code.
0: RES is more restrictive.
-: EDoc++ warns on exceptions propagating past main(). Currently, RES does not.
+: RES works per source file. EDoc++ needs to run on the linked output to check all exceptions.
+: EDoc++ will never be able to check function pointer usage perfectly. RES will, although GCC will need to comply with the C++ spec (15.4 -3- : except.spec) before this can happen.
0: Neither EDoc++ nor RES will be able to check forced function pointer typecasts that reduce specified exceptions. But this is poor programming anyway.
+: RES can prevent checking of specific incompatible libraries via whitelisting. EDoc++ AFAIK cannot.

To look at it another way:
RES is the theoretical/conventionist exception specification checker.
EDoc++ is the practical exception specification checker.

=========================
C++ EXTENSIONS TO AID RES
=========================
RES design principles are (currently) probably impractical for heavily generic code.
However, I believe RES design principles offer much in the way of improving exception handling design.
With the following two additions to C++ exception specifications, RES design would no longer be impractical for anything.

1) Allowing exception specifications on function pointer typedefs and associated functions (eg in templates).
   This is explicitly denied in the C++ spec. 15.4 -1- "An exception-specification shall not appear in a typedef declaration".
   I cannot see why this is, unless it is to make compiler programming easier.
   Perhaps it's due to 15.4 -12- "An exception-specification is not considered part of a function's type."
2) Explicit inheritance of exception specifications. Something like:
     void foo() throw(int);
     void bar() throw(float, foo());
   Where bar() may now throw int and float.
   For instance, this would allow std::vector<T>::insert() to inherit the specification of T::T(T&&), meaning std::vector<T>::insert() may now safely have an exception specification.
   Under RES, a function calling insert on a std::vector<T>, when T::T(T&&) throw(int) has been declared, would now have to catch int or have int in its exception specification.

Adding #1 & #2 will not invalidate any currently accepted programs.
Usage of #1 requires no explanation, its purpose is self evident.
Usage of #2 may clutter up function declarations. It should be used only when necessary, such as in template code.
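
For comparison only: proposal #2 is not valid C++, but the conditional noexcept specifier added in C++11 is a narrower, related mechanism. It lets a template inherit just the throws / doesn't-throw bit of another expression, not a list of types. The sketch below assumes a C++11 compiler; Quiet, Loud and relocate are hypothetical names chosen for illustration.

```cpp
#include <utility>

// Hypothetical types: one with a non-throwing move constructor and one
// whose move constructor is declared as potentially throwing.
struct Quiet {
    Quiet() = default;
    Quiet(Quiet&&) noexcept {}
};
struct Loud {
    Loud() = default;
    Loud(Loud&&) noexcept(false) {}
};

// relocate() inherits the no-throw guarantee of T's move constructor:
// it is noexcept exactly when constructing a T from a T&& is noexcept.
template <typename T>
T relocate(T& from) noexcept(noexcept(T(std::declval<T&&>()))) {
    return T(std::move(from));
}

// Both properties are checked at compile time.
static_assert(noexcept(relocate(std::declval<Quiet&>())),
              "relocate<Quiet> inherits the no-throw guarantee");
static_assert(!noexcept(relocate(std::declval<Loud&>())),
              "relocate<Loud> inherits the may-throw property");
```

Unlike the throw(float, foo()) syntax proposed above, this carries no information about which types may be thrown, so it would not by itself give RES the per-type lists it works with.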
For a compiler that is already fully compliant, implementation of #1 and #2 should be fairly easy.
Without #1 and #2, RES will never work fully.

============
DESIGN FLAWS
============
1) I use trees of TREE_LIST to store most things. This was done for consistency and because I couldn't figure out the allocation mechanics (which I would need to use my own struct).
2) I typecast TREE_LIST tree members to size_t and const char*. This is due to #1 and because it's efficient.
3) I haven't yet figured out how to clean up my TREE_LIST nodes.
4) I assume location_t can be safely cast to/from size_t. This is probably OK.
5) I use TREE_TYPE() for TREE_LIST members. This is probably OK.
6) I use "warning(0, etc)" instead of "warning(OPT_Wres, etc)" because originally OPT_Wres isn't always on. Needs a fix, will do later.
7) I add res.h to gcc/, not gcc/cp/, since gcc/c-opts.c needs to include it (for whitelist/blacklist option handling). This is probably OK.
8) I put the bulk of my code in gcc/cp/except.c. It could possibly go in gcc/cp/res.c to improve modularity. This is probably OK.
9) Types are not sorted in lists. Sorting could possibly speed up some of the O(n*m) search algorithms used. This is probably OK since it maintains order and I doubt -Wres adds much to compilation time (this should be checked).
10) The symbols ^-+%, are used by -Wres-l='s args. This is due to them not needing to be escaped. Someone may have some better suggestions.
11) -Wres, -Wres-xes & -Wres-tt are very short. Perhaps I could use -Wres-ex-spec, -Wexc-ex-spec, -Wthrow-terminate.

====
TODO
====
Implement analysis of a variety of implicit function calls:
  operators.
  ctors/dtors of local variables.
  ctors/dtors of base classes.
  implicit conversions.
  overloaded functions.
  others I have forgotten.
Implement analysis of calls to function-pointers. (Needs GCC to be fixed.)
Implement analysis of calls inside template functions.
Get system directory the right way.
Decide how to handle main().
Analyse impact on compilation time. (Should be negligible).
Proper Testcases.
Officially propose above modifications to C++ spec.
Finishing, internationalization, etc.
test.tgz
Description: GNU Zip compressed data