The following paper provides some background on the difficulties encountered with parsing C++:
http://citeseer.ist.psu.edu/irwin01generated.html

Abstract: C++ is an extraordinarily difficult programming language to parse. The language cannot readily be approximated with an LL or LR grammar (regardless of lookahead size), and syntax analysis depends on semantic disambiguation. While conventional (LALR(1) and LL(k)) parser generation tools have been used to build C++ parsers, the effort involved in grammar modification and custom code development is substantial, rivaling the effort of constructing a parser manually. [...]

Link to PDF: http://tinyurl.com/3remp

And a related thread on the GCC mailing list back in 2002: http://gcc.gnu.org/ml/gcc/2002-08/msg00085.html
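
To make the "semantic disambiguation" point from the abstract concrete, here is a small sketch of my own (it is not taken from the paper, and the function names are made up for illustration). The statement `a * b;` consists of the same tokens in both functions, but the parser can only tell whether it is a multiplication or a pointer declaration by knowing what `a` currently names:

```cpp
// Case 1: `a` names a variable, so `a * b;` is an expression statement.
int as_expression() {
    int a = 6, b = 7;
    a * b;            // multiplies and discards the result (compilers may warn)
    return a * b;
}

// Case 2: `a` names a type, so the very same tokens declare a pointer.
int as_declaration() {
    typedef int a;
    a * b;            // declares `b` with type int*
    b = nullptr;      // only valid because `b` is a pointer here
    return b == nullptr ? 1 : 0;
}
```

A grammar alone cannot choose between the two readings; the parser has to consult the symbol table while parsing, which is one reason off-the-shelf LALR(1) or LL(k) tools need so much custom code for C++.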