hokein added a comment. Thanks, the change looks good in general.
> we handle *all* rules for the interesting node types explicitly, rather than > default: return false. This allows us to assert that all cases are handled, > so things don't "fall through the cracks" after grammar changes. Alternative > suggestions welcome. (I have a feeling this "exhaustive pattern-matching" > idea will come up again...) The dangerous bit is that now we will crash at the runtime if the parsing code triggers a missing case. Yeah, I hit this problem in the previous function-declarator patch. One approach will be to define an enum for each nonterminal `enum Nonterminal { rhs0_rhs1 }`, rather than put all rules into a single enum. It is easier to enumerate all values for a single nonterminal (I think this is the common case) namespace cxx { namespace rule { enum simple_type_specifier { builtin_type0, nested_name_specifier0_type_name1, ... } } } In D130337#3671159 <https://reviews.llvm.org/D130337#3671159>, @sammccall wrote: > FWIW, as-is with no caching, this is a ~2% slowdown on my machine (5.82 -> > 5.72 MB/s on SemaCodeComplete.cpp). Actually, the number looks quite good to me (I expected that there will be a significant slowdown without caching). I think we can check in the current version, and add the caching stuff afterwards. > Whereas D130150 <https://reviews.llvm.org/D130150> using the grammar is a a > 7% speedup (5.82 -> 6.22), so roughly an 9% performance difference between > the approaches. Yeah, because the grammar-based approach reduces the number of ambiguous node in the forest. ================ Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:284 + defining_type_specifier_seq_0defining_type_specifier_1defining_type_specifier_seq, + GUARD(!hasExclusiveType(P.RHS.front()) || + !hasExclusiveType(P.RHS.back()))}, ---------------- nit: I would suggest using the index explicitly `P.RHS[0]`, `P.RHS[1]`, it increases the readability (the rul name encoding the index, easier to spot the corresponding element). ================ Comment at: clang-tools-extra/pseudo/lib/cxx/cxx.bnf:353 decl-specifier-seq := decl-specifier -decl-specifier-seq := decl-specifier decl-specifier-seq +decl-specifier-seq := decl-specifier decl-specifier-seq [guard] storage-class-specifier := STATIC ---------------- offtopic comment: The sad bit of the `RuleID` approach (vs `guard=SingleType`) is that we don't really know what kind of the guard is by reading the grammar file only. I think this is critical information, and worth to keep in the grammar file. (ideas: add comments, or bring the `guard=SingleType` in the grammar again, but we ignore the `guard` value in the grammar parsing). ================ Comment at: clang-tools-extra/pseudo/lib/cxx/cxx.bnf:385 +builtin-type := BOOL simple-type-specifier := SHORT +builtin-type := INT ---------------- I think the reason to leave SHORT/LONG/SIGNED UNSIGNED as-is is that they can combined with other type (e.g. short int). Can we group them together, and add a comment? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D130337/new/ https://reviews.llvm.org/D130337 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits