sammccall added a comment. I think my main suggestion is to sort the symbols, not just the rules. Having the rules in blocks grouped by the symbol they produce, but not in symbol order, seems confusing to reason about.
The extraction of TestGrammar seems unrelated to the rest of the patch and it *isn't* actually shared between tests yet, which makes it a bit hard to review. ================ Comment at: clang-tools-extra/pseudo/include/clang-pseudo/Grammar.h:168 - // The rules are sorted (and thus grouped) by target symbol. + // The rules are topological sorted (and thus grouped) by the target symbol. + // ---------------- I think it's easier to understand if we sort the *symbols* and leave this comment as-is. It feels a bit strange to sort the symbols alphabetically but sort the rules by some other property of the symbols. It's also important to describe what relation we're sorting by! I'd say "The symbols are topologically sorted: if `S := T` then S has a higher SymbolID than T." ================ Comment at: clang-tools-extra/pseudo/lib/GrammarBNF.cpp:55 }; for (const auto &Spec : Specs) { Consider(Spec.Target); ---------------- You could identify dependencies here, and then use them to sort the uniquenonterminals before allocating SymbolIDs ================ Comment at: clang-tools-extra/pseudo/lib/GrammarBNF.cpp:122 + std::vector<unsigned> getTopologicalOrder(GrammarTable *T) { + llvm::DenseMap<SymbolID, llvm::DenseSet<SymbolID>> DependencyGraph; + for (const auto &Rule : T->Rules) { ---------------- The map seems unneccesary - just a vector<pair<SymbolID, SymbolID>> Dependencies, and sort them? ================ Comment at: clang-tools-extra/pseudo/lib/GrammarBNF.cpp:135 + }; + std::vector<SymbolID> VisitStates(T->Nonterminals.size(), NotVisited); + std::function<void(SymbolID)> DFS = [&](SymbolID SID) -> void { ---------------- (is a vector of symbol id, or of states?) ================ Comment at: clang-tools-extra/pseudo/unittests/TestGrammar.h:18 + +struct TestGrammar { + static TestGrammar build(llvm::StringRef BNF); ---------------- doc ================ Comment at: clang-tools-extra/pseudo/unittests/TestGrammar.h:26 + // The nonterminal symbo is expected to have a single rule in the grammar. + RuleID singleRuleFor(llvm::StringRef NonterminalName) const; + ---------------- I think this puts too much to much emphasis on the "single" and breaks the symmetry with "symbol" - i'd prefer just "ruleFor". (The failure mode here is just that the test crashes 100% of the time, so the risk seems low) ================ Comment at: clang-tools-extra/pseudo/unittests/TestGrammar.h:28 + + std::unique_ptr<Grammar> G; + std::vector<std::string> Diags; ---------------- shouldn't these be private, or have better names, and docs when they're set etc? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122303/new/ https://reviews.llvm.org/D122303 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits