sammccall added inline comments.
================ Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:216 + llvm::dbgs() + << "$[collect-tokens]: " + << syntax::Token(T).dumpForTests(SourceMgr) << "\n" ---------------- what's "$[collect-tokens]"? Maybe just "Token: "? ================ Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:224 + +TokenBuffer TokenCollector::consume() && { + TokenBuffer B(SourceMgr); ---------------- this function is now >100 lines long. Can we split it up? ================ Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:225 +TokenBuffer TokenCollector::consume() && { + TokenBuffer B(SourceMgr); + for (unsigned I = 0; I < Expanded.size(); ++I) { ---------------- this could be a member, which would help splitting up ================ Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:226 + TokenBuffer B(SourceMgr); + for (unsigned I = 0; I < Expanded.size(); ++I) { + auto FID = ---------------- Is there a reason we do this at the end instead of as tokens are collected? ================ Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:243 + llvm::DenseMap<FileID, unsigned> NextSpelled; + auto ConsumeSpelledUntil = [&](TokenBuffer::MarkedFile &File, + SourceLocation L) { ---------------- consumespelleduntil and fillgapuntil could be methods, I think ================ Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:277 + for (unsigned I = 0; I < Expanded.size() - 1; ++I) { + auto L = Expanded[I].location(); + if (L.isFileID()) { ---------------- the body here could be a method too, I think i.e. for each (expanded token), process it? ================ Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:322 + + // Some files might have unaccounted spelled tokens at the end. + for (auto &F : B.Files) { ---------------- and similarly this section ================ Comment at: clang/lib/Tooling/Syntax/Tokens.cpp:130 + OS << llvm::formatv( + " ['{0}'_{1}, '{2}'_{3}) => ['{4}'_{5}, '{6}'_{7})\n", + PrintToken(File.SpelledTokens[M.BeginSpelled]), M.BeginSpelled, ---------------- sammccall wrote: > As predicted :-) I think these `_<index>` suffixes are a maintenance hazard. > > In practice, the assertions are likely to add them by copy-pasting the test > output. > > They guard against a class of error that doesn't seem very likely, and in > practice they don't even really prevent it (because of the copy/paste issue). > > I'd suggest just dropping them, and living with the test assertions being > slightly ambiguous. > > Failing that, some slightly trickier formatting could give more context: > > `A B C D E F` --> `A B ... E F` > > With special cases for smaller numbers of tokens. I don't like the > irregularity of that approach though. (this one is still open) ================ Comment at: clang/unittests/Tooling/Syntax/TokensTest.cpp:622 + +// FIXME: add tests for mapping spelled tokens into offsets. + ---------------- sammccall wrote: > please fix :-) (still missing?) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D59887/new/ https://reviews.llvm.org/D59887 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits