hokein added inline comments.
================ Comment at: clang/include/clang/Tooling/Syntax/TokenManager.h:37 + + // FIXME: add an interface for getting token kind. +}; ---------------- sammccall wrote: > I wouldn't want to prejudge this: this is a very basic attribute similar to > kind/role, and we may want to store it in Leaf and avoid the virtual stuff. > > There's certainly enough space, e.g. of the current 16-bit `kind` use the top > bit to denote leaf-or-not and the bottom 15 bits to store kind-or-tokenkind. This sounds good. Adjust the FIXME. ================ Comment at: clang/include/clang/Tooling/Syntax/Tokens.h:463 +/// It tracks the underlying token buffers, source manager, etc. +class SyntaxTokenManager : public TokenManager { +public: ---------------- ilya-biryukov wrote: > sammccall wrote: > > conceptually this is just "TokenBuffer implements TokenManager" > > > > The main reason I can see not to actually write that is to avoid the > > dependency from TokenBuffer (tokens library) to TokenManager (syntax > > library). But here you've added that dependency anyway. > > > > So I think we'd be better either with `TokenBuffer : TokenManager` or > > moving this class to its own header. > I would argue they should be separate concepts. > - `TokenBuffer` is about storing tokens and mapping between expanded and > spelled token streams. > - `SyntaxTokenManager` is about implementing relevant certain operations on > `syntax::Token` and hiding the actual token type. > Conceptually those things are different and decoupling them allows for more > flexibility and allows reasoning about them independently. In particular, one > could imagine having a `SyntaxTokenManager` without a `TokenBuffer` if we do > not need to deal with two streams of tokens. > > I would suggest keeping `SyntaxTokenManager` as a separate class. > I would suggest keeping SyntaxTokenManager as a separate class. +1. Moved to a separate file. ================ Comment at: clang/include/clang/Tooling/Syntax/Tokens.h:465 +public: + SyntaxTokenManager(SourceManager &SourceMgr, const LangOptions &LangOpts, + const TokenBuffer &Tokens) ---------------- sammccall wrote: > no need to take SourceManager, TokenBuffer already includes it. > LangOpts is unused here. > no need to take SourceManager, TokenBuffer already includes it. The SourceMgr can be mutated by the class (it stores the underlying tokens for `ExtraTokens`), while the SourceManager in TokenBuffer is immutable. > LangOpts is unused here. It is used in SyntaxTokenManager::lexBuffer. ================ Comment at: clang/include/clang/Tooling/Syntax/Tokens.h:475 + assert(Token); + // Handle 'eof' separately, calling text() on it produces an empty string. + if (Token->kind() == tok::eof) ---------------- sammccall wrote: > Empty string seems like the correct return value here to me. > If you want a special case for dump, I think that belongs in dump(). > > If this is because we currently provide no way to get the token kind, then > this should be a FIXME Yeah, this special case is for the Leaf node dump. Added a FIXME. (I even double whether we need this special case at all, do we really want to build a Leaf node for eof token?) ================ Comment at: clang/include/clang/Tooling/Syntax/Tokens.h:504 + /// IDs and storage for additional tokenized files. + llvm::DenseMap<FileID, std::vector<Token>> ExtraTokens; +}; ---------------- sammccall wrote: > aren't we trying to store these on the syntax arena? > We never do lookups into this map, maybe lexbuffer just allocates storage on > the arena('s allocator) instead of using this map? > aren't we trying to store these on the syntax arena? We could do that, but I'd try to avoid that. Now the allocator of syntax::Arena is a storage for syntax-tree nodes only. Allocation for token-related stuff is on the `SyntaxTokenManager`. > We never do lookups into this map, maybe lexbuffer just allocates storage on > the arena('s allocator) instead of using this map? This map is moved from the syntax::Arena. You're right, there is no usage of the key at the moment, and the only use-case is to create a syntax leaf node that not backed by the source code (for refactoring usecase), it is unclear whether we will use it in the future. I will keep it as-is (this is not the scope of this patch). ================ Comment at: clang/unittests/Tooling/Syntax/TreeTestBase.cpp:116 syntax::TranslationUnit *&Root; + std::unique_ptr<syntax::SyntaxTokenManager> &TM; std::unique_ptr<syntax::TokenBuffer> &TB; ---------------- ilya-biryukov wrote: > NIT: it´s not breaking anything now, but I suggest putting SyntaxTokenManager > after TokenBuffer. > The reason is that it´s the right destruction order, TokenManager has > references to TokenBuffer, so it could potentially access it in destructor > some time in the future (e.g. imagine asserting something on tokens). > > Not that it actually breaks today, but seems like a potential surprising bug > in the future if we happen to refactor code in a certain way. good point! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D128411/new/ https://reviews.llvm.org/D128411 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits