On Wed, Jan 6, 2016 at 4:22 PM, Bruno Cardoso Lopes <bruno.card...@gmail.com> wrote:
> bruno updated this revision to Diff 44180.
> bruno added a comment.
>
> Hi Richard,
>
> Thanks for the comments. Updated the patch!
>
> In http://reviews.llvm.org/D15173#313235, @rsmith wrote:
>
> > I think that this will leave us with a broken token stream. In your
> > example, the cached token stream starts as
> >
> >   `NSArray` `<` `id` `<` `PB` `>>` `*` [...]
> >
> > ... and we try to annotate the `id<PB>` with our `CachedLexPos` pointing
> > at the `*` token. That leaves `CachedTokens` containing:
> >
> >   `NSArray` `<` `(type annotation)` `*` [...]
> >
> > ... which is wrong. We need to actually convert the `tok::greatergreater`
> > in `CachedTokens` into a `tok::greater` and update its location and
> > length in order for the cached token stream to be correctly updated.
> > Otherwise, if the parser backtracks it will see the wrong token stream.
>
> I don't think this happens: the type annotation starts at 7:11 and ends at
> 7:24:
>
>   identifier 'NSArray'    Loc=<testcase.mm:7:11>
>   greatergreater '>>'     Loc=<testcase.mm:7:24>
>
> The code that follows the assert then patches the CachedTokens the correct
> way; see the CachedTokens before and after:

This example doesn't show the above problem, because it's annotating the NSArray<...> type, not the id<...> type. On reflection, the problem that I'm concerned about won't actually trigger the assertion here (except in weird and rare cases, such as annotating the middle > in a >>> token), but it's really caused by the same underlying bug.
In practice, things will go wrong in two ways when we want to annotate half of a split token such as `>>`:

1) If the annotation ends in the middle of a split token, we'll currently annotate the entire token and lose the second half if we backtrack.
2) If the annotation ends at the end of a split token, we'll assert, because the start of the final token is not on a token boundary in the cached token stream.

The underlying bug is that splitting a token does not properly update the token cache. Assuming we never want to split a token, then backtrack over the token, then choose to not split it, the best fix would seem to be to update the cached token stream at the point when we perform the split (replace the tok::greatergreater with two tok::greaters in CachedTokens). And I think that's a correct assumption, as we only consider splitting when we know we're parsing a template argument list, and in that context the token is always split.

> - Before:
>
>   (clang::Preprocessor::CachedTokensTy) $32 = {
>     [0] = (Loc = 89,  UintData = 1, PtrData = 0x0000000000000000, Kind = l_paren, Flags = 0)
>     [1] = (Loc = 90,  UintData = 7, PtrData = 0x000000010d82fba0, Kind = identifier, Flags = 0)
>     [2] = (Loc = 97,  UintData = 1, PtrData = 0x0000000000000000, Kind = less, Flags = 0)
>     [3] = (Loc = 98,  UintData = 2, PtrData = 0x000000010d01da58, Kind = identifier, Flags = 0)
>     [4] = (Loc = 100, UintData = 1, PtrData = 0x0000000000000000, Kind = less, Flags = 0)
>     [5] = (Loc = 101, UintData = 2, PtrData = 0x000000010d82fb70, Kind = identifier, Flags = 0)
>     [6] = (Loc = 103, UintData = 2, PtrData = 0x0000000000000000, Kind = greatergreater, Flags = 0)
>     [7] = (Loc = 106, UintData = 1, PtrData = 0x0000000000000000, Kind = star, Flags = 2)
>   }
>
> - After:
>
>   (clang::Preprocessor::CachedTokensTy) $34 = {
>     [0] = (Loc = 89,  UintData = 1,   PtrData = 0x0000000000000000, Kind = l_paren, Flags = 0)
>     [1] = (Loc = 90,  UintData = 104, PtrData = 0x000000010d820660, Kind = annot_typename, Flags = 0)
>     [2] = (Loc = 106, UintData = 1,   PtrData = 0x0000000000000000, Kind = star, Flags = 2)
>   }
>
> http://reviews.llvm.org/D15173
>
> Files:
>   lib/Lex/PPCaching.cpp
>   test/Parser/objcxx11-protocol-in-template.mm
>
> Index: test/Parser/objcxx11-protocol-in-template.mm
> ===================================================================
> --- test/Parser/objcxx11-protocol-in-template.mm
> +++ test/Parser/objcxx11-protocol-in-template.mm
> @@ -8,3 +8,11 @@
>
>  vector<id<P>> v;
>  vector<vector<id<P>>> v2;
> +
> +@protocol PA;
> +@protocol PB;
> +
> +@class NSArray<ObjectType>;
> +typedef int some_t;
> +
> +id<PA> FA(NSArray<id<PB>> *h, some_t group);
> Index: lib/Lex/PPCaching.cpp
> ===================================================================
> --- lib/Lex/PPCaching.cpp
> +++ lib/Lex/PPCaching.cpp
> @@ -97,13 +97,33 @@
>  void Preprocessor::AnnotatePreviousCachedTokens(const Token &Tok) {
>    assert(Tok.isAnnotation() && "Expected annotation token");
>    assert(CachedLexPos != 0 && "Expected to have some cached tokens");
> -  assert(CachedTokens[CachedLexPos-1].getLastLoc() == Tok.getAnnotationEndLoc()
> -         && "The annotation should be until the most recent cached token");
> +
> +#ifndef NDEBUG
> +  {
> +    Token CachedLastTok = CachedTokens[CachedLexPos - 1];
> +    SourceLocation CachedLastTokLoc = CachedLastTok.getLastLoc();
> +    SourceLocation TokAnnEndLoc = Tok.getAnnotationEndLoc();
> +
> +    // The annotation should be until the most recent cached token. Since
> +    // `Tok` length could be bigger than one (e.g. greatergreater '>>'), account
> +    // for that case before checking the assertion.
> +    if (CachedLastTokLoc != TokAnnEndLoc && !CachedLastTok.isAnnotation()) {
> +      CachedLastTokLoc =
> +          CachedLastTokLoc.getLocWithOffset(CachedLastTok.getLength());
> +      unsigned TokAnnEndLocSize = Lexer::MeasureTokenLength(
> +          SourceMgr.getSpellingLoc(TokAnnEndLoc), SourceMgr, LangOpts);
> +      TokAnnEndLoc = TokAnnEndLoc.getLocWithOffset(TokAnnEndLocSize);
> +    }
> +
> +    assert(CachedLastTokLoc == TokAnnEndLoc &&
> +           "The annotation should be until the most recent cached token");
> +  }
> +#endif
>
>    // Start from the end of the cached tokens list and look for the token
>    // that is the beginning of the annotation token.
>    for (CachedTokensTy::size_type i = CachedLexPos; i != 0; --i) {
> -    CachedTokensTy::iterator AnnotBegin = CachedTokens.begin() + i-1;
> +    CachedTokensTy::iterator AnnotBegin = CachedTokens.begin() + i - 1;
>      if (AnnotBegin->getLocation() == Tok.getLocation()) {
>        assert((BacktrackPositions.empty() || BacktrackPositions.back() < i) &&
>               "The backtrack pos points inside the annotated tokens!");
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits