On Wed, Jan 6, 2016 at 4:22 PM, Bruno Cardoso Lopes <bruno.card...@gmail.com
> wrote:

> bruno updated this revision to Diff 44180.
> bruno added a comment.
>
> Hi Richard,
>
> Thanks for the comments. Updated the patch!
>
> In http://reviews.llvm.org/D15173#313235, @rsmith wrote:
>
> > I think that this will leave us with a broken token stream. In your
> example, the cached token stream starts as
> >
> >   `NSArray` `<` `id` `<` `PB` `>>` `*` [...]
> >
> >
> > ... and we try to annotate the `id<PB>` with our `CachedLexPos` pointing
> at the `*` token. That leaves `CachedTokens` containing:
> >
> >   `NSArray` `<` `(type annotation)` `*` [...]
> >
> >
> > ... which is wrong. We need to actually convert the
> `tok::greatergreater` in `CachedTokens` into a `tok::greater` and update
> its location and length in order for the cached token stream to be
> correctly updated. Otherwise if the parser backtracks it will see the wrong
> token stream.
>
>
> I don't think this happens, the type annotation starts at 7:11 and ends at
> 7:24:
> identifier 'NSArray'            Loc=<testcase.mm:7:11>
> greatergreater '>>'             Loc=<testcase.mm:7:24>
>
> The code that follows the assert then patches the CachedTokens the correct
> way, see the CachedTokens before and after:
>

This example doesn't show the above problem, because it's annotating the
NSArray<...> type, not the id<...> type. On reflection, the problem that
I'm concerned about won't actually trigger the assertion here (except in
weird and rare cases like annotating the middle > in a >>> token), but it's
really caused by the same underlying bug. In practice, things will go wrong
in two ways when we want to annotate half of a split token such as `>>`:

1) If the annotation ends in the middle of a split token, we'll currently
annotate the entire token and lose the second half if we backtrack
2) If the annotation ends at the end of a split token, we'll assert because
the start of the final token is not on a token boundary in the cached token
stream

The underlying bug is that splitting a token is not properly updating the
token cache. Assuming we never want to split a token, then backtrack over
the token, then choose to not split it, the best fix would seem to be to
update the cached token stream at the point when we perform the split
(replace the tok::greatergreater with two tok::greaters in CachedTokens).
And I think that's a correct assumption, as we only consider splitting when
we know we're parsing a template argument list, and in that context the
token is always split.


> - Before:
>
> (clang::Preprocessor::CachedTokensTy) $32 = {
>
>   [0] = (Loc = 89, UintData = 1, PtrData = 0x0000000000000000, Kind =
> l_paren, Flags = 0)
>   [1] = (Loc = 90, UintData = 7, PtrData = 0x000000010d82fba0, Kind =
> identifier, Flags = 0)
>   [2] = (Loc = 97, UintData = 1, PtrData = 0x0000000000000000, Kind =
> less, Flags = 0)
>   [3] = (Loc = 98, UintData = 2, PtrData = 0x000000010d01da58, Kind =
> identifier, Flags = 0)
>   [4] = (Loc = 100, UintData = 1, PtrData = 0x0000000000000000, Kind =
> less, Flags = 0)
>   [5] = (Loc = 101, UintData = 2, PtrData = 0x000000010d82fb70, Kind =
> identifier, Flags = 0)
>   [6] = (Loc = 103, UintData = 2, PtrData = 0x0000000000000000, Kind =
> greatergreater, Flags = 0)
>   [7] = (Loc = 106, UintData = 1, PtrData = 0x0000000000000000, Kind =
> star, Flags = 2)
>
> }
>
> - After:
>
> (clang::Preprocessor::CachedTokensTy) $34 = {
>
>   [0] = (Loc = 89, UintData = 1, PtrData = 0x0000000000000000, Kind =
> l_paren, Flags = 0)
>   [1] = (Loc = 90, UintData = 104, PtrData = 0x000000010d820660, Kind =
> annot_typename, Flags = 0)
>   [2] = (Loc = 106, UintData = 1, PtrData = 0x0000000000000000, Kind =
> star, Flags = 2)
>
> }
>
>
> http://reviews.llvm.org/D15173
>
> Files:
>   lib/Lex/PPCaching.cpp
>   test/Parser/objcxx11-protocol-in-template.mm
>
> Index: test/Parser/objcxx11-protocol-in-template.mm
> ===================================================================
> --- test/Parser/objcxx11-protocol-in-template.mm
> +++ test/Parser/objcxx11-protocol-in-template.mm
> @@ -8,3 +8,11 @@
>
>  vector<id<P>> v;
>  vector<vector<id<P>>> v2;
> +
> +@protocol PA;
> +@protocol PB;
> +
> +@class NSArray<ObjectType>;
> +typedef int some_t;
> +
> +id<PA> FA(NSArray<id<PB>> *h, some_t group);
> Index: lib/Lex/PPCaching.cpp
> ===================================================================
> --- lib/Lex/PPCaching.cpp
> +++ lib/Lex/PPCaching.cpp
> @@ -97,13 +97,33 @@
>  void Preprocessor::AnnotatePreviousCachedTokens(const Token &Tok) {
>    assert(Tok.isAnnotation() && "Expected annotation token");
>    assert(CachedLexPos != 0 && "Expected to have some cached tokens");
> -  assert(CachedTokens[CachedLexPos-1].getLastLoc() ==
> Tok.getAnnotationEndLoc()
> -         && "The annotation should be until the most recent cached
> token");
> +
> +#ifndef NDEBUG
> +  {
> +    Token CachedLastTok = CachedTokens[CachedLexPos - 1];
> +    SourceLocation CachedLastTokLoc = CachedLastTok.getLastLoc();
> +    SourceLocation TokAnnEndLoc = Tok.getAnnotationEndLoc();
> +
> +    // The annotation should be until the most recent cached token. Since
> +    // `Tok` length could be bigger than one (e.g. greatergreater '>>'),
> account
> +    // for that case before checking the assertion.
> +    if (CachedLastTokLoc != TokAnnEndLoc &&
> !CachedLastTok.isAnnotation()) {
> +      CachedLastTokLoc =
> +          CachedLastTokLoc.getLocWithOffset(CachedLastTok.getLength());
> +      unsigned TokAnnEndLocSize = Lexer::MeasureTokenLength(
> +          SourceMgr.getSpellingLoc(TokAnnEndLoc), SourceMgr, LangOpts);
> +      TokAnnEndLoc = TokAnnEndLoc.getLocWithOffset(TokAnnEndLocSize);
> +    }
> +
> +    assert(CachedLastTokLoc == TokAnnEndLoc &&
> +           "The annotation should be until the most recent cached token");
> +  }
> +#endif
>
>    // Start from the end of the cached tokens list and look for the token
>    // that is the beginning of the annotation token.
>    for (CachedTokensTy::size_type i = CachedLexPos; i != 0; --i) {
> -    CachedTokensTy::iterator AnnotBegin = CachedTokens.begin() + i-1;
> +    CachedTokensTy::iterator AnnotBegin = CachedTokens.begin() + i - 1;
>      if (AnnotBegin->getLocation() == Tok.getLocation()) {
>        assert((BacktrackPositions.empty() || BacktrackPositions.back() <
> i) &&
>               "The backtrack pos points inside the annotated tokens!");
>
>
>
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to