aaron.ballman added inline comments.

================
Comment at: clang/lib/Basic/Sarif.cpp:161
+    Region["endColumn"] = adjustColumnPos(
+        R.getEnd(), Lexer::MeasureTokenLength(R.getEnd().getLocWithOffset(0),
+                                              R.getEnd().getManager(), LO));
----------------
vaibhav.y wrote:
> aaron.ballman wrote:
> > vaibhav.y wrote:
> > > aaron.ballman wrote:
> > > > vaibhav.y wrote:
> > > > > aaron.ballman wrote:
> > > > > > I didn't catch this during the review -- but this is a layering 
> > > > > > violation that caused link errors on some of the build bots. Lexer 
> > > > > > can call into Basic, but Basic cannot call into Lexer. So we'll 
> > > > > > need to find a different way to handle this.
> > > > > Would moving the code to Support, having it depend on Basic & Lex 
> > > > > work?
> > > > I don't think so -- Support is supposed to be a pretty low-level 
> > > > interface; it currently only relies on LLVM's Support library. I think 
> > > > the layering is supposed to be: Support -> Basic -> Lex.
> > > > 
> > > > As I see it, there are a couple of options as to where this could live. 
> > > > It could live in the Frontend library, as that's where all the 
> > > > diagnostic consumer code in Clang lives. But that library might be a 
> > > > bit "heavy" to pull into other tools (maybe? I don't know). It could 
> > > > also live in AST -- that already links in Basic and Lex. But that feels 
> > > > like a somewhat random place for this to live as this has very little 
> > > > to do with the AST itself.
> > > > 
> > > > Another approach, which might be better, is to require the user of this 
> > > > interface to pass in the token length calculation themselves in the 
> > > > places where it's necessary. e.g., `json::Object 
> > > > whatever(SourceLocation Start, SourceLocation End, unsigned EndLen)` 
> > > > and then you can remove the reliance on the lexer entirely while 
> > > > keeping the interface in Basic. I'm not certain how obnoxious this 
> > > > suggestion is, but I think it's my preferred approach for the moment 
> > > > (but not a strongly held position yet). WDYT of this approach?
> > > I think the approach to injecting the function is better here. I've tried 
> > > to make the smallest change possiblew with passing in a function whose 
> > > interface is almost identical to `Lexer::MeasureTokenLength`. The intent 
> > > was to hint at this being the "canonical metric" for token lengths (with 
> > > an example in the tests for the same).
> > > 
> > > I tried passing start, end locs but couldn't find a strong use case yet 
> > > since `end` would likely always be: `Lexer::getLocForEndOfToken(start, 0)`
> > I'm not convinced that the less obtrusive change is a good design in this 
> > case. But I also agree that we should not use start/end *locations* either. 
> > `SourceLocation` traditionally points to the *start* of a token, so it 
> > would be super easy to get the `end` location wrong by forgetting to pass 
> > the location for the end of the token.
> > 
> > My suggestion was to continue to pass the start of the starting token, the 
> > start of the ending token, and the length of the ending token. With the 
> > callback approach, you have to call through the callback to eventually call 
> > `Lexer::MeasureTokenLength()`; with the direct approach, you skip needing 
> > to call through a callback (which means at least one less function call on 
> > every source location operation).
> Ah, I think I misunderstood your initial suggestion (`json::Object 
> whatever(SourceLocation Start, SourceLocation End, unsigned EndLen)`) seemed 
> like a function call to me, when it seems the suggested change was to pass in 
> an object? Apologies, will fix that up.
Sorry for the confusion! Just to make sure we're on the same page -- my 
suggestion was to change the function interfaces like 
`SarficDocumentWriter::createPhysicalLocation()` so that they would take an 
additional `unsigned EndLen` parameter.

However, now that I dig around a bit, it seems like `CharSourceRange` is what 
you'd want to use there -- then you can assert that what you're given is a char 
range and not a token range. So you won't need the `unsigned EndLen` parameter 
after all!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109701/new/

https://reviews.llvm.org/D109701

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to