[PATCH] D72874: [clangd] Add a textual fallback for go-to-definition

Sam McCall via Phabricator via cfe-commits Fri, 06 Mar 2020 04:24:45 -0800

sammccall added a comment.

I'd like to sync up briefly on https://github.com/clangd/clangd/issues/241 so 
we know where we want to end up.


I think this is in good shape and certainly doesn't need a bigger scope, just 
want to be able to reason about how things will fit together.



================
Comment at: clang-tools-extra/clangd/SourceCode.h:93
+/// the entire comment or string token.
+SourceRange getWordAtPosition(const Position &Pos, const SourceManager &SM,
+                              const LangOptions &LangOpts);
----------------
nridge wrote:
> sammccall wrote:
> > sammccall wrote:
> > > consider moving the isLikelyToBeIdentifier check inside. The current API 
> > > is pretty general and it's not clear yet what (else) it's good for so 
> > > it's nice to direct towards intended usage.
> > > 
> > > Also doing the identifier check inside this function is more convenient 
> > > when it relies on markers outside the identifier range (like doxygen `\p` 
> > > or backtick-quoted identifiers)
> > > 
> > > That said, you may still want to return the range when it's not a likely 
> > > identifier, with a signature like `StringRef getWordAtPosition(bool 
> > > *LikelyIdentifier = nullptr)`. I'm thinking of the future case where the 
> > > caller wants to find a nearby matching token and resolve it - resolving 
> > > belongs in the caller so there's not much point having this function 
> > > duplicate the check.
> > This doesn't use the SourceManager-structure of the file, so the natural 
> > signature would be `StringRef getWordAtPosition(StringRef Code, unsigned 
> > Offset)`.
> > 
> > (what are the practical cases where langopts is relevant?)
> Now that I'm using `wordTouching()` from D75479, I think this comment no 
> longer applies?
I think the reasons still apply - D75479 doesn't need to check likelihood (it 
considers actual use as identifier evidence enough) so I didn't include it 
there, but we should eventually merge these more thoroughly I think. No need to 
do that until we actually want to implement different heuristics though.


================
Comment at: clang-tools-extra/clangd/XRefs.cpp:240
+  Req.ProximityPaths = {MainFilePath};
+  // FIXME: Set Req.Scopes to the lexically enclosing scopes.
+  // For extra strictness, consider AnyScope=false.
----------------
nridge wrote:
> sammccall wrote:
> > FWIW the API for this is visibleNamespaces() from SourceCode.cpp.
> > (No enclosing classes, but I suspect we can live without them once we have 
> > a nearby-tokens solution too)
> Thanks, that's convenient!
> 
> Out of curiosity, though: is the reason to prefer this lexer-based approach 
> over hit-testing the query location against `NamespaceDecl`s in the AST, 
> mainly for performance?
Well, it was written for fallback code completion when we have no AST at all :-)

Gathering from the AST should be better, though it's not quite as simple as 
hit-testing (you also have to find `using namespace`). But this exists today, 
which is a feature!


================
Comment at: clang-tools-extra/clangd/XRefs.cpp:258
+
+    // For now, only consider exact name matches, including case.
+    // This is to avoid too many false positives.
----------------
nridge wrote:
> sammccall wrote:
> > I wouldn't bother qualifying this as "for now". Any code is subject to 
> > change in the future, but requiring an exact name match for index-based 
> > results seems more like a design decision than a fixme.
> Do we want to rule out the possibility of handling typos in an identifier 
> name in a comment (in cases where we have high confidence in the match, e.g. 
> a long / unique name, small edit distance, only one potential match) in the 
> future?
> 
> This is also relevant to whether we want to keep the `FuzzyMatcher` or not.
No idea whether typo-correction is a good idea in principle - tradeoff between 
current false negatives and false positives+compute.

However neither FuzzyMatcher nor the existing index implementations support/can 
easily support real typo correction, and it seems implausible to me we'd add it 
for this feature.

Compare to e.g:
 - allowing case-insensitive match in some cases: `fooBar` vs `FooBar` is a 
plausible "typo". This is easy to implement.
 - correct the typo where we see the fixed version used as an identifier in 
this file (and not the original). Excludes some cases, but drives 
false-positives way down, and easy to implement.

I don't think we need to rule things out, but I'm uncertain enough about the 
approach to think that putting comments, fuzzymatcher etc here speculatively 
isn't worth it.


================
Comment at: clang-tools-extra/clangd/XRefs.cpp:489
+  Index->fuzzyFind(Req, [&](const Symbol &Sym) {
+    auto MaybeDeclLoc =
+        symbolLocationToLocation(Sym.CanonicalDeclaration, MainFilePath);
----------------
nridge wrote:
> Sorry this location-setting code is so messy. All my attempts to make it more 
> concise have been thwarted by `llvm::Expected`'s very restrictive API.
Ugh, don't get me started on Error/Expected :-(

I'd love to get rid of it somehow, but it seems like we'd inevitably just end 
up with the new thing + Error/Expected + error_code/ErrorOr + return-a-bool, 
and I'm not sure it'd be better.

(If you have more energy than me, I'd enthusiastically +1 an llvm-dev proposal 
to drop the clever checks from llvm::Error, and I know some others who would...)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72874/new/

https://reviews.llvm.org/D72874



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72874: [clangd] Add a textual fallback for go-to-definition

Reply via email to