aaron.ballman added a subscriber: erichkeane.
aaron.ballman added inline comments.


================
Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:10
+// expected-warning@+11 {{'#elfidef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elfindef' directive not found, did you mean 
'#elifdef'?}}
+// expected-warning@+11 {{'#elsi' directive not found, did you mean '#else'?}}
----------------
ken-matsui wrote:
> aaron.ballman wrote:
> > ken-matsui wrote:
> > > aaron.ballman wrote:
> > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > `#elifndef` -- is there anything that can be done for that?
> > > > 
> > > > Also, one somewhat interesting question is whether we want to recommend 
> > > > `#elifdef` and `#elifndef` outside of C2x/C++2b mode. Those directives 
> > > > only exist in the latest language standard, but Clang supports them as 
> > > > a conforming extension in all language modes. Given that this 
> > > > diagnostic is about typos, I think I'm okay suggesting the directives 
> > > > even in older language modes. That's as likely to be a correct 
> > > > suggestion as not, IMO.
> > > > It's interesting that this one suggested `#elifdef` instead of 
> > > > `#elifndef` -- is there anything that can be done for that?
> > > 
> > > I found I have to use `std::min_element` instead of `std::max_element` 
> > > because we are finding the nearest (most minimum distance) string. 
> > > Meanwhile, `#elfindef` has 2 distance with both `#elifdef` and 
> > > `#elifndef`. After replacing `std::max_element` with `std::min_element`, 
> > > I could suggest `#elifndef` from `#elfinndef`.
> > > 
> > > > Also, one somewhat interesting question is whether we want to recommend 
> > > > `#elifdef` and `#elifndef` outside of C2x/C++2b mode. Those directives 
> > > > only exist in the latest language standard, but Clang supports them as 
> > > > a conforming extension in all language modes. Given that this 
> > > > diagnostic is about typos, I think I'm okay suggesting the directives 
> > > > even in older language modes. That's as likely to be a correct 
> > > > suggestion as not, IMO.
> > > 
> > > I agree with you because Clang implements those directives, and the 
> > > suggested code will also be compilable. I personally think only not 
> > > compilable suggestions should be avoided. (Or, we might place additional 
> > > info for outside of C2x/C++2b mode like `this is a C2x/C++2b feature but 
> > > compilable on Clang`?)
> > > 
> > > ---
> > > 
> > > According to the algorithm of `std::min_element`, we only get an iterator 
> > > of the first element even if there is another element that has the same 
> > > distance. So, `#elfindef` only suggests `#elifdef` in accordance with the 
> > > order of `Candidates`, and I don't think it is beautiful to depend on the 
> > > order of candidates. I would say that we can suggest all the same 
> > > distance like the following, but I'm not sure this is preferable:
> > > 
> > > ```
> > > #elfindef // diag: unknown directive, did you mean #elifdef or #elifndef?
> > > ```
> > > 
> > > I agree with you because Clang implements those directives, and the 
> > > suggested code will also be compilable. I personally think only not 
> > > compilable suggestions should be avoided. (Or, we might place additional 
> > > info for outside of C2x/C++2b mode like this is a C2x/C++2b feature but 
> > > compilable on Clang?)
> > 
> > I may be changing my mind on this a bit. I now see we don't issue an 
> > extension warning when using `#elifdef` or `#elifndef` in older language 
> > modes. That means suggesting those will be correct but only for Clang, and 
> > the user won't have any other diagnostics to tell them about the 
> > portability issue. And those particular macros are reasonably likely to be 
> > used in a header where the user is trying to aim for portability. So I'm 
> > starting to think we should only suggest those two in C2x mode (and we 
> > should probably add a portability warning for #elifdef and #elifndef, so I 
> > filed: https://github.com/llvm/llvm-project/issues/55306)
> > 
> > > I would say that we can suggest all the same distance like the following, 
> > > but I'm not sure this is preferable:
> > 
> > The way we typically handle this is to attach FixIt hints to a note instead 
> > of to the diagnostic. This way, automatic fixes aren't applied (because 
> > there are multiple choices to pick from) but the user is still able to 
> > apply whichever fix they want in an IDE or other tool. It might be worth 
> > trying that approach (e.g., if there's only one candidate, attach it to the 
> > warning, and if there are two or more, emit a warning without a "did you 
> > mean" in it and use a new note for the fixit. e.g.,
> > ```
> > #elfindef // expected-warning {{invalid preprocessing directive}} \
> >              expected-note {{did you mean '#elifdef'?}} \
> >              expected-note {{did you mean '#elifndef'?}}
> > ```
> > WDYT?
> > I may be changing my mind on this a bit. I now see we don't issue an 
> > extension warning when using `#elifdef` or `#elifndef` in older language 
> > modes. That means suggesting those will be correct but only for Clang, and 
> > the user won't have any other diagnostics to tell them about the 
> > portability issue. And those particular macros are reasonably likely to be 
> > used in a header where the user is trying to aim for portability. So I'm 
> > starting to think we should only suggest those two in C2x mode (and we 
> > should probably add a portability warning for #elifdef and #elifndef, so I 
> > filed: https://github.com/llvm/llvm-project/issues/55306)
> >
> 
> Certainly, it would be less confusing to the user to avoid suggestions 
> regarding those two.
> I'm going to fix my code to avoid suggesting them in not C2x mode.
> 
> Thank you for submitting the issue, I also would like to work on it.
> 
> > The way we typically handle this is to attach FixIt hints to a note instead 
> > of to the diagnostic. This way, automatic fixes aren't applied (because 
> > there are multiple choices to pick from) but the user is still able to 
> > apply whichever fix they want in an IDE or other tool. It might be worth 
> > trying that approach (e.g., if there's only one candidate, attach it to the 
> > warning, and if there are two or more, emit a warning without a "did you 
> > mean" in it and use a new note for the fixit. e.g.,
> > ```
> > #elfindef // expected-warning {{invalid preprocessing directive}} \
> >              expected-note {{did you mean '#elifdef'?}} \
> >              expected-note {{did you mean '#elifndef'?}}
> > ```
> > WDYT?
> 
> This is really cool, but don't you care about the redundancy of `did you 
> mean` in terms of the llvm team culture?
> If not, I will implement notes like the above.
> Certainly, it would be less confusing to the user to avoid suggestions 
> regarding those two. I'm going to fix my code to avoid suggesting them in not 
> C2x mode.

+1, thank you!

> This is really cool, but don't you care about the redundancy of did you mean 
> in terms of the llvm team culture? If not, I will implement notes like the 
> above.

I would care if the list were potentially unbounded (like, say, with 
identifiers in general), but because we know this list will only have a max of 
two entries on it in this case, it seems reasonable to me. I double-checked 
with @erichkeane to see if he thought it would be an issue, and he agreed that 
it being a fixed list makes it pretty reasonable.

However, that discussion did raise a question -- why are there two suggestions? 
elifdef requires a swap + delete and elifndef requires just a swap, so we would 
have thought that it would have been the only option in the list.


================
Comment at: llvm/include/llvm/ADT/StringRef.h:267
+      for (StringRef C : Candidates) {
+        size_t CurDist = edit_distance(C, false);
+        if (CurDist <= MaxDist) {
----------------
ken-matsui wrote:
> It seems my previous comment that I posted on deleted file disappeared.
> 
> Could you please check the following link comment?
> 
> https://reviews.llvm.org/D124726#3493222
Thank you for calling this out, I had missed it! I'll respond here.

> Should I keep going with this implementation or not?

I don't think we should add another implementation of Levenshtein distances; we 
should stick with the base functionality, which is what's used by the existing 
typo correction logic: 
https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaLookup.cpp#L4308

Ideally (but I'm not asking you to do this because it's likely a very big ask), 
we'd generalize the `TypoCorrectionConsumer` and related classes so that it can 
be used during Lex, Parse, or Sema with callbacks from the typo correction 
consumer to obtain the list of potential names for correction. However, that 
functionality is so tightly tied to Sema, it may not be particularly plausible.

One thing I noticed is that your implementation does not allow replacements, 
while the typo correction does. I think that likely explains the behavioral 
difference you're seeing between implementations.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to