sammccall accepted this revision.
sammccall added a comment.

Sorry about the delay, I've been juggling too many things :-(

TL;DR: yes, I think it's reasonable. I think the implementation complexity is 
justified by what we get. Thanks for explaining!

I think to minimize interface complexity, we should basically consider this to 
just weaken the invariants of TypedefType etc, and have comments reflecting 
that. Accordingly like to see `inDivergent()` inverted and named something like 
`typeMatchesDecl()` if it needs to be exposed at all.

But since this is just naming nits and I'm having trouble staying responsive, 
LGTM from my side once you feel it's clear enough.

---

In D133468#3804290 <https://reviews.llvm.org/D133468#3804290>, @mizvekov wrote:

>   template <class T> struct A { using type1 = T; };
>   using Int = int;
>   
>   using type2 = A<Int>::type1;
>
> [...] We want the underlying type of `type2` to have that `Int` sugar. [...] 
> The problem here is that the TypedefType is currently limited to having the 
> declaration's underlying type, which is just from an instantiation of 
> `A<int>`, and knows nothing about camel case `Int`.

Thanks, this is compelling! Couple of observations, not objections:

- applications often print the fully-sugared type (`type2`) and/or the 
canonical type (`int`) with nothing in between. This might limit the benefits - 
I've often wished we had better & sharable heuristics for how much desugaring 
users want.
- in this simple example we could naively solve the problem by having the class 
instantiation that it was instantiated with `Int` sugar, and making the 
original underlying type of A<int>::type1 as `Int`. (There are downsides here, 
you can obviously get the "wrong" sugar if there are multiple instantiation 
sites).

Anyway, I don't really want to argue scope/design, you've clearly thought way 
more about this, just want to give you more to think about. Question for this 
patch is just cost vs benefits.

> This is similar to the infamous problem with `std::string` turning into 
> `std::basic_string` in templates.

Yes! Though half of my `basic_string` sugar problems are things like `substr`'s 
return type. This seems harder to solve in a principled way, though maybe 
heuristically doable when the written return type uses the injected-class-name? 
Anyway, again offtopic for this patch...

> We could emit a new TypedefDecl with the underlying type that we want, but 
> creating declarations is expensive, so we want to avoid that.
> We could create a new AST node that represented more accurately what was 
> happening. But the question is, is this worth the cost?
> Do you see use cases for this yourself?

From the API here, "divergent" exposed as a property etc, I assumed you 
expected some clients to handle it. And I also expected that it would end up 
being in more nodes than TypedefType and UsingType - it's hard to see why it 
stops there.
In that case an additional node could improve code complexity: 
isolates+centralizes the code handling this case, makes it harder to miss one 
kind of divergent node, keeps the rest of the AST simpler for code that doesn't 
care about resugaring.

But it sounds more like we expect ~no programmatic clients to care, and that 
this change is mostly:

- drop the assumption that decl & type have the same sugar, because of 
resugaring
- add isDivergent() just to enable the text dump to provide hints to human 
analysis

> But the question is, is this worth the cost?

If you're talking about *runtime* cost - it's not obvious that it's 
significantly more expensive? IIUC wherever types are resugared we're having to 
duplicate the chain of sugar (e.g. a function that returns T** is going to be 
instantiated as functiontype(pointertype(pointertype(typedeftype(int)))) and 
then resugared as 
functiontype(pointertype(pointertype(/*divergent*/typedeftype(typedeftype(int))))).
 The difference between enlarging typedeftype with trailing objects vs adding 
an extra node feels like a drop in the bucket. That said, I can't think of a 
way to do this that's both compact *and* much simpler, and simplicity would be 
the point.

(I'm not really asking you to measure this: I know it's a bunch of work. Just 
want to make sure that if we're deciding about cheap vs expensive we're probing 
our intuition a bit)

> We want resugaring to be cheap and introduce as little changes in the AST as 
> possible, to get a better chance of affording to have it to be on all the 
> time.

Yes. In fact I've been scared by the discussion of "maybe we should have flags 
to ramp up/down sugar" as it's hard enough to write AST-wrangling code 
correctly against a single configuration :-)

> It does not seem too far fetched that the UsingType could point to a 
> resugared version of the declaration's underlying type, without having to 
> create a new declaration.

OK, I think I'm sold on this.
I think it's conceptually simpler to say "this may be the case" (weaken 
invariants) than say "there are two flavors of TypedefType: strong-invarant and 
weak-invariant". Divergence is then just an implementation detail (IOW 
non-divergence is a storage optimization).
I don't have a strong opinion on whether this *only* means change the emphasis 
of the documentation, or we should also remove the `isDivergent()` accessor. If 
there's another way to achieve the debugging goals I'd lean toward the latter.



================
Comment at: clang/include/clang/AST/Type.h:4492
   UsingShadowDecl *getFoundDecl() const { return Found; }
+  bool isDivergent() const { return UsingBits.isDivergent; }
   QualType getUnderlyingType() const;
----------------
IIUC clients are *not* expected to care about this.
This isn't obvious, I think we should do one of:
 - remove the accessor
 - make it private and friend the things that need it
 - add a comment that makes it clear this is a detail (e.g. describe it as a 
storage optimization for simple cases)

I think it's worth inverting the boolean sense, e.g. call it 
`UsesDeclaredType`: if clients should assume in general that these have 
different sugar, it's when they *match* that's the special case.


================
Comment at: clang/include/clang/AST/Type.h:4493
+  bool isDivergent() const { return UsingBits.isDivergent; }
   QualType getUnderlyingType() const;
 
----------------
I think getUnderlyingType() is the right place to document this new wrinkle in 
the AST.

Something like:
```
The returned type might have different sugar than the `getFoundDecl()`.
This occurs e.g. when we "resugar" types accessed through template 
instantiations.
```


================
Comment at: clang/lib/AST/ASTContext.cpp:4656
+  }
+  if (Underlying.isNull() || Decl->getUnderlyingType() == Underlying)
+    return QualType(Decl->TypeForDecl, 0);
----------------
Probably worth a comment somewhere in this function: we can have different 
underlying sugar types for the same decl, TypeForDecl points to the one whose 
sugar matches the decl.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133468/new/

https://reviews.llvm.org/D133468

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to