[PATCH] D135472: [ODRHash] Hash attributes on declarations.

Aaron Ballman via Phabricator via cfe-commits Thu, 13 Oct 2022 08:17:54 -0700

aaron.ballman added inline comments.


================
Comment at: clang/lib/AST/ODRHash.cpp:479-480
+
+  llvm::copy_if(D->attrs(), std::back_inserter(HashableAttrs),
+                [](const Attr *A) { return !A->isImplicit(); });
+}
----------------
ChuanqiXu wrote:
> aaron.ballman wrote:
> > ChuanqiXu wrote:
> > > vsapsai wrote:
> > > > vsapsai wrote:
> > > > > erichkeane wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > ChuanqiXu wrote:
> > > > > > > > vsapsai wrote:
> > > > > > > > > I'm not sure `isImplicit` is the best indicator of attributes 
> > > > > > > > > to check, so suggestions in this area are welcome. I think we 
> > > > > > > > > can start strict and relax some of the checks if needed.
> > > > > > > > > 
> > > > > > > > > If people have strong opinions that some attributes shouldn't 
> > > > > > > > > be ignored, we can add them to the tests to avoid 
> > > > > > > > > regressions. Personally, I believe that alignment and packed 
> > > > > > > > > attributes should never be silently ignored.
> > > > > > > > Agreed. I feel `isImplicit` is enough for now.
> > > > > > > The tricky part is -- sometimes certain attributes add additional 
> > > > > > > implicit attributes and those implicit attributes matter 
> > > > > > > (https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L9380).
> > > > > > >  And some attributes seem to just do the wrong thing entirely: 
> > > > > > > https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L7344
> > > > > > > 
> > > > > > > So I think `isImplicit()` is a good approximation, but I'm more 
> > > > > > > wondering what the principle is for whether an attribute should 
> > > > > > > or should not be considered part of the ODR hash. Type 
> > > > > > > attributes, attributes that impact object layout, etc all seem 
> > > > > > > like they should almost certainly be part of ODR hashing. Others 
> > > > > > > are a bit more questionable though.
> > > > > > > 
> > > > > > > I think this is something that may need a per-attribute flag in 
> > > > > > > Attr.td so attributes can opt in or out of it because I'm not 
> > > > > > > certain what ODR issues could stem from `[[maybe_unused]]` or 
> > > > > > > `[[deprecated]]` disagreements across module boundaries.
> > > > > > I don't think 'isImplicit' is particularly good.  I think the idea 
> > > > > > of auto-adding 'type' attributes and having the 'rest' be analyzed 
> > > > > > to figure out which are important.
> > > > > > 
> > > > > > Alternatively, I wonder if we're better off just adding ALL 
> > > > > > attributes and seeing what the fallout is. We can later decide when 
> > > > > > we don't want them to be ODR significant (which, might be OTHERWISE 
> > > > > > meaningful later!).
> > > > > One criteria to decide which attributes should be hashed is if they 
> > > > > affect IRGen. But that's not exhaustive and I'm not sure how 
> > > > > practical it is.
> > > > > 
> > > > > The rule I'm trying to follow right now is if declarations with 
> > > > > different attributes can be substituted instead of each other. For 
> > > > > example, structs with different memory layout cannot be interchanged, 
> > > > > so it is reasonable to reject them.
> > > > > 
> > > > > But maybe we should review attributes on case-by-case basis. For 
> > > > > example, for `[[deprecated]]` I think the best for developers is not 
> > > > > to complain about it but merge the attributes and then have regular 
> > > > > diagnostic about using a deprecated entity.
> > > > One option was to hash `isa<InheritedAttr>` attributes as they are 
> > > > "sticky" and should be more significant, so shouldn't be ignored. But I 
> > > > don't think this argument is particularly convincing.
> > > > 
> > > > At this stage I think we can still afford to add all attributes and 
> > > > exclude some as-needed because modules adoption is still limited. 
> > > > Though I don't have strong feelings about this approach, just getting 
> > > > more restrictive later can be hard.
> > > It looks not a bad idea to add all attributes as experiments, @vsapsai 
> > > how do you feel about this?
> > My intuition is that we're going to want this to be controlled from Attr.td 
> > on a case-by-case basis and to automatically generate the ODR hash code for 
> > attribute arguments.
> > 
> > We can either force every attribute to decide explicitly (this seems pretty 
> > disruptive as a final design but would be a really good way to ensure we 
> > audit all attributes if done as an experiment), or we can pick a default 
> > behavior to ODR hash/not. I suspect we're going to want to pick a default 
> > behavior based on whatever the audit tells us the most common situation is.
> > 
> > I think we're going to need something a bit more nuanced than "this 
> > attribute matters for ODR" because there are attribute arguments that don't 
> > always contribute to the entity (for example, we have "fake" arguments that 
> > are used to give the semantic attribute more information, there may also be 
> > cases where one argument matter and another argument does not such as 
> > `enable_if` where the condition matters greatly but the message doesn't 
> > matter much). So we might need a marking on the attribute level *and* on 
> > the parameter level to determine what all factors into attribute identity 
> > for generating the ODR hashing code. Hopefully we can avoid needing more 
> > granularity than that.
> I feel the default behavior would be to do ODR hashes. I just took a quick 
> look in Attr.td. And I feel most of them affecting the code generation. For 
> example, there're a lot of attributes which is hardware related.
> 
> > because there are attribute arguments that don't always contribute to the 
> > entity
> 
> When you're talking about "entities", I'm not sure if you're talking the 
> "entities" in the language spec or a general abstract ones. I mean, for 
> example, the `always_inline` attribute doesn't contribute to the declaration 
> from the language perspective. But it'll be super odd if we lost it. So I 
> feel it may be better to change the requirement to be "affecting code 
> generation"
> 
> Also, to be honest, I am even not sure if we should take "affecting code 
> generation " as the requirement. For example, for the `preferred_name` 
> attribute, this is used for better printing names. It doesn't affect code 
> generation nor the entity you mentioned. But if we drop it, then the printing 
> name may not be what users want. I'm not sure if this is desired.
> I feel the default behavior would be to do ODR hashes. I just took a quick 
> look in Attr.td. And I feel most of them affecting the code generation. For 
> example, there're a lot of attributes which is hardware related.

Hmmm, I'm seeing a reasonably even split. We definitely have a ton of 
attributes like multiversioning, interrupts, calling conventions, availability, 
etc that I think all should be part of an ODR hash. We also have a ton of other 
ones that don't seem like they should matter for ODR hashing like hot/cold, 
consumed/returns retained, constructor/destructor, deprecated, nodiscard, 
maybe_unused, etc.

Then we have the really fun ones where I think we'll want them to be in the ODR 
hash but maybe we don't, like `[[clang::annotate]]`, restrict, etc.

So I think we really do need an actual audit of the attributes to decide what 
the default should be (if there should even be one).

> When you're talking about "entities", I'm not sure if you're talking the 
> "entities" in the language spec or a general abstract ones. I mean, for 
> example, the always_inline attribute doesn't contribute to the declaration 
> from the language perspective. But it'll be super odd if we lost it. So I 
> feel it may be better to change the requirement to be "affecting code 
> generation"

Not in a language spec meaning -- I meant mostly that there's some notion of 
identify for the one definition rule and not all attribute arguments contribute 
to that identity the same way.

I don't know that "affecting code gen" really captures the whole of it. For 
example, whether a function is/is not marked hot or cold affects codegen, but 
really has nothing to do with the identity of the function (it's the same 
function whether the hint is there or not). `preferred_name` is another example 
-- do we really think a structure with that attribute is fundamentally a 
different thing than one without that attribute? I don't think so.

To me, it's more about what cross-TU behaviors you'll get if the attribute is 
only present on one side. Things like ABI tags, calling conventions, attributes 
that behave like qualifiers, etc all contribute to the identity of something 
where a mismatch will impact the correctness of the program. But things like 
optimization hints and diagnostic hints don't seem like they contribute to the 
identity of something because they don't impact correctness.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135472/new/

https://reviews.llvm.org/D135472

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D135472: [ODRHash] Hash attributes on declarations.

Reply via email to