aaron.ballman added subscribers: aaronpuchert, delesley.
aaron.ballman added a comment.

In D130055#3683279 <https://reviews.llvm.org/D130055#3683279>, @beanz wrote:

> In D130055#3683173 <https://reviews.llvm.org/D130055#3683173>, @aaron.ballman 
> wrote:
>
>> Are there circumstances where we cannot "simply" infer this from the 
>> constructor itself?
>
> I think this gets tricky in cases where objects may have valid 
> "uninitialized" state. Like a tagged union that hasn't been set might set its 
> tag value to "uninitialized" and zero its storage, which would make it look 
> like all the members are initialized to the compiler, they just happen to be 
> initialized to known sentinel values.

Ah, sentinel values in general all have that same general property, good point!

>> (After instantiation) we know the class hierarchy, we know the data members, 
>> and we know the ctor init list/in-class initializers, so it seems like we 
>> should be able to recursively mark a constructor as yolo for the user rather 
>> than making them write it themselves. It seems like inferring this would 
>> reduce false positives and false negatives (in the case the user marks the 
>> constructor incorrectly or the code gets out of sync with the marking), and 
>> we might only need this attribute in very rare circumstances (or not at all, 
>> as a user-facing attribute). Is that an approach you're considering?
>
> I hadn't thought at all about automated propagation.

My experiences with thread safety analysis annotations was that inference (or 
even a tool to automatically slap the right annotations explicitly on to 
declarations) was a huge usability win for users. This particular kind of 
diagnostic tends to be somewhat "viral" in that starting to use the annotations 
only helps if everything in some subset of your project is annotated. You can 
start from the leaf nodes in your types to get some benefit, but the most 
benefit tends to come from composition of these leaf types into more complex 
types. Reducing the number of annotations the user has to write makes this 
transition significantly easier if possible to accomplish with good fidelity.

> One of the other use cases I was thinking about was `std::optional`, where 
> you could specify that initialization to `std::nullopt` is `yolo`, and then 
> `get` becomes `kaboom`.
>
> With these annotations we can conservatively diagnose cases where an 
> optional's value is accessed when uninitialized.
>
>> Fully? Partially? I'm trying to decide whether this is only useful for 
>> functions named `init()` and `clear()` and the like, or whether this is 
>> useful for functions like `setWhatever()` where you can have a partially 
>> constructed object that is not fully initialized by any one member function.
>
> I had also thought about this too. As implemented in this patch `woot` 
> implies fully initialized, but there could be some interesting complex 
> initializations. I could imagine builder patterns where given code like 
> `Builder::Create().X().Y().Z()`, you might want `Y` to only be callable after 
> `X` and `Z` might be the required final initializer.

I was thinking about our own AST nodes where, for example, you deserialize an 
AST by creating an empty node and then filling out its structure piecemeal. I 
think partial construction is a pretty important concept to consider early in 
the design given its prevalence as a reasonable implementation strategy.

>> Same question here about fully vs partially initialized. e.g., where you 
>> have a partially constructed object and you can only call `getWhatever()` 
>> after `setWhatever()` has been called.
>
> I'm wondering if there could be a generalization of the attribute like:
>
>   class TaggedValue {
>     enum Kind {
>       Uninitialized = 0,
>       Integer,
>       Float
>     };
>     Kind VK = Uninitialized;
>   
>     union {
>       int I;
>       float F;
>     };
>   public:
>     
>     [[clang::yolo]]
>     TaggedValue() = default;
>     
>     TaggedValue(TaggedValue&) = default;
>   
>     void hasValue() { return VK == Uninitialized; } // always safe
>   
>     [[clang::woot("FloatSet"]] // Marks as safe for functions with matching 
> kaboom arguments
>     void set(float V) {
>       VK= Float;
>       F = V;
>     }
>   
>     [[clang::woot("IntSet")]] // Marks as safe for functions with matching 
> kaboom arguments
>     void set(int V) {
>       VK= Integer;
>       I = V;
>     }
>   
>       [[clang::woot]] // Marks as safe for all kaboom functions (because I'm 
> sad)
>      void zero() {
>        VK= Integer;
>        I = 0;
>      }
>   
>     [[clang::kaboom("FloatSet"]]
>     operator float() {
>       return F;
>     }
>   
>     [[clang::kaboom("IntSet")]]
>     operator int() {
>       return I;
>     }
>   };
>
> This does get into some more complex questions of whether `woot` would change 
> or append status, and I can see arguments for both so we might need 
> `appending_woot` too...

The more I look at this, the more closely it seems to resemble attributes and 
an analysis pass we already have: capability attributes: 
https://clang.llvm.org/docs/ThreadSafetyAnalysis.html#basic-concepts-capabilities
 We implemented those for thread safety analysis initially, but realized 
there's a more general principle hiding under there in terms of resources and 
capabilities for using those resources. `yolo` is acquiring an "uninitialized" 
capability, `woot` is releasing the "uninitialized" capability, and `kaboom` is 
requiring there not be an "uninitialized" capability. I'm CC'ing in 
@aaronpuchert and @delesley to see if they also agree with this assessment 
(both Aaron and DeLesley have been instrumental in this work), but I'm curious 
if @beanz had seen these and considered them or not.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130055/new/

https://reviews.llvm.org/D130055

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to