NoQ added a comment.

I'll just leave this tiny bit of analysis here, that we did on the whiteboard a 
while ago.

> The "intermediate" overload `void f(std::span<int>, int *)` stays there but 
> usually is not useful. If there are more than two parameters need to be 
> fixed, there will be more such useless "intermediate" overloads.

This isn't even the biggest problem. The bigger problem is that //the process 
doesn't actually stop at Step 2//.

Let me rewrite this more slowly.

Step 0 - the initial code:

  // original body, original signature
  void foo(int *p, int *q, size_t sz) {
     int i = p[5]; // warning: unsafe buffer usage over 'p'
     int j = q[5]; // warning: unsafe buffer usage over 'q'
  }

Step 1 - fix parameter `p`:

  // FIXED: original body, new signature
  void foo(span<int> p, int *q, size_t sz) {
     int i = p[5];
     int j = q[5]; // warning: unsafe buffer usage over 'q'
  }
  
  // NEW: compatibility overload from step 1 - original signature
  [[unsafe_buffer_usage]]
  void f(int *p, int *q, size_t sz) {
     foo(span<int>(p, <#size#>), q, sz);
  }

Step 2 - fix parameter `q` as well:

  // FIXED: original code, even newer signature
  void foo(span<int> p, span<int> q, size_t sz) {
     int i = p[5];
     int j = q[5];
  }
  
  // NEW: compatibility overload from step 2
  [[unsafe_buffer_usage]]
  void foo(span<int> p, int *q, size_t sz) {
     foo(p, span<int>(q, <#size#>), sz);
  }
  
  // UNCHANGED: compatibility overload from step 1 - still carries original 
signature
  [[unsafe_buffer_usage]]
  void foo(int *p, int *q, size_t sz) {
     foo(span<int>(p, sz), q, sz); // warning: unsafe buffer usage over 'q'
  }

Step 3 - now we have a warning inside the overload from step 1 because it's no 
longer calling the newest safe function, but it's now calling the compatibility 
overload from step 2. So we need to change the parameter `q` of the bottom-most 
function to `span`.

  // UNCHANGED: original code, even newer signature - unchanged
  void foo(span<int> p, span<int> q, size_t sz) {
     int i = p[5];
     int j = q[5];
  }
  
  // UNCHANGED: compatibility overload from step 2
  [[unsafe_buffer_usage]]
  void foo(span<int> p, int *q, size_t sz) {
     foo(p, span<int>(q, sz), sz);
  }
  
  // FIXED: compatibility overload from step 1 - new safe signature
  [[unsafe_buffer_usage]]
  void foo(int *p, span<int> q, size_t sz) {
     foo(span<int>(p, sz), q, sz);
  }
  
  // NEW: compatibility overload from step 3 - original signature of the 
overload from step 1, aka simply original signature
  [[unsafe_buffer_usage]]
  [[unsafe_buffer_usage]] // sic!
  void foo(int *p, int *q, size_t sz) {
     foo(p, span<int>(q, <#size#>), sz); // warning: unsafe buffer usage over 
'p'
  }

There are a few things that look wrong here.

The set of overloads still makes sense; we get ourselves 4 overloads in all 
combinations of spans and raw pointers. This on its own makes sense.

We've also somehow got the new raw overload (with two raw pointers) carry //two 
instances// of `[[unsafe_buffer_usage]]`. It lowkey makes sense because the 
function is unsafe with respect to both parameters, but we probably don't want 
people to write such code.

Now, here's where it gets really weird. You'd probably expect that new raw 
overload to call `foo(span<int>(p, sz), span<int>(q, sz), sz)`. If it just did 
that, it'd be perfectly reasonable code. But it doesn't do that! Indeed, by 
design the autogenerated compatibility overload calls the freshly-fixed 
function. Which in case of Step 3 was `foo(int *p, span<int> q, size_t sz)`.

Also, the new raw overload //already has a warning in it//! This is completely 
unacceptable. This happened because the function under fix was already 
annotated as `[[unsafe_buffer_usage]]`. The autofix wouldn't remove the 
attribute; the attribute is still completely appropriate there.

Finally, because there's a new warning, //the process doesn't stop after Step 3 
either//. I'll leave imagining what Step 4 would look like as an easy exercise 
to the reader ^.^

----

So one way or another, we really don't want to be in the business of 
incrementally fixing parameters one-by-one. Fixing all parameters at once is 
much easier than fixing them incrementally and discarding all analysis state on 
each step. Analysis of "partially spanified" code is much harder than analysis 
of raw C-style code, so we prefer single-step transformation in most cases.

Another direction that we can explore is to avoid auto-fixing functions that 
already carry `[[unsafe_buffer_usage]]`. Indeed, it's already a compatibility 
overload and we know it; it's not *supposed* to be safe! This will prevent Step 
3 from breaking things, so it'll leave us with 3 overloads - not ideal, but not 
incorrect either. It'll also eliminate the "double attribute" problem.

And we ultimately need to explore this direction because no matter how hard we 
avoid automatic partial transformations, people will still do manual partial 
transformations! And when they do, we need to demonstrate some reasonable 
behavior.

But for now let's focus on the basic use case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153059/new/

https://reviews.llvm.org/D153059

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to