[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

Artem Dergachev via cfe-commits Mon, 06 May 2024 18:39:01 -0700

haoNoQ wrote:

> I've just left here what my thought process was when I dig into a similar 
> case. It might be useful, who knows.


Medium-to-long-term I think an attribute-based approach might make sense there:
- Either annotate reference-counting pointers as "hey I'm a smart pointer (I'm 
following a different safety model in which symbolic execution based analysis 
would do more harm than good) so please ignore memory managed by me";
- Or annotate reference-counted objects (i.e. objects that contain an intrusive 
reference count and are always managed by intrusive reference-counted pointers) 
with an attribute "hey I'm always well-managed, please ignore my allocations 
entirely (for the same reason)".

Or both.

I've definitely seen a few codebases, that are security-critical to a large-ish 
chunk of humanity, that have like 20 competing ad-hoc reference-counting 
schemes. Mostly in plain C where MallocChecker would otherwise be very useful, 
if it wasn't ruined by false positives of a similar nature from all these 
reference counting schemes. These schemes probably don't make sense to hardcode 
in the compiler because there's no inheritance so you'd need to hardcode every 
struct that follows one of those schemes, not just every scheme in and of 
itself.

> I considered modeling the atomic member functions, until the first place they 
> escape, after which we can no longer ever reason about them. This made me to 
> look into suppressions.

Yeah I think it's not sufficient to model the atomics. It's a good idea to 
model them anyway, but even when atomics aren't used (eg. the custom smart 
pointer never needed to be thread-safe), the fundamental problem is that we 
still don't know the *initial* value of the reference count. In particular we 
can't refute the possibility that the original value was like `-1` or something.

Modeling atomics would make it better when the smart pointer was first created 
*during* analysis, so we actually know that the initial value is 0. Then by 
carefully tracking it we might arrive to the correct conclusion. But when the 
initial value is symbolic we're fundamentally powerless without the 
domain-specific knowledge that the value could not have been `-1`.

It's possible that this domain-specific knowledge could be transferred to us 
with the help of well-placed assertions across the smart pointer class. But an 
attribute could achieve the same in a more direct manner.

> I believe, that such a heuristic with dominators could work for the rest of 
> the checkers - where the bug report is attached to some given statement - and 
> not delayed until some other future statement like in the leak checker.

Yes, I think there has to be something good in this area, even though I haven't 
got any good specific solutions in my head. @Szelethus did a lot of initial 
experimentation in this area, which resulted in improved condition tracking, 
but we haven't used it for false positive suppression yet. Once we research 
this deeper and make it more principled, maybe we should really start doing 
that.

It may also be a good idea to not look at the bug report in isolation, but 
consider the entire space of execution paths on which it was found. If the 
space isn't "vast" enough to convince us that we aren't stepping onto an 
#61669, suppress the warning.

https://github.com/llvm/llvm-project/pull/90918
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

Reply via email to