On Tue, 2021-04-06 at 17:56 +0530, Ankur Saini wrote: Hi Ankur.
Various replies inline below throughout. > > On 30-Mar-2021, at 7:27 PM, David Malcolm <dmalc...@redhat.com> > > wrote: > > > > This gave rise to some questions > > > > > > 1. why does the analyzer make exceptions with the main() function > > > ? > > > > The user's attention is important - we don't want to spam the user > > with > > unnecessary reports if we can help it. > > make sense. > > —— > > After fiddling around with a lot of C codes, I switched to C++ > programs in-order to find how exactly the analyzer doesn’t > understand exception handling and more interestingly calls to virtual > functions ( which I am thinking to work on this summer ). Sounds like a good focus. > It was comparatively harder to find such an example where it would > fail as looks like gcc do amazingly nice job at devirtualising the > function calls ( even with -O0 option ) but finally after a lot of > attempts and reading online about devirtualisation, I found this > particular example where the analyzer generates a false positive > > #include <cstdlib> > > struct A > { > virtual int foo (void) > { > return 42; > } > }; > > struct B: public A > { > int *ptr; > void alloc () > { > ptr = (int*)malloc(sizeof(int)); > } > int foo (void) > { > free(ptr); > return 0; > } > }; > > int test() > { > struct B b, *bptr=&b; > b.alloc(); > bptr->foo(); > return bptr->foo(); > } > > int main() > { > test(); > } > > working link of the above code (https://godbolt.org/z/n17WK4MxG < > https://godbolt.org/z/n17WK4MxG>) > > here as the analyzer doesn’t understand the call to virtual function, > wasn’t able to locate a double free in the program which was found at > runtime. Good work. > so I went through it’s exploded graph to see how exactly is this > being processed. And from the looks of things the anayzer doesn’t > understood the function call which according to me was the following > part in gimple representation : > > 1 = bptr_8->D.3795._vptr.A; > _2 = *_1; > OBJ_TYPE_REF(_2;(struct B)bptr_8->0) (bptr_8) > > after scanning the source-code a bit i found out that such calls were > being processed by "region_model::handle_unrecognized_call()” where > it just keeps track of reachable states from that node. Again, good detective work. Right - the analyzer "sees" the jump through a function pointer, and it doesn't yet have any special knowledge about what that function pointer might be. Given that we know we have a B, we ought to be able to assume that B::B initializes the vtable of the instance, and make assumptions about what the values in that vtable are. The analyzer doesn't have any of this special-case knowledge yet - hence bug 97114. > —— > > Questions > > 1. The link to the bug tracker for vfunc() [ > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97114 < > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97114> ] says that for > vfuncs() to be understood by anayzer, it ought to be able to > devirtualize calls, but is it possible to devirtualise all the calls > ? what if it is random or depends on user input ? It's not possible for the general case. Consider that there could be other subclasses of A that we don't about in this translation unit. But for this case, -fdump-ipa-analyzer=stderr shows this gimple at the start of "test": B::B (&b); bptr_8 = &b; B::alloc (&b); _1 = bptr_8->D.3290._vptr.A; _2 = *_1; OBJ_TYPE_REF(_2;(struct B)bptr_8->0) (bptr_8); As noted above, I think that we can add enough logic to the analyzer to so that it "knows" that B::B should stores a vtable ptr, and then when that vtable is accessed, it should know what functions are being pointed to. I think it would mean adding a new region subclass in the analyzer, where an instance represents the vtable for a given class in the user's code. For cases where we have a "base *" and don't know which subclass the instance is, we could potentially have the analyzer speculate about the subclasses it knows about, adding exploded edges to speculated calls for the various subclasses. I don't yet know if this is a good idea, but it seems worth experimenting with. > 2. Even though analyzer didn’t understood calls to virtual function, > it didn’t gave warning about a memory leak either which according to > it, should exist if the functions were never called to deallocate the > pointer ( after exploded node 152 and 118, the state of malloc > changes automatically ) ? If the analyzer "sees" a call to an unknown function (either through a unknown function pointer, or to a function it doesn't have the body of), it acts conservatively by resetting all of the state-machine state for the values that are reachable by the function call, to avoid false leak reports. Hence it resets the state of B's ptr from the "unchecked" allocation state back to the "start" state. > sorry if I am asking a lot of questions. It's fine - part of the point of GSoC is to learn, and by asking these questions on the mailing list you're getting me to explain things here, where everyone else can see them too, which helps improve the "bus factor" of the analyzer. Also, you've clearly taken the time to dig into the code and create examples, and you are asking good questions - thanks! Hope the above makes sense and is helpful Dave