On Tue, 2021-03-30 at 16:36 +0530, Ankur Saini wrote: > hello sir > > in my quest of finding a bug ( which ended up being a feature ) along > with it’s source in the analyzer, I tested the code on these 2 code > snippets and here’s how I went towards it > > (1) > int main() > { > int *ptr = (int *)malloc(sizeof(int)); > return 0; > } > > link to running example (https://godbolt.org/z/1jGW1qYez < > https://godbolt.org/z/1jGW1qYez>) > > (2) > int definaltly_main() > { > int *ptr = (int *)malloc(sizeof(int)); > return 0; > } > > link to running example (https://godbolt.org/z/bzjMYex4M < > https://godbolt.org/z/bzjMYex4M>) > > > where on second snipper analyzer is warning us about the leak as it > should be, but in the first one it isn’t. > > and as the gimple representation of both looks exactly the same apart > from function name, which made me think that either intentionally or > unintentionally, analyzer handles case of main() differently than any > other function.
Correct - the analyzer special-cases "main". Specifically, in impl_region_model_context::on_state_leak, there's this code: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/analyzer/engine.cc;h=d7866b5598b4fcb791ec6ff511dde9b7615e7794;hb=HEAD#l624 624 /* Don't complain about leaks when returning from "main". */ 625 if (m_enode_for_diag->get_supernode () 626 && m_enode_for_diag->get_supernode ()->return_p ()) 627 { 628 tree fndecl = m_enode_for_diag->get_function ()->decl; 629 if (id_equal (DECL_NAME (fndecl), "main")) 630 { 631 if (logger) 632 logger->log ("not reporting leak from main"); 633 return; 634 } 635 } on the grounds that for the resources that the analyzer currently tracks, it doesn't matter if they aren't cleaned up as the process exits, so we don't bother the user with a report about them. > so while looking at it’s exploded graphs I found out that the only 2 > differences in them > > 1. There is one less exploded node(after node E-8) created in first > one ( I earlier thought state merging or state pruning is taking > place here but it isn’t because the results are not affected even > after disabling those using `-fno-analyzer-state-purge` and `-fno- > analyzer-state-merge` options ) Well spotted. I see that too. For (1) EN 8, has 2 stmts, the label and the return, but for (2), it splits them with EN: 8 having just the label, and the return split into EN 9. I was surprised by this and did some digging in gdb, for both (1) and (2). The reason seems to be rather arbitrary; specifically in (2), stmt_requires_new_enode_p is returning true due to hitting this case: 2894 /* If we had a PREV_STMT with an unknown location, and this stmt 2895 has a known location, then if a state change happens here, it 2896 could be consolidated into PREV_STMT, giving us an event with 2897 no location. Ensure that STMT gets its own exploded_node to 2898 avoid this. */ 2899 if (get_pure_location (prev_stmt->location) == UNKNOWN_LOCATION 2900 && get_pure_location (stmt->location) != UNKNOWN_LOCATION) 2901 return true; and presumably it isn't hitting this for (1). Specifically, where "stmt" is the return stmt and "prev_stmt" is the label statement. for (1): (gdb) p /x stmt->location $1 = 0x0 (gdb) p /x prev_stmt->location $2 = 0x0 whereas for (2): (gdb) p /x stmt->location $5 = 0x800000f2 (gdb) p /x prev_stmt->location $6 = 0x0 So with (1) the return stmt has UNKNOWN_LOCATION: (gdb) call inform (stmt->location, "return stmt in (1)") In function ‘main’: cc1: note: return stmt in (1) whereas for (2) the return stmt has a source location: (gdb) call inform (stmt->location, "return stmt in (2)") In function ‘definaltly_main’: /tmp/2.c:6:12: note: return stmt in (2) 6 | return 0; | ^ This slight difference in the recorded location of the return stmt for the "main" vs non-"main" case affects the splitting of the nodes. Athough a curiosity, I don't think this is significant. (In theory one could use a hardware watchpoint on stmt->location to track this discrepancy down further, but I don't think it's important enough to bother). > 2. no diagnosis for malloc leak happening at the end of first one > even though there exist a pointer in unchecked state at the end ( > according to the malloc state machine ) Correct - the analyzer specialcases "main" and ignores it, as noted above. > In quest to find the cause I started navigating through the source > code of the analyser starting off with the run_checkers() function in > engine.cc which looks like the entry point of the analyser ( found > via the commit history of the analyzer ). But finally it ended at > `impl_region_model_context::on_state_leak()` function where I found > out that analyzer is intentionally skipping the leak report when it > is in main. Sounds like you successfully tracked this down. > This gave rise to some questions > > 1. why does the analyzer make exceptions with the main() function ? The user's attention is important - we don't want to spam the user with unnecessary reports if we can help it. > 2. even if it is not complaining about the leak then this still > doesn’t explain the reason for one less exploded node in this case of > main() function. As above, it seems to be a side-effect of differences in source locations. > thanks > > - Ankur Hope this is helpful Dave