> On 17-Jul-2021, at 2:57 AM, David Malcolm <dmalc...@redhat.com> wrote:
>
> On Fri, 2021-07-16 at 21:04 +0530, Ankur Saini wrote:
>>
>>
>>> On 15-Jul-2021, at 4:53 AM, David Malcolm <dmalc...@redhat.com>
>>> wrote:
>>>
>>> On Wed, 2021-07-14 at 22:41 +0530, Ankur Saini wrote:
>>>
>>>
>
> [...snip...]
>
>>>
>>>>
>>>> 2. ( pr100546.c <
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100546 <
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100546>>)
>>>> ```
>>>> #include <stdio.h>
>>>> #include <cstdlib.h>
>>>>
>>>> static void noReturn(const char *str) __attribute__((noreturn));
>>>> static void noReturn(const char *str) {
>>>> printf("%s\n", str);
>>>> exit(1);
>>>> }
>>>>
>>>> void (*noReturnPtr)(const char *str) = &noReturn;
>>>>
>>>> int main(int argc, char **argv) {
>>>> char *str = 0;
>>>> if (!str)
>>>> noReturnPtr(__FILE__);
>>>> return printf("%c\n", *str);
>>>> }
>>>> ```
>>>> (godbolt link <https://godbolt.org/z/aWfW51se3 <
>>>> https://godbolt.org/z/aWfW51se3>>)
>>>>
>>>> - But at the time of testing ( command used
>>>> was `make check-gcc RUNTESTFLAGS="-v -v
>>>> analyzer.exp=pr100546.c"`),
>>>> both of
>>>> them failed unexpectedly with Segmentation fault at the call
>>>>
>>>> - From further inspection, I found out that this is due
>>>> "-fanalyzer-call-summaries" option, which looks like activats
>>>> call
>>>> summaries
>>>>
>>>> - I would look into this in more details ( with gdb ) tomorrow,
>>>> right
>>>> now
>>>> my guess is that this is either due too the changes I did in
>>>> state-
>>>> purge.cc <http://purge.cc/>
>>>> or is a call-summary related problem ( I remember it not being
>>>> perfetly implemented right now).
>>>
>>> I'm not proud of the call summary code, so that may well be the
>>> problem.
>>>
>>> Are you able to use gdb on the analyzer? It ought to be fairly
>>> painless to identify where a segfault is happening, so let me know if
>>> you're running into any problems with that.
>>
>> Yes, I used gdb on the analyzer to go into details and looks like I was
>> correct, the program was crashing in “analysis_plan::use_summary_p ()”
>> on line 114 ( const cgraph_node *callee = edge->callee; ) where program
>> was trying to access callgraph edge which didn’t exist .
>>
>> I fixed it by simply making analyzer abort using call summaries in
>> absence of callgraph edge.
>>
>> File: {src-dir}/gcc/analyzer/analysis-plan.cc
>>
>> 105: bool
>> 106: analysis_plan::use_summary_p (const cgraph_edge *edge) const
>> 107: {
>> 108: /* Don't use call summaries if -fno-analyzer-call-summaries. */
>> 109: if (!flag_analyzer_call_summaries)
>> 110: return false;
>> 111:
>> 112: /* Don't use call summaries if there is no callgraph edge */
>> 113: if(!edge || !edge->callee)
>> 114: return false;
>>
>> and now the tests are passing successfully. ( both manually and via
>> DejaGnu ).
>
> Great.
>
>>
>> I have attached a sample patch of work done till now with this mail for
>> review ( I haven’t sent this one to the patches list as it’s change log
>> was not complete for now ).
>>
>> P.S. I have also sent another mail (
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575396.html <
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575396.html> ) to
>> patches list with the previous call-string patch and this time it
>> popped up in my inbox as it should, did you also received it now ?
>
> I can see it in the archive URL, but for some reason it's not showing
> up in my inbox. Bother. Please can you try resending it directly to
> me?
Ok, I have sent the call-string patch directly to you. I have actually sent 2
mails ( from different mail ids ) to check if it’s the id which is causing the
issue or the contents of the mail itself.
>
> Putting email issues to one side, the patch you linked to above looks
> good. To what extent has it been tested? If it bootstraps and passes
> the test suite, it's ready for trunk.
It bootstrapped successfully on a couple of the x86_64 machines ( on gcc farm )
And regress testing is underway.
>
> Note that over the last couple of days I pushed my "use of
> uninitialized values" detection work to trunk (aka master), along with
> various other changes, so it's worth pulling master and rebasing on top
> of that before testing. I *think* we've been touching different parts
> of the analyzer code, but there's a chance you might have to resolve
> some merge conflicts.
I have been constantly updating my branch as soon as I see a commit on analyzer
( on patches list ) to avoid any conflicts.
Till now I haven’t encountered any.
>
> As for the patch you attached to this email
> "[PATCH] analyzer: make analyer detect calls via function pointers"
> here's an initial review:
thanks for the review
>
>> diff --git a/gcc/analyzer/analysis-plan.cc b/gcc/analyzer/analysis-plan.cc
>> index 7dfc48e9c3e..1c7e4d2cc84 100644
>> --- a/gcc/analyzer/analysis-plan.cc
>> +++ b/gcc/analyzer/analysis-plan.cc
>> @@ -109,6 +109,10 @@ analysis_plan::use_summary_p (const cgraph_edge *edge)
>> const
>> if (!flag_analyzer_call_summaries)
>> return false;
>>
>> + /* Don't use call summaries if there is no callgraph edge */
>> + if(!edge || !edge->callee)
>> + return false;
>
> Is it possible for a cgraph_edge to have a NULL callee? (I don't think
> so, but I could be wrong)
Maybe not, I did it because in function supergraph_call_edge (), there was a
similar case where it was pointed out that this might be the case when call is
happening via function pointers.
File: {source_dir}/gcc/analyzer/supergraph.cc
72: /* Get the cgraph_edge, but only if there's an underlying function body. */
73:
74: cgraph_edge *
75: supergraph_call_edge (function *fun, gimple *stmt)
76: {
77: gcall *call = dyn_cast<gcall *> (stmt);
78: if (!call)
79: return NULL;
80: cgraph_edge *edge = cgraph_node::get (fun->decl)->get_edge (stmt);
81: if (!edge)
82: return NULL;
83: if (!edge->callee)
84: return NULL; /* e.g. for a function pointer. */
85: if (!get_ultimate_function_for_cgraph_edge (edge))
86: return NULL;
87: return edge;
88: }
>
> Nit: missing space between "if" and open-paren
Fixed
>
>> diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
>> index 7662a7f7bab..f45a614c0ab 100644
>> --- a/gcc/analyzer/engine.cc
>> +++ b/gcc/analyzer/engine.cc
>> @@ -3170,10 +3170,13 @@ exploded_graph::process_node (exploded_node *node)
>> break;
>> case PK_AFTER_SUPERNODE:
>> {
>> + bool found_a_superedge = false;
>> + bool is_an_exit_block = false;
>> /* If this is an EXIT BB, detect leaks, and potentially
>> create a function summary. */
>> if (point.get_supernode ()->return_p ())
>> {
>> + is_an_exit_block = true;
>> node->detect_leaks (*this);
>> if (flag_analyzer_call_summaries
>> && point.get_call_string ().empty_p ())
>> @@ -3201,6 +3204,7 @@ exploded_graph::process_node (exploded_node *node)
>> superedge *succ;
>> FOR_EACH_VEC_ELT (point.get_supernode ()->m_succs, i, succ)
>> {
>> + found_a_superedge = true;
>> if (logger)
>> logger->log ("considering SN: %i -> SN: %i",
>> succ->m_src->m_index, succ->m_dest->m_index);
>> @@ -3210,19 +3214,100 @@ exploded_graph::process_node (exploded_node *node)
>> point.get_call_string ());
>> program_state next_state (state);
>> uncertainty_t uncertainty;
>> +
>> + /* Check if now the analyzer know about the call via
>> + function pointer or not. */
>> + if (succ->m_kind == SUPEREDGE_INTRAPROCEDURAL_CALL &&
>> + !(succ->get_any_callgraph_edge()))
>
> Some formatting nits:
> - on a multiline conditional, our convention is to put the "&&" at the
> start of a line, rather than the end
> - missing space beween function name and open-paren in call.
>
> Hence this should be formatted as:
>
> if (succ->m_kind == SUPEREDGE_INTRAPROCEDURAL_CALL
> && !(succ->get_any_callgraph_edge ()))
Fixed
>
>
>> + {
>> + const program_point *this_point = &node->get_point();
>> + const program_state *this_state = &node->get_state ();
>> + const gcall *call
>> + = this_point->get_supernode ()->get_final_call ();
>> +
>> + impl_region_model_context ctxt (*this,
>> + node,
>> + this_state,
>> + &next_state,
>> + &uncertainty,
>> + this_point->get_stmt());
>> +
>> + region_model *model = this_state->m_region_model;
>> + tree fn_decl = model->get_fndecl_for_call(call,&ctxt);
>> + function *fun = DECL_STRUCT_FUNCTION(fn_decl);
>> + if(fun)
>> + {
>> + const supergraph *sg = &(this->get_supergraph());
>
> Might as well turn this into
>
>> + const supergraph &sg = get_supergraph ();
>
> and update various "sg->" into "sg." below.
Done
>
>> + supernode * sn_entry = sg->get_node_for_function_entry
>> (fun);
>> + supernode * sn_exit = sg->get_node_for_function_exit
>> (fun);
>> +
>> + program_point new_point
>> + = program_point::before_supernode (sn_entry,
>> + NULL,
>> +
>> point.get_call_string ());
>> +
>> + new_point.push_to_call_stack (sn_exit,
>> + next_point.get_supernode());
>> +
>> + next_state.push_call(*this, node, call, &uncertainty);
>
> Some logging here could be helpful.
Will do
>
>
> [...]
>
>> +
>> + /* Return from the calls which doesn't have a return superedge.
>> + Such case occurs when GCC's middle end didn't knew which
>> function to
>> + call but analyzer did */
>> + if((is_an_exit_block && !found_a_superedge) &&
>> + (!point.get_call_string().empty_p()))
>
> Similar formatting nits as above; this should look something like:
>
> if ((is_an_exit_block && !found_a_superedge)
> && !point.get_call_string ().empty_p ())
Fixed
>
>
> [...snip...]
>
>>
>> +/* do exatly what region_model::update_for_return_superedge() do
>> + but get the call info from CALL_STMT instead from a suerpedge and
>> + is availabe publicically */
>> +void
>> +region_model::update_for_return_gcall (const gcall *call_stmt,
>> + region_model_context *ctxt)
>> +{
>> + /* Get the region for the result of the call, within the caller frame. */
>> + const region *result_dst_reg = NULL;
>> + tree lhs = gimple_call_lhs (call_stmt);
>> + if (lhs)
>> + {
>> + /* Normally we access the top-level frame, which is:
>> + path_var (expr, get_stack_depth () - 1)
>> + whereas here we need the caller frame, hence "- 2" here. */
>> + gcc_assert (get_stack_depth () >= 2);
>> + result_dst_reg = get_lvalue (path_var (lhs, get_stack_depth () - 2),
>> + ctxt);
>> + }
>> +
>> + pop_frame (result_dst_reg, NULL, ctxt);
>> +}
>
> I *think* this can be consolidated with update_for_return_superedge.
> Maybe instead, just rename that to update_for_return, and update the
> callsite of update_for_return_superedge to get the call_stmt there.
>
> That way we avoid copy&paste repetition of code.
>
> I think something similar can be done for
> region_model::update_for_gcall and update_for_call_superedge.
Done
>
> [...snip...]
>
>> diff --git a/gcc/analyzer/supergraph.cc b/gcc/analyzer/supergraph.cc
>> index 8611d0f8689..ace9e7b128a 100644
>> --- a/gcc/analyzer/supergraph.cc
>> +++ b/gcc/analyzer/supergraph.cc
>> @@ -183,11 +183,34 @@ supergraph::supergraph (logger *logger)
>> m_stmt_to_node_t.put (stmt, node_for_stmts);
>> m_stmt_uids.make_uid_unique (stmt);
>> if (cgraph_edge *edge = supergraph_call_edge (fun, stmt))
>> - {
>> - m_cgraph_edge_to_caller_prev_node.put(edge, node_for_stmts);
>> - node_for_stmts = add_node (fun, bb, as_a <gcall *> (stmt),
>> NULL);
>> - m_cgraph_edge_to_caller_next_node.put (edge, node_for_stmts);
>> - }
>> + {
>> + m_cgraph_edge_to_caller_prev_node.put(edge,
>> node_for_stmts);
>> + node_for_stmts = add_node (fun, bb, as_a <gcall *>
>> (stmt),
>> + NULL);
>> + m_cgraph_edge_to_caller_next_node.put (edge,
>> node_for_stmts);
>> + }
>> + else
>> + {
>> + // maybe call is via a function pointer
>> + gcall *call = dyn_cast<gcall *> (stmt);
>> + if (call)
>
> You can combine the if and the decl/dyn_cast into one line:
>
> if (gcall *call = dyn_cast<gcall *> (stmt))
>
> which saves a little vertical space and (slightly) limits the scope of
> "call”.
Done
>
>> + {
>> + cgraph_edge *edge
>> + = cgraph_node::get (fun->decl)->get_edge (stmt);
>> + if (!edge || !edge->callee)
>> + {
>> + supernode *old_node_for_stmts = node_for_stmts;
>> + node_for_stmts = add_node (fun, bb, call, NULL);
>> +
>> + superedge *sedge
>> + = new callgraph_superedge (old_node_for_stmts,
>> + node_for_stmts,
>> +
>> SUPEREDGE_INTRAPROCEDURAL_CALL,
>> + NULL);
>> + add_edge (sedge);
>> + }
>> + }
>> + }
>> }
>
> [...snip...]
>
>> diff --git a/gcc/testsuite/gcc.dg/analyzer/function-ptr-4.c
>> b/gcc/testsuite/gcc.dg/analyzer/function-ptr-4.c
>> new file mode 100644
>> index 00000000000..1054ab4f240
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/analyzer/function-ptr-4.c
>> @@ -0,0 +1,53 @@
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +
>> +void fun(int *int_ptr)
>> +{
>> + free(int_ptr); /* { dg-warning "double-'free' of 'int_ptr'" } */
>> +}
>> +
>> +void single_call()
>> +{
>> + int *int_ptr = (int*)malloc(sizeof(int));
>> + void (*fun_ptr)(int *) = &fun;
>> + (*fun_ptr)(int_ptr);
>> +}
>> +
>> +void double_call()
>> +{
>> + int *int_ptr = (int*)malloc(sizeof(int));
>> + void (*fun_ptr)(int *) = &fun;
>> + (*fun_ptr)(int_ptr);
>> + (*fun_ptr)(int_ptr);
>> +}
>> +
>> +/*{ dg-begin-multiline-output "" }
>
> Normally the test suite injects "-fdiagnostics-plain-output" into the
> options, which turns off the source code printing and ASCII art for
> interprocedural paths. If you want to test them via dg-begin-
> multiline-output you'll need to have a look at the directives at the
> top of one of the existing tests that uses dg-begin-multiline-output.
>
>> + 6 | free(int_ptr);
>> + | ^~~~~~~~~~~~~
>> + 'double_call': events 1-2
>> + |
>> + | 16 | void double_call()
>> + | | ^~~~~~~~~~~
>> + | | |
>> + | | (1) entry to 'double_call'
>> + | 17 | {
>> + | 18 | int *int_ptr = (int*)malloc(sizeof(int));
>> + | | ~~~~~~~~~~~~~~~~~~~
>> + | | |
>> + | | (2) allocated here
>> + |
>> + +--> 'fun': events 3-6
>> + |
>> + | 4 | void fun(int *int_ptr)
>> + | | ^~~
>> + | | |
>> + | | (3) entry to ‘fun’
>> + | | (5) entry to ‘fun’
>> + | 5 | {
>> + | 6 | free(int_ptr);
>> + | | ~~~~~~~~~~~~~
>> + | | |
>> + | | (4) first 'free' here
>> + | | (6) second 'free' here; first 'free' was at (4)
>> + |
>> +*/
>
> and normally there would be a terminating:
>
> { dg-end-multiline-output "" } */
>
> In any case, the output above is missing some events: I think ideally
> it would show the calls from *fun_ptr to fun and the returns, giving
> something like the following (which I mocked up by hand):
>
> 'double_call': events 1-3
> |
> | 16 | void double_call()
> | | ^~~~~~~~~~~
> | | |
> | | (1) entry to 'double_call'
> | 17 | {
> | 18 | int *int_ptr = (int*)malloc(sizeof(int));
> | | ~~~~~~~~~~~~~~~~~~~
> | | |
> | | (2) allocated here
> | ... |
> | 19 | (*fun_ptr)(int_ptr);
> | | ^~~~~~~~~~~~~~~~~~~
> | | |
> | | (3) calling 'fun' from 'double-call'
> |
> +--> 'fun': events 3-6
> |
> | 4 | void fun(int *int_ptr)
> | | ^~~
> | | |
> | | (4) entry to ‘fun’
> | 5 | {
> | 6 | free(int_ptr);
> | | ~~~~~~~~~~~~~
> | | |
> | | (5) first 'free' here
> |
> <------+
> |
> 'double_call': events 6-7
> |
> | 19 | (*fun_ptr)(int_ptr);
> | | ^~~~~~~~~~~~~~~~~~~
> | | |
> | | (6) returning to 'double-call' from 'fun'
> | 20 | (*fun_ptr)(int_ptr);
> | | ^~~~~~~~~~~~~~~~~~~
> | | |
> | | (7) calling 'fun' from 'double-call'
> |
> +--> 'fun': events 8-9
> |
> | 4 | void fun(int *int_ptr)
> | | ^~~
> | | |
> | | (8) entry to ‘fun’
> | 5 | {
> | 6 | free(int_ptr);
> | | ~~~~~~~~~~~~~
> | | |
> | | (9) second 'free' here; first 'free' was at (5)
>
> The events are created in diagnostic-manager.cc
>
> Am I right in thinking that there's a interprocedural superedge for the
> dynamically-discovered calls?
No there isn’t, such calls will only have an exploded edge and no
interprocedural superedge
>
> diagnostic_manager::add_events_for_superedge creates events for calls
> and returns, so maybe you just need to update the case
> SUPEREDGE_INTRAPROCEDURAL_CALL there, to do something for the
> "dynamically discovered edge" cases (compare it with the other cases in
> that function). You might need to update the existing call_event and
> return_event subclasses slightly (see checker-path.cc/h)
As we already have exploded edges representing the call, my idea was to add
event for such cases via custom edge info ( similar to what we have for longjmp
case ) instead of creating a special case in
diagnostic_manager::add_events_for_superedge ().
>
> Ideally, event (7) above would read
> "passing freed pointer 'int_ptr' in call to 'fun from 'double_call"
> but that's advanced material :)
>
> [...snip...]
>
> Hope this makes sense and is constructive
> Dave
Thanks
- Ankur