On Tue, Aug 29, 2023 at 12:32 AM Eric Feng <ef2...@columbia.edu> wrote:
>
> Hi Dave,
>
> Thanks for the feedback. I've addressed the changes you mentioned in
> addition to adding more test cases. I've also taken this chance to
> split the test files according to known function subclasses, as you previously
> suggested. Since there were also some changes to the core analyzer, I've done
> a
> bootstrap and regtested the patch as well. Does it look OK for trunk?
Apologies — I forgot to mention that bootstrap and regtest was done on
aarch64-unknown-linux-gnu.
>
> Best,
> Eric
>
> ---
>
> This patch introduces initial support for reference count checking of
> PyObjects in relation to the Python/C API for the CPython plugin.
> Additionally, the core analyzer underwent several modifications to
> accommodate this feature. These include:
>
> - Introducing support for callbacks at the end of
> region_model::pop_frame. This is our current point of validation for
> the reference count of PyObjects.
> - An added optional custom stmt_finder parameter to
> region_model_context::warn. This aids in emitting a diagnostic
> concerning the reference count, especially when the stmt_finder is
> NULL, which is currently the case during region_model::pop_frame.
>
> The current diagnostic we emit relating to the reference count
> appears as follows:
>
> rc3.c:23:10: warning: expected <variable name belonging to m_base_region> to
> have reference count: ‘1’ but ob_refcnt field is: ‘2’
> 23 | return list;
> | ^~~~
> ‘create_py_object’: events 1-4
> |
> | 4 | PyObject* item = PyLong_FromLong(3);
> | | ^~~~~~~~~~~~~~~~~~
> | | |
> | | (1) when ‘PyLong_FromLong’ succeeds
> | 5 | PyObject* list = PyList_New(1);
> | | ~~~~~~~~~~~~~
> | | |
> | | (2) when ‘PyList_New’ succeeds
> |......
> | 14 | PyList_Append(list, item);
> | | ~~~~~~~~~~~~~~~~~~~~~~~~~
> | | |
> | | (3) when ‘PyList_Append’ succeeds, moving buffer
> |......
> | 23 | return list;
> | | ~~~~
> | | |
> | | (4) here
> |
>
> This is a WIP in several ways:
> - Enhancing the diagnostic for better clarity. For instance, users should
> expect to see the variable name 'item' instead of the placeholder in the
> diagnostic above.
> - Currently, functions returning PyObject * are assumed to always produce
> a new reference.
> - The validation of reference count is only for PyObjects created within a
> function body. Verifying reference counts for PyObjects passed as
> parameters is not supported in this patch.
>
> gcc/analyzer/ChangeLog:
> PR analyzer/107646
> * engine.cc (impl_region_model_context::warn): New optional parameter.
> * exploded-graph.h (class impl_region_model_context): Likewise.
> * region-model.cc (region_model::pop_frame): New callback feature for
> * region_model::pop_frame.
> * region-model.h (struct append_regions_cb_data): Likewise.
> (class region_model): Likewise.
> (class region_model_context): New optional parameter.
> (class region_model_context_decorator): Likewise.
>
> gcc/testsuite/ChangeLog:
> PR analyzer/107646
> * gcc.dg/plugin/analyzer_cpython_plugin.c: Implements reference count
> * checking for PyObjects.
> * gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
> * gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: ...here (and
> * added more tests).
> * gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
> * gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here (and added
> * more tests).
> * gcc.dg/plugin/plugin.exp: New tests.
> * gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
> * gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New test.
> * gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New test.
>
> Signed-off-by: Eric Feng <ef2...@columbia.edu>
>
> ---
> gcc/analyzer/engine.cc | 8 +-
> gcc/analyzer/exploded-graph.h | 4 +-
> gcc/analyzer/region-model.cc | 3 +
> gcc/analyzer/region-model.h | 48 ++-
> .../gcc.dg/plugin/analyzer_cpython_plugin.c | 376 +++++++++++++++++-
> ....c => cpython-plugin-test-PyList_Append.c} | 56 +--
> .../plugin/cpython-plugin-test-PyList_New.c | 38 ++
> .../cpython-plugin-test-PyLong_FromLong.c | 38 ++
> ...st-1.c => cpython-plugin-test-no-plugin.c} | 0
> .../cpython-plugin-test-refcnt-checking.c | 78 ++++
> gcc/testsuite/gcc.dg/plugin/plugin.exp | 5 +-
> 11 files changed, 612 insertions(+), 42 deletions(-)
> rename gcc/testsuite/gcc.dg/plugin/{cpython-plugin-test-2.c =>
> cpython-plugin-test-PyList_Append.c} (64%)
> create mode 100644
> gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
> create mode 100644
> gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
> rename gcc/testsuite/gcc.dg/plugin/{cpython-plugin-test-1.c =>
> cpython-plugin-test-no-plugin.c} (100%)
> create mode 100644
> gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c
>
> diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
> index a1908cdb364..736a41ecdaf 100644
> --- a/gcc/analyzer/engine.cc
> +++ b/gcc/analyzer/engine.cc
> @@ -115,10 +115,12 @@ impl_region_model_context (program_state *state,
> }
>
> bool
> -impl_region_model_context::warn (std::unique_ptr<pending_diagnostic> d)
> +impl_region_model_context::warn (std::unique_ptr<pending_diagnostic> d,
> + const stmt_finder *custom_finder)
> {
> LOG_FUNC (get_logger ());
> - if (m_stmt == NULL && m_stmt_finder == NULL)
> + auto curr_stmt_finder = custom_finder ? custom_finder : m_stmt_finder;
> + if (m_stmt == NULL && curr_stmt_finder == NULL)
> {
> if (get_logger ())
> get_logger ()->log ("rejecting diagnostic: no stmt");
> @@ -129,7 +131,7 @@ impl_region_model_context::warn
> (std::unique_ptr<pending_diagnostic> d)
> bool terminate_path = d->terminate_path_p ();
> if (m_eg->get_diagnostic_manager ().add_diagnostic
> (m_enode_for_diag, m_enode_for_diag->get_supernode (),
> - m_stmt, m_stmt_finder, std::move (d)))
> + m_stmt, curr_stmt_finder, std::move (d)))
> {
> if (m_path_ctxt
> && terminate_path
> diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h
> index 5a7ab645bfe..6e9a5ef58c7 100644
> --- a/gcc/analyzer/exploded-graph.h
> +++ b/gcc/analyzer/exploded-graph.h
> @@ -56,7 +56,8 @@ class impl_region_model_context : public
> region_model_context
> uncertainty_t *uncertainty,
> logger *logger = NULL);
>
> - bool warn (std::unique_ptr<pending_diagnostic> d) final override;
> + bool warn (std::unique_ptr<pending_diagnostic> d,
> + const stmt_finder *custom_finder = NULL) final override;
> void add_note (std::unique_ptr<pending_note> pn) final override;
> void add_event (std::unique_ptr<checker_event> event) final override;
> void on_svalue_leak (const svalue *) override;
> @@ -107,6 +108,7 @@ class impl_region_model_context : public
> region_model_context
> std::unique_ptr<sm_context> *out_sm_context)
> override;
>
> const gimple *get_stmt () const override { return m_stmt; }
> + const exploded_graph *get_eg () const override { return m_eg; }
>
> exploded_graph *m_eg;
> log_user m_logger;
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
> index 4f31a6dcf0f..eb4f976b83a 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -82,6 +82,8 @@ along with GCC; see the file COPYING3. If not see
>
> namespace ana {
>
> +auto_vec<pop_frame_callback> region_model::pop_frame_callbacks;
> +
> /* Dump T to PP in language-independent form, for debugging/logging/dumping
> purposes. */
>
> @@ -5422,6 +5424,7 @@ region_model::pop_frame (tree result_lvalue,
> }
>
> unbind_region_and_descendents (frame_reg,POISON_KIND_POPPED_STACK);
> + notify_on_pop_frame (this, retval, ctxt);
> }
>
> /* Get the number of frames in this region_model's stack. */
> diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
> index 10b2a59e787..440ea6d828d 100644
> --- a/gcc/analyzer/region-model.h
> +++ b/gcc/analyzer/region-model.h
> @@ -236,6 +236,10 @@ public:
>
> struct append_regions_cb_data;
>
> +typedef void (*pop_frame_callback) (const region_model *model,
> + const svalue *retval,
> + region_model_context *ctxt);
> +
> /* A region_model encapsulates a representation of the state of memory, with
> a tree of regions, along with their associated values.
> The representation is graph-like because values can be pointers to
> @@ -532,6 +536,20 @@ class region_model
> get_builtin_kf (const gcall *call,
> region_model_context *ctxt = NULL) const;
>
> + static void
> + register_pop_frame_callback (const pop_frame_callback &callback)
> + {
> + pop_frame_callbacks.safe_push (callback);
> + }
> +
> + static void
> + notify_on_pop_frame (const region_model *model, const svalue *retval,
> + region_model_context *ctxt)
> + {
> + for (auto &callback : pop_frame_callbacks)
> + callback (model, retval, ctxt);
> + }
> +
> private:
> const region *get_lvalue_1 (path_var pv, region_model_context *ctxt) const;
> const svalue *get_rvalue_1 (path_var pv, region_model_context *ctxt) const;
> @@ -621,6 +639,7 @@ private:
> tree callee_fndecl,
> region_model_context *ctxt)
> const;
>
> + static auto_vec<pop_frame_callback> pop_frame_callbacks;
> /* Storing this here to avoid passing it around everywhere. */
> region_model_manager *const m_mgr;
>
> @@ -649,8 +668,10 @@ class region_model_context
> {
> public:
> /* Hook for clients to store pending diagnostics.
> - Return true if the diagnostic was stored, or false if it was deleted.
> */
> - virtual bool warn (std::unique_ptr<pending_diagnostic> d) = 0;
> + Return true if the diagnostic was stored, or false if it was deleted.
> + Optionally provide a custom stmt_finder. */
> + virtual bool warn (std::unique_ptr<pending_diagnostic> d,
> + const stmt_finder *custom_finder = NULL) = 0;
>
> /* Hook for clients to add a note to the last previously stored
> pending diagnostic. */
> @@ -757,6 +778,8 @@ class region_model_context
>
> /* Get the current statement, if any. */
> virtual const gimple *get_stmt () const = 0;
> +
> + virtual const exploded_graph *get_eg () const = 0;
> };
>
> /* A "do nothing" subclass of region_model_context. */
> @@ -764,7 +787,8 @@ class region_model_context
> class noop_region_model_context : public region_model_context
> {
> public:
> - bool warn (std::unique_ptr<pending_diagnostic>) override { return false; }
> + bool warn (std::unique_ptr<pending_diagnostic> d,
> + const stmt_finder *custom_finder) override { return false; }
> void add_note (std::unique_ptr<pending_note>) override;
> void add_event (std::unique_ptr<checker_event>) override;
> void on_svalue_leak (const svalue *) override {}
> @@ -812,6 +836,7 @@ public:
> }
>
> const gimple *get_stmt () const override { return NULL; }
> + const exploded_graph *get_eg () const override { return NULL; }
> };
>
> /* A subclass of region_model_context for determining if operations fail
> @@ -840,7 +865,8 @@ private:
> class region_model_context_decorator : public region_model_context
> {
> public:
> - bool warn (std::unique_ptr<pending_diagnostic> d) override
> + bool warn (std::unique_ptr<pending_diagnostic> d,
> + const stmt_finder *custom_finder)
> {
> if (m_inner)
> return m_inner->warn (std::move (d));
> @@ -978,6 +1004,14 @@ class region_model_context_decorator : public
> region_model_context
> return nullptr;
> }
>
> + const exploded_graph *get_eg () const override
> + {
> + if (m_inner)
> + return m_inner->get_eg ();
> + else
> + return nullptr;
> + }
> +
> protected:
> region_model_context_decorator (region_model_context *inner)
> : m_inner (inner)
> @@ -993,7 +1027,8 @@ protected:
> class annotating_context : public region_model_context_decorator
> {
> public:
> - bool warn (std::unique_ptr<pending_diagnostic> d) override
> + bool warn (std::unique_ptr<pending_diagnostic> d,
> + const stmt_finder *custom_finder) override
> {
> if (m_inner)
> if (m_inner->warn (std::move (d)))
> @@ -1158,7 +1193,8 @@ using namespace ::selftest;
> class test_region_model_context : public noop_region_model_context
> {
> public:
> - bool warn (std::unique_ptr<pending_diagnostic> d) final override
> + bool warn (std::unique_ptr<pending_diagnostic> d,
> + const stmt_finder *custom_finder) final override
> {
> m_diagnostics.safe_push (d.release ());
> return true;
> diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> index 7cd72e8a886..b2caed8fc1b 100644
> --- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> +++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> @@ -44,6 +44,7 @@
> #include "analyzer/region-model.h"
> #include "analyzer/call-details.h"
> #include "analyzer/call-info.h"
> +#include "analyzer/exploded-graph.h"
> #include "make-unique.h"
>
> int plugin_is_GPL_compatible;
> @@ -191,6 +192,372 @@ public:
> }
> };
>
> +/* This is just a copy of leak_stmt_finder for now (subject to change if
> + * necssary) */
> +
> +class refcnt_stmt_finder : public stmt_finder
> +{
> +public:
> + refcnt_stmt_finder (const exploded_graph &eg, tree var)
> + : m_eg (eg), m_var (var)
> + {
> + }
> +
> + std::unique_ptr<stmt_finder>
> + clone () const final override
> + {
> + return make_unique<refcnt_stmt_finder> (m_eg, m_var);
> + }
> +
> + const gimple *
> + find_stmt (const exploded_path &epath) final override
> + {
> + logger *const logger = m_eg.get_logger ();
> + LOG_FUNC (logger);
> +
> + if (m_var && TREE_CODE (m_var) == SSA_NAME)
> + {
> + /* Locate the final write to this SSA name in the path. */
> + const gimple *def_stmt = SSA_NAME_DEF_STMT (m_var);
> +
> + int idx_of_def_stmt;
> + bool found = epath.find_stmt_backwards (def_stmt, &idx_of_def_stmt);
> + if (!found)
> + goto not_found;
> +
> + /* What was the next write to the underlying var
> + after the SSA name was set? (if any). */
> +
> + for (unsigned idx = idx_of_def_stmt + 1; idx < epath.m_edges.length
> ();
> + ++idx)
> + {
> + const exploded_edge *eedge = epath.m_edges[idx];
> + if (logger)
> + logger->log ("eedge[%i]: EN %i -> EN %i", idx,
> + eedge->m_src->m_index,
> + eedge->m_dest->m_index);
> + const exploded_node *dst_node = eedge->m_dest;
> + const program_point &dst_point = dst_node->get_point ();
> + const gimple *stmt = dst_point.get_stmt ();
> + if (!stmt)
> + continue;
> + if (const gassign *assign = dyn_cast<const gassign *> (stmt))
> + {
> + tree lhs = gimple_assign_lhs (assign);
> + if (TREE_CODE (lhs) == SSA_NAME
> + && SSA_NAME_VAR (lhs) == SSA_NAME_VAR (m_var))
> + return assign;
> + }
> + }
> + }
> +
> + not_found:
> +
> + /* Look backwards for the first statement with a location. */
> + int i;
> + const exploded_edge *eedge;
> + FOR_EACH_VEC_ELT_REVERSE (epath.m_edges, i, eedge)
> + {
> + if (logger)
> + logger->log ("eedge[%i]: EN %i -> EN %i", i, eedge->m_src->m_index,
> + eedge->m_dest->m_index);
> + const exploded_node *dst_node = eedge->m_dest;
> + const program_point &dst_point = dst_node->get_point ();
> + const gimple *stmt = dst_point.get_stmt ();
> + if (stmt)
> + if (get_pure_location (stmt->location) != UNKNOWN_LOCATION)
> + return stmt;
> + }
> +
> + gcc_unreachable ();
> + return NULL;
> + }
> +
> +private:
> + const exploded_graph &m_eg;
> + tree m_var;
> +};
> +
> +class refcnt_mismatch : public pending_diagnostic_subclass<refcnt_mismatch>
> +{
> +public:
> + refcnt_mismatch (const region *base_region,
> + const svalue *ob_refcnt,
> + const svalue *actual_refcnt,
> + tree reg_tree)
> + : m_base_region (base_region), m_ob_refcnt (ob_refcnt),
> + m_actual_refcnt (actual_refcnt), m_reg_tree(reg_tree)
> + {
> + }
> +
> + const char *
> + get_kind () const final override
> + {
> + return "refcnt_mismatch";
> + }
> +
> + bool
> + operator== (const refcnt_mismatch &other) const
> + {
> + return (m_base_region == other.m_base_region
> + && m_ob_refcnt == other.m_ob_refcnt
> + && m_actual_refcnt == other.m_actual_refcnt);
> + }
> +
> + int get_controlling_option () const final override
> + {
> + return 0;
> + }
> +
> + bool
> + emit (rich_location *rich_loc, logger *) final override
> + {
> + diagnostic_metadata m;
> + bool warned;
> + // just assuming constants for now
> + auto actual_refcnt
> + = m_actual_refcnt->dyn_cast_constant_svalue ()->get_constant ();
> + auto ob_refcnt = m_ob_refcnt->dyn_cast_constant_svalue ()->get_constant
> ();
> + warned = warning_meta (
> + rich_loc, m, get_controlling_option (),
> + "expected <variable name belonging to m_base_region> to have "
> + "reference count: %qE but ob_refcnt field is: %qE",
> + actual_refcnt, ob_refcnt);
> +
> + // location_t loc = rich_loc->get_loc ();
> + // foo (loc);
> + return warned;
> + }
> +
> + void mark_interesting_stuff (interesting_t *interest) final override
> + {
> + if (m_base_region)
> + interest->add_region_creation (m_base_region);
> + }
> +
> +private:
> +
> + void foo(location_t loc) const
> + {
> + inform(loc, "something is up right here");
> + }
> + const region *m_base_region;
> + const svalue *m_ob_refcnt;
> + const svalue *m_actual_refcnt;
> + tree m_reg_tree;
> +};
> +
> +/* Retrieves the svalue associated with the ob_refcnt field of the base
> region.
> + */
> +static const svalue *
> +retrieve_ob_refcnt_sval (const region *base_reg, const region_model *model,
> + region_model_context *ctxt)
> +{
> + region_model_manager *mgr = model->get_manager ();
> + tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt");
> + const region *ob_refcnt_region
> + = mgr->get_field_region (base_reg, ob_refcnt_tree);
> + const svalue *ob_refcnt_sval
> + = model->get_store_value (ob_refcnt_region, ctxt);
> + return ob_refcnt_sval;
> +}
> +
> +static void
> +increment_region_refcnt (hash_map<const region *, int> &map, const region
> *key)
> +{
> + bool existed;
> + auto &refcnt = map.get_or_insert (key, &existed);
> + refcnt = existed ? refcnt + 1 : 1;
> +}
> +
> +
> +/* Recursively fills in region_to_refcnt with the references owned by
> + pyobj_ptr_sval. */
> +static void
> +count_expected_pyobj_references (const region_model *model,
> + hash_map<const region *, int> ®ion_to_refcnt,
> + const svalue *pyobj_ptr_sval,
> + hash_set<const region *> &seen)
> +{
> + if (!pyobj_ptr_sval)
> + return;
> +
> + const auto *pyobj_region_sval = pyobj_ptr_sval->dyn_cast_region_svalue ();
> + const auto *pyobj_initial_sval = pyobj_ptr_sval->dyn_cast_initial_svalue
> ();
> + if (!pyobj_region_sval && !pyobj_initial_sval)
> + return;
> +
> + // todo: support initial sval (e.g passed in as parameter)
> + if (pyobj_initial_sval)
> + {
> + // increment_region_refcnt (region_to_refcnt,
> + // pyobj_initial_sval->get_region ());
> + return;
> + }
> +
> + const region *pyobj_region = pyobj_region_sval->get_pointee ();
> + if (!pyobj_region || seen.contains (pyobj_region))
> + return;
> +
> + seen.add (pyobj_region);
> +
> + if (pyobj_ptr_sval->get_type () == pyobj_ptr_tree)
> + increment_region_refcnt (region_to_refcnt, pyobj_region);
> +
> + const auto *curr_store = model->get_store ();
> + const auto *retval_cluster = curr_store->get_cluster (pyobj_region);
> + if (!retval_cluster)
> + return;
> +
> + const auto &retval_binding_map = retval_cluster->get_map ();
> +
> + for (const auto &binding : retval_binding_map)
> + {
> + const svalue *binding_sval = binding.second;
> + const svalue *unwrapped_sval = binding_sval->unwrap_any_unmergeable ();
> + const region *pointee = unwrapped_sval->maybe_get_region ();
> +
> + if (pointee && pointee->get_kind () == RK_HEAP_ALLOCATED)
> + count_expected_pyobj_references (model, region_to_refcnt,
> binding_sval,
> + seen);
> + }
> +}
> +
> +/* Compare ob_refcnt field vs the actual reference count of a region */
> +static void
> +check_refcnt (const region_model *model, region_model_context *ctxt,
> + const hash_map<const ana::region *,
> + int>::iterator::reference_pair region_refcnt)
> +{
> + region_model_manager *mgr = model->get_manager ();
> + const auto &curr_region = region_refcnt.first;
> + const auto &actual_refcnt = region_refcnt.second;
> + const svalue *ob_refcnt_sval = retrieve_ob_refcnt_sval (curr_region,
> model, ctxt);
> + const svalue *actual_refcnt_sval = mgr->get_or_create_int_cst (
> + ob_refcnt_sval->get_type (), actual_refcnt);
> +
> + if (ob_refcnt_sval != actual_refcnt_sval)
> + {
> + // todo: fix this
> + tree reg_tree = model->get_representative_tree (curr_region);
> +
> + const auto &eg = ctxt->get_eg ();
> + refcnt_stmt_finder finder (*eg, reg_tree);
> + auto pd = make_unique<refcnt_mismatch> (curr_region, ob_refcnt_sval,
> + actual_refcnt_sval, reg_tree);
> + if (pd && eg)
> + ctxt->warn (std::move (pd), &finder);
> + }
> +}
> +
> +static void
> +check_refcnts (const region_model *model, const svalue *retval,
> + region_model_context *ctxt,
> + hash_map<const region *, int> ®ion_to_refcnt)
> +{
> + for (const auto ®ion_refcnt : region_to_refcnt)
> + {
> + check_refcnt(model, ctxt, region_refcnt);
> + }
> +}
> +
> +/* Validates the reference count of all Python objects. */
> +void
> +pyobj_refcnt_checker (const region_model *model, const svalue *retval,
> + region_model_context *ctxt)
> +{
> + if (!ctxt)
> + return;
> +
> + auto region_to_refcnt = hash_map<const region *, int> ();
> + auto seen_regions = hash_set<const region *> ();
> +
> + count_expected_pyobj_references (model, region_to_refcnt, retval,
> seen_regions);
> + check_refcnts (model, retval, ctxt, region_to_refcnt);
> +}
> +
> +/* Counts the actual pyobject references from all clusters in the model's
> + * store. */
> +static void
> +count_all_references (const region_model *model,
> + hash_map<const region *, int> ®ion_to_refcnt)
> +{
> + for (const auto &cluster : *model->get_store ())
> + {
> + auto curr_region = cluster.first;
> + if (curr_region->get_kind () != RK_HEAP_ALLOCATED)
> + continue;
> +
> + increment_region_refcnt (region_to_refcnt, curr_region);
> +
> + auto binding_cluster = cluster.second;
> + for (const auto &binding : binding_cluster->get_map ())
> + {
> + const svalue *binding_sval = binding.second;
> +
> + const svalue *unwrapped_sval
> + = binding_sval->unwrap_any_unmergeable ();
> + // if (unwrapped_sval->get_type () != pyobj_ptr_tree)
> + // continue;
> +
> + const region *pointee = unwrapped_sval->maybe_get_region ();
> + if (!pointee || pointee->get_kind () != RK_HEAP_ALLOCATED)
> + continue;
> +
> + increment_region_refcnt (region_to_refcnt, pointee);
> + }
> + }
> +}
> +
> +static void
> +dump_refcnt_info (const hash_map<const region *, int> ®ion_to_refcnt,
> + const region_model *model, region_model_context *ctxt)
> +{
> + region_model_manager *mgr = model->get_manager ();
> + pretty_printer pp;
> + pp_format_decoder (&pp) = default_tree_printer;
> + pp_show_color (&pp) = pp_show_color (global_dc->printer);
> + pp.buffer->stream = stderr;
> +
> + for (const auto ®ion_refcnt : region_to_refcnt)
> + {
> + auto region = region_refcnt.first;
> + auto actual_refcnt = region_refcnt.second;
> + const svalue *ob_refcnt_sval
> + = retrieve_ob_refcnt_sval (region, model, ctxt);
> + const svalue *actual_refcnt_sval = mgr->get_or_create_int_cst (
> + ob_refcnt_sval->get_type (), actual_refcnt);
> +
> + region->dump_to_pp (&pp, true);
> + pp_string (&pp, " — ob_refcnt: ");
> + ob_refcnt_sval->dump_to_pp (&pp, true);
> + pp_string (&pp, " actual refcnt: ");
> + actual_refcnt_sval->dump_to_pp (&pp, true);
> + pp_newline (&pp);
> + }
> + pp_string (&pp, "~~~~~~~~\n");
> + pp_flush (&pp);
> +}
> +
> +class kf_analyzer_cpython_dump_refcounts : public known_function
> +{
> +public:
> + bool matches_call_types_p (const call_details &cd) const final override
> + {
> + return cd.num_args () == 0;
> + }
> + void impl_call_pre (const call_details &cd) const final override
> + {
> + region_model_context *ctxt = cd.get_ctxt ();
> + if (!ctxt)
> + return;
> + region_model *model = cd.get_model ();
> + auto region_to_refcnt = hash_map<const region *, int> ();
> + count_all_references(model, region_to_refcnt);
> + dump_refcnt_info(region_to_refcnt, model, ctxt);
> + }
> +};
> +
> /* Some concessions were made to
> simplify the analysis process when comparing kf_PyList_Append with the
> real implementation. In particular, PyList_Append performs some
> @@ -927,6 +1294,10 @@ cpython_analyzer_init_cb (void *gcc_data, void *
> /*user_data */)
> iface->register_known_function ("PyList_New", make_unique<kf_PyList_New>
> ());
> iface->register_known_function ("PyLong_FromLong",
> make_unique<kf_PyLong_FromLong> ());
> +
> + iface->register_known_function (
> + "__analyzer_cpython_dump_refcounts",
> + make_unique<kf_analyzer_cpython_dump_refcounts> ());
> }
> } // namespace ana
>
> @@ -940,8 +1311,9 @@ plugin_init (struct plugin_name_args *plugin_info,
> const char *plugin_name = plugin_info->base_name;
> if (0)
> inform (input_location, "got here; %qs", plugin_name);
> - ana::register_finish_translation_unit_callback (&stash_named_types);
> - ana::register_finish_translation_unit_callback (&stash_global_vars);
> + register_finish_translation_unit_callback (&stash_named_types);
> + register_finish_translation_unit_callback (&stash_global_vars);
> + region_model::register_pop_frame_callback(pyobj_refcnt_checker);
> register_callback (plugin_info->base_name, PLUGIN_ANALYZER_INIT,
> ana::cpython_analyzer_init_cb,
> NULL); /* void *user_data */
> diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
> b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c
> similarity index 64%
> rename from gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
> rename to gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c
> index 19b5c17428a..9912f9105d4 100644
> --- a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
> +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c
> @@ -8,34 +8,6 @@
> #include <Python.h>
> #include "../analyzer/analyzer-decls.h"
>
> -PyObject *
> -test_PyList_New (Py_ssize_t len)
> -{
> - PyObject *obj = PyList_New (len);
> - if (obj)
> - {
> - __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> - __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning "TRUE" } */
> - }
> - else
> - __analyzer_dump_path (); /* { dg-message "path" } */
> - return obj;
> -}
> -
> -PyObject *
> -test_PyLong_New (long n)
> -{
> - PyObject *obj = PyLong_FromLong (n);
> - if (obj)
> - {
> - __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> - __analyzer_eval (PyLong_CheckExact (obj)); /* { dg-warning "TRUE" } */
> - }
> - else
> - __analyzer_dump_path (); /* { dg-message "path" } */
> - return obj;
> -}
> -
> PyObject *
> test_PyListAppend (long n)
> {
> @@ -43,6 +15,7 @@ test_PyListAppend (long n)
> PyObject *list = PyList_New (0);
> PyList_Append(list, item);
> return list; /* { dg-warning "leak of 'item'" } */
> + /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
> }
>
> PyObject *
> @@ -67,6 +40,7 @@ test_PyListAppend_2 (long n)
> else
> __analyzer_eval (item->ob_refcnt == 2); /* { dg-warning "TRUE" } */
> return list; /* { dg-warning "leak of 'item'" } */
> + /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
> }
>
>
> @@ -75,4 +49,30 @@ test_PyListAppend_3 (PyObject *item, PyObject *list)
> {
> PyList_Append (list, item);
> return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_4 (long n)
> +{
> + PyObject *item = PyLong_FromLong (n);
> + PyObject *list = NULL;
> + PyList_Append(list, item);
> + return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_5 ()
> +{
> + PyObject *list = PyList_New (0);
> + PyList_Append(list, NULL);
> + return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_6 ()
> +{
> + PyObject *item = NULL;
> + PyObject *list = NULL;
> + PyList_Append(list, item);
> + return list;
> }
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
> b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
> new file mode 100644
> index 00000000000..492d4f7d58d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
> @@ -0,0 +1,38 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target analyzer } */
> +/* { dg-options "-fanalyzer" } */
> +/* { dg-require-python-h "" } */
> +
> +
> +#define PY_SSIZE_T_CLEAN
> +#include <Python.h>
> +#include "../analyzer/analyzer-decls.h"
> +
> +PyObject *
> +test_PyList_New (Py_ssize_t len)
> +{
> + PyObject *obj = PyList_New (len);
> + if (obj)
> + {
> + __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> + __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning "TRUE" } */
> + }
> + else
> + __analyzer_dump_path (); /* { dg-message "path" } */
> + return obj;
> +}
> +
> +void
> +test_PyList_New_2 ()
> +{
> + PyObject *obj = PyList_New (0);
> +} /* { dg-warning "leak of 'obj'" } */
> +
> +PyObject *test_stray_incref_PyList ()
> +{
> + PyObject *p = PyList_New (2);
> + if (p)
> + Py_INCREF (p);
> + return p;
> + /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
> +}
> \ No newline at end of file
> diff --git
> a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
> b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
> new file mode 100644
> index 00000000000..97b29849302
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
> @@ -0,0 +1,38 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target analyzer } */
> +/* { dg-options "-fanalyzer" } */
> +/* { dg-require-python-h "" } */
> +
> +
> +#define PY_SSIZE_T_CLEAN
> +#include <Python.h>
> +#include "../analyzer/analyzer-decls.h"
> +
> +PyObject *
> +test_PyLong_New (long n)
> +{
> + PyObject *obj = PyLong_FromLong (n);
> + if (obj)
> + {
> + __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> + __analyzer_eval (PyLong_CheckExact (obj)); /* { dg-warning "TRUE" } */
> + }
> + else
> + __analyzer_dump_path (); /* { dg-message "path" } */
> + return obj;
> +}
> +
> +void
> +test_PyLong_New_2 (long n)
> +{
> + PyObject *obj = PyLong_FromLong (n);
> +} /* { dg-warning "leak of 'obj'" } */
> +
> +PyObject *test_stray_incref_PyLong (long val)
> +{
> + PyObject *p = PyLong_FromLong (val);
> + if (p)
> + Py_INCREF (p);
> + return p;
> + /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
> +}
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c
> b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-no-plugin.c
> similarity index 100%
> rename from gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c
> rename to gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-no-plugin.c
> diff --git
> a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c
> b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c
> new file mode 100644
> index 00000000000..9912f9105d4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c
> @@ -0,0 +1,78 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target analyzer } */
> +/* { dg-options "-fanalyzer" } */
> +/* { dg-require-python-h "" } */
> +
> +
> +#define PY_SSIZE_T_CLEAN
> +#include <Python.h>
> +#include "../analyzer/analyzer-decls.h"
> +
> +PyObject *
> +test_PyListAppend (long n)
> +{
> + PyObject *item = PyLong_FromLong (n);
> + PyObject *list = PyList_New (0);
> + PyList_Append(list, item);
> + return list; /* { dg-warning "leak of 'item'" } */
> + /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
> +}
> +
> +PyObject *
> +test_PyListAppend_2 (long n)
> +{
> + PyObject *item = PyLong_FromLong (n);
> + if (!item)
> + return NULL;
> +
> + __analyzer_eval (item->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> + PyObject *list = PyList_New (n);
> + if (!list)
> + {
> + Py_DECREF(item);
> + return NULL;
> + }
> +
> + __analyzer_eval (list->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> +
> + if (PyList_Append (list, item) < 0)
> + __analyzer_eval (item->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> + else
> + __analyzer_eval (item->ob_refcnt == 2); /* { dg-warning "TRUE" } */
> + return list; /* { dg-warning "leak of 'item'" } */
> + /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
> +}
> +
> +
> +PyObject *
> +test_PyListAppend_3 (PyObject *item, PyObject *list)
> +{
> + PyList_Append (list, item);
> + return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_4 (long n)
> +{
> + PyObject *item = PyLong_FromLong (n);
> + PyObject *list = NULL;
> + PyList_Append(list, item);
> + return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_5 ()
> +{
> + PyObject *list = PyList_New (0);
> + PyList_Append(list, NULL);
> + return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_6 ()
> +{
> + PyObject *item = NULL;
> + PyObject *list = NULL;
> + PyList_Append(list, item);
> + return list;
> +}
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp
> b/gcc/testsuite/gcc.dg/plugin/plugin.exp
> index e1ed2d2589e..cbef6da8d86 100644
> --- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
> +++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
> @@ -161,8 +161,9 @@ set plugin_test_list [list \
> taint-CVE-2011-0521-6.c \
> taint-antipatterns-1.c } \
> { analyzer_cpython_plugin.c \
> - cpython-plugin-test-1.c \
> - cpython-plugin-test-2.c } \
> + cpython-plugin-test-PyList_Append.c \
> + cpython-plugin-test-PyList_New.c \
> + cpython-plugin-test-PyLong_FromLong.c } \
> ]
>
> foreach plugin_test $plugin_test_list {
> --
> 2.30.2
>