Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

David Malcolm via Gcc Thu, 27 Jul 2023 15:35:43 -0700

On Thu, 2023-07-27 at 18:13 -0400, Eric Feng wrote:
> Hi Dave,
> 
> Thanks for the comments!
> 
> [...]
> > Do you have any DejaGnu tests for this functionality?  For example,
> > given PyList_New
> >   https://docs.python.org/3/c-api/list.html#c.PyList_New
> > there could be a test like:
> > 
> > /* { dg-require-effective-target python_h } */
> > 
> > #define PY_SSIZE_T_CLEAN
> > #include <Python.h>
> > #include "analyzer-decls.h"
> > 
> > PyObject *
> > test_PyList_New (Py_ssize_t len)
> > {
> >   PyObject *obj = PyList_New (len);
> >   if (obj)
> >     {
> >      __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE"
> > } */
> >      __analyzer_eval (PyList_Check (obj)); /* { dg-warning "TRUE" }
> > */
> >      __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning
> > "TRUE" } */
> >     }
> >   else
> >     __analyzer_dump_path (); /* { dg-warning "path" } */
> >   return obj;
> > }
> > 
> > ...or similar, to verify that we simulate that the call can both
> > succeed and fail, and to verify properties of the store along the
> > "success" path.  Caveat: I didn't look at exactly what properties
> > you're simulating, so the above tests might need adjusting.
> > 
> 
> I am currently in the process of developing more tests. Specific to
> the test you provided as an example, we are passing all cases except
> for PyList_Check. PyList_Check does not pass because I have not yet
> added support for the various definitions of tp_flags.


As noted in our chat earlier, I don't think we can easily make these
work.  Looking at CPython's implementation: PyList_Type's initializer
here:
https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101
initializes tp_flags with the flags, but:
(a) we don't see that code when compiling a user's extension module
(b) even if we did, PyList_Type is non-const, so the analyzer has to
assume that tp_flags could have been written to since it was
initialized

In theory we could specialcase such lookups, so that, say, a plugin
could register assumptions into the analyzer about the value of bits
within (PyList_Type.tp_flags).

However, this seems like a future feature.

>  I also
> encountered a minor hiccup where PyList_CheckExact appeared to give
> "UNKNOWN" rather than "TRUE", but this has since been fixed. The
> problem was caused by accidentally using the tree representation of
> struct PyList_Type as opposed to struct PyList_Type * when creating a
> pointer sval to the region for Pylist_Type.

Ah, good.

> 
> [...]
> > 
> > > Let's consider the following example which lacks error checking:
> > > 
> > > PyObject* foo() {
> > >     PyObject item = PyLong_FromLong(10);
> > >     PyObject list = PyList_New(5);
> > >     return list;
> > > }
> > > 
> > > The states for when PyLong_FromLong fails and when
> > > PyLong_FromLong
> > > succeeds are merged before the call to PyObject* list =
> > > PyList_New(5).
> > 
> > Ideally we would emit a leak warning about the "success" case of
> > PyLong_FromLong here.  I think you're running into the problem of
> > the
> > "store" part of the program_state being separate from the "malloc"
> > state machine part of program_state - I'm guessing that you're
> > creating
> > a heap_allocated_region for the new python object, but the "malloc"
> > state machine isn't transitioning the pointer from "start" to
> > "assumed-
> > non-null".  Such state machine states inhibit state-merging, and so
> > this might solve your state-merging problem.
> > 
> > I think we need a way to call
> > malloc_state_machine::on_allocator_call
> > from outside of sm-malloc.cc.  See
> > region_model::on_realloc_with_move
> > for an example of how to do something similar.
> > 
> 
> Thank you for the suggestion — this worked great and has solved the
> issue!

Excellent!

Thanks for the update
Dave

Re: Update and Questions on CPython Extension Module -fanalyzer plugin development

Reply via email to