On Thu, 2023-07-27 at 18:13 -0400, Eric Feng wrote: > Hi Dave, > > Thanks for the comments! > > [...] > > Do you have any DejaGnu tests for this functionality? For example, > > given PyList_New > > https://docs.python.org/3/c-api/list.html#c.PyList_New > > there could be a test like: > > > > /* { dg-require-effective-target python_h } */ > > > > #define PY_SSIZE_T_CLEAN > > #include <Python.h> > > #include "analyzer-decls.h" > > > > PyObject * > > test_PyList_New (Py_ssize_t len) > > { > > PyObject *obj = PyList_New (len); > > if (obj) > > { > > __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" > > } */ > > __analyzer_eval (PyList_Check (obj)); /* { dg-warning "TRUE" } > > */ > > __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning > > "TRUE" } */ > > } > > else > > __analyzer_dump_path (); /* { dg-warning "path" } */ > > return obj; > > } > > > > ...or similar, to verify that we simulate that the call can both > > succeed and fail, and to verify properties of the store along the > > "success" path. Caveat: I didn't look at exactly what properties > > you're simulating, so the above tests might need adjusting. > > > > I am currently in the process of developing more tests. Specific to > the test you provided as an example, we are passing all cases except > for PyList_Check. PyList_Check does not pass because I have not yet > added support for the various definitions of tp_flags.
As noted in our chat earlier, I don't think we can easily make these work. Looking at CPython's implementation: PyList_Type's initializer here: https://github.com/python/cpython/blob/main/Objects/listobject.c#L3101 initializes tp_flags with the flags, but: (a) we don't see that code when compiling a user's extension module (b) even if we did, PyList_Type is non-const, so the analyzer has to assume that tp_flags could have been written to since it was initialized In theory we could specialcase such lookups, so that, say, a plugin could register assumptions into the analyzer about the value of bits within (PyList_Type.tp_flags). However, this seems like a future feature. > I also > encountered a minor hiccup where PyList_CheckExact appeared to give > "UNKNOWN" rather than "TRUE", but this has since been fixed. The > problem was caused by accidentally using the tree representation of > struct PyList_Type as opposed to struct PyList_Type * when creating a > pointer sval to the region for Pylist_Type. Ah, good. > > [...] > > > > > Let's consider the following example which lacks error checking: > > > > > > PyObject* foo() { > > > PyObject item = PyLong_FromLong(10); > > > PyObject list = PyList_New(5); > > > return list; > > > } > > > > > > The states for when PyLong_FromLong fails and when > > > PyLong_FromLong > > > succeeds are merged before the call to PyObject* list = > > > PyList_New(5). > > > > Ideally we would emit a leak warning about the "success" case of > > PyLong_FromLong here. I think you're running into the problem of > > the > > "store" part of the program_state being separate from the "malloc" > > state machine part of program_state - I'm guessing that you're > > creating > > a heap_allocated_region for the new python object, but the "malloc" > > state machine isn't transitioning the pointer from "start" to > > "assumed- > > non-null". Such state machine states inhibit state-merging, and so > > this might solve your state-merging problem. > > > > I think we need a way to call > > malloc_state_machine::on_allocator_call > > from outside of sm-malloc.cc. See > > region_model::on_realloc_with_move > > for an example of how to do something similar. > > > > Thank you for the suggestion — this worked great and has solved the > issue! Excellent! Thanks for the update Dave