On Wed, 2023-06-07 at 16:21 -0400, Eric Feng wrote:
> Hi everyone,
>
> I am one of the GSoC participants this year; in particular, I am
> working on a static analyzer plugin for CPython extension module code.
> I'm encountering a few challenges and would appreciate any guidance on
> the following issues:
>
> 1) Issue with "inform" diagnostics in the plugin:
> I am currently unable to see any "inform" messages from my plugin when
> compiling test programs with the plugin enabled. As per the structure
> of existing analyzer plugins, I have included the following code in
> the plugin_init function:
>
> #if ENABLE_ANALYZER
>   const char *plugin_name = plugin_info->base_name;
>   if (0)
>     inform(input_location, "got here; %qs", plugin_name);

If that's the code, does it work if you get rid of the "if (0)"
conditional, or change it to "if (1)"? As written, that guard is false,
so that call to "inform" will never be executed.

>   register_callback(plugin_info->base_name,
>                     PLUGIN_ANALYZER_INIT,
>                     ana::cpython_analyzer_init_cb,
>                     NULL);
> #else
>   sorry_no_analyzer();
> #endif
>   return 0;
>
> I expected to see the "got here" message (among others in other areas
> of the plugin) when compiling test programs but haven't observed any
> output. I also did not observe the "sorry" diagnostic. I am compiling
> a simple CPython extension module with the plugin loaded like so:
>
> gcc-dev -S -fanalyzer -fplugin=/path/to/cpython_plugin.so
> -I/usr/include/python3.9 -lpython3.9 -x c refcount6.c

Looks reasonable.

>
> Additionally, I compiled the plugin following the steps outlined in
> the GCC documentation for plugin building
> (https://gcc.gnu.org/onlinedocs/gccint/Plugins-building.html):
>
> g++-dev -shared -I/home/flappy/gcc_/gcc/gcc
> -I/usr/local/lib/gcc/aarch64-unknown-linux-gnu/14.0.0/plugin/include
> -fPIC -fno-rtti -O2 analyzer_cpython_plugin.c -o cpython_plugin.so
>
> Please let me know if I missed any steps or if there is something else
> I should consider. I have no trouble seeing inform calls when they are
> added to the core GCC.
>
> 2) gdb not detecting .gdbinit in build/gcc:
> Following Dave's GCC newbies guide, I ran gcc/configure within the gcc
> subdirectory of the build directory to generate a .gdbinit file.
> Dave's guide suggested that this file would be automatically detected
> and run by gdb. However, it appears that GDB is not detecting this
> .gdbinit file, even after I added the following line to my ~/.gdbinit
> file:
>
> add-auto-load-safe-path /absolute/path/to/build/gcc

Are you invoking gcc from an installed copy, or from the build
directory? I think my instructions assume the latter.

>
> 3) Modeling creation of a new PyObject:
> Many CPython API calls involve the creation of a new PyObject. To
> model the creation of a simple PyObject, we can allocate a new heap
> region using get_or_create_region_for_heap_alloc. We can then create
> field_regions using get_field_region to associate the newly allocated
> region to represent fields such as ob_refcnt and ob_type in the
> PyObject struct. However, one of the parameters to get_field_region
> is a tree representing the field (e.g. ob_refcnt). I'm currently
> wondering how we may retrieve this information. My intuition is that
> it would be fairly easy if we can first get a tree representation of
> the PyObject struct. Since we include the relevant headers when
> compiling CPython extension modules (e.g., -I/usr/include/python3.9),
> I wonder if there is a way to "look up" the tree representation of
> PyObject from the included headers. This information may also be
> important for obtaining a svalue representing the size of the PyObject
> in get_or_create_region_for_heap_alloc. If there is no way to "look
> up" a tree representation of PyObject as described in the included
> Python header files, does it make sense for us to just create a tree
> representation manually for this task? Please let me know if this
> approach makes sense and if so where I could look into to get the
> required information.

Don't attempt to build the struct by hand; we want to look up the
struct from the user's headers. There are at least two ABIs for
PyObject, so we want to be sure we're using the correct one.

IIRC, to look things up by name, that's generally a frontend thing,
since every language has its own concept of scopes/namespaces/etc.

It sounds like you want to look for a type in the global scope of the
C/C++ FE with the name "PyObject".

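As a rough illustration of looking the type up by name rather than
building it by hand, here is a minimal standalone C++ sketch. It is not
GCC code: global_scope, record_info, field_info, lookup_record and
find_field are all hypothetical stand-ins for whatever the frontend
would really expose, and the layout data is assumed to come from the
user's headers.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for frontend data; none of these are GCC types.
struct field_info
{
  std::string name;
  std::size_t offset;
};

struct record_info
{
  std::vector<field_info> fields;
  std::size_t size;
};

// Mock of a frontend's global scope: type name -> record description,
// populated from whichever headers the user actually compiled against.
typedef std::map<std::string, record_info> global_scope;

// Look up a record type by name; nullptr if the name was never declared.
const record_info *
lookup_record (const global_scope &scope, const std::string &name)
{
  global_scope::const_iterator it = scope.find (name);
  return it == scope.end () ? nullptr : &it->second;
}

// Find a field within the layout that was looked up, instead of
// assuming a hard-coded offset; nullptr if there is no such field.
const field_info *
find_field (const record_info &rec, const std::string &field)
{
  for (std::size_t i = 0; i < rec.fields.size (); ++i)
    if (rec.fields[i].name == field)
      return &rec.fields[i];
  return nullptr;
}
```

The point of the sketch is only that the caller asks for "PyObject" and
then for "ob_refcnt" within the result, so a different struct layout in
the headers is picked up without changing the caller.
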
We currently have some hooks in the analyzer for getting constants from
the frontends; see analyzer-language.cc, where the frontend calls
on_finish_translation_unit and the analyzer then queries the FE for the
named constants that will be of interest during analysis. Maybe we can
extend this so that we have a way to look up named types there, and
stash the tree for later use, and thus your plugin could ask the
frontend which tree is the PyObject RECORD_TYPE before the frontend is
cleaned up (in on_finish_translation_unit).

There might be a simpler way to do this, but I can't think of it right
now, sorry.

Hope this is helpful
Dave

>
> Thanks all.
>
> Best,
> Eric
>