On Thu, Feb 10, 2022 at 9:42 AM Jen Kris via Python-list
<python-list@python.org> wrote:
>
> I have everything finished down to the last line (sentences =
> gutenberg.sents(fileid)) where I use PyObject_Call to call gutenberg.sents,
> but it segfaults.  The fileid is a string -- the first fileid in this corpus
> is "austen-emma.txt."
>
> pName = PyUnicode_FromString("nltk.corpus");
> pModule = PyImport_Import(pName);
>
> pSubMod = PyObject_GetAttrString(pModule, "gutenberg");
> pFidMod = PyObject_GetAttrString(pSubMod, "fileids");
> pSentMod = PyObject_GetAttrString(pSubMod, "sents");
>
> pFileIds = PyObject_CallObject(pFidMod, 0);
> pListItem = PyList_GetItem(pFileIds, listIndex);
> pListStrE = PyUnicode_AsEncodedString(pListItem, "UTF-8", "strict");
> pListStr = PyBytes_AS_STRING(pListStrE);
> Py_DECREF(pListStrE);

HERE.  PyBytes_AS_STRING() returns a pointer into the internal buffer of
the pListStrE object, so Py_DECREF(pListStrE) frees that object and leaves
pListStr as a dangling pointer.

>
> // sentences = gutenberg.sents(fileid)
> PyObject *c_args = Py_BuildValue("s", pListStr);

Why do you encode and then decode the string?  Why not just use the
original str object, pListItem, directly?

> PyObject *NullPtr = 0;
> pSents = PyObject_Call(pSentMod, c_args, NullPtr);
>

c_args must be a tuple, but you passed a unicode object here.
Py_BuildValue("s", ...) builds a single str; a parenthesized format such
as "(s)" or "(O)" builds a one-element tuple.
Read https://docs.python.org/3/c-api/arg.html#c.Py_BuildValue

> The final line segfaults:
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007ffff6e4e8d5 in _PyEval_EvalCodeWithName ()
>    from /usr/lib/x86_64-linux-gnu/libpython3.8.so.1.0
>
> My guess is the problem is in Py_BuildValue, which returns a pointer but it
> may not be constructed correctly.  I also tried it with "O" and it doesn't
> segfault but it returns 0x0.
>
> I'm new to using the C API.  Thanks for any help.
>
> Jen
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list

Bests,

--
Inada Naoki  <songofaca...@gmail.com>