[Python-Dev] Can we stop adding to the C API, please?
Hi,

The size of the C API, as measured by `git grep PyAPI_FUNC | wc -l`, has been steadily increasing over the last few releases.

3.5  1237
3.6  1304
3.7  1408
3.8  1478
3.9  1518

For reference, the 2.7 branch has "only" 973 functions.

I've heard many criticisms of Python 2 over the years, but that it needed a bigger C API wasn't one of them ;)

Why are these functions being added? Wasn't 1000 C functions enough?

Every one of these functions represents a maintenance burden. Removing them is painful and takes a lot of effort, but adding them is done casually, without a PEP or, in many cases, even a review.

We need to address what to do about the C API in the long term, but for now can we just stop making it larger? Please.

Also, can we remove all the new API functions added in 3.9 before the release and it is too late?

Cheers,
Mark.
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/[email protected]/message/6CE75BIJC2GSQBO2MUJHW3MA6Q2MAWCB/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Can we stop adding to the C API, please?
Hi,

In Python 3.9, I *removed* dozens of functions from the *public* C API, or moved them to the "internal" C API:
https://docs.python.org/dev/whatsnew/3.9.html#id3

For a few internal C API functions, I replaced PyAPI_FUNC() with extern to ensure that they cannot be used outside the CPython code base: Python 3.9 is now built with -fvisibility=hidden on compilers supporting it (like GCC and clang).

I also *added* a bunch of *new* "getter" or "setter" functions to the public C API for my project of hiding implementation details, like making structures opaque:
https://docs.python.org/dev/whatsnew/3.9.html#id1

For example, I added PyThreadState_GetInterpreter() which replaces "tstate->interp", to prepare C extensions for an opaque PyThreadState structure.

The other 4 new Python 3.9 functions:

* PyObject_CallNoArgs(): "most efficient way to call a callable Python object without any argument"
* PyModule_AddType(): "adding a type to a module". I hate the PyModule_AddObject() function which steals a reference on success.
* PyObject_GC_IsTracked() and PyObject_GC_IsFinalized(): "query if Python objects are being currently tracked or have been already finalized by the garbage collector respectively": functions requested in bpo-40241.

Would you mind elaborating on why you consider that these functions must not be added to Python 3.9?

> Every one of these functions represents a maintenance burden.
> Removing them is painful and takes a lot of effort, but adding them is
> done casually, without a PEP or, in many cases, even a review.

For the new functions related to hiding implementation details, I have a draft PEP:
https://github.com/vstinner/misc/blob/master/cpython/pep-opaque-c-api.rst

But it seems like this PEP is trying to solve too many problems in a single document, and that I have to split it into multiple PEPs.

> Why are these functions being added? Wasn't 1000 C functions enough?

My PEP lists flaws of the existing C API functions. Sadly, fixing flaws requires adding new functions and deprecating old ones in a slow migration.

I'm open to ideas on how to fix these flaws differently (without having new functions?).

As written in my PEP, another approach is to design a new C API on top of the existing one. That's exactly what the HPy project does. But my PEP also explains why I consider that it only fixes a subset of the issues that I listed. ;-)
https://github.com/vstinner/misc/blob/master/cpython/pep-opaque-c-api.rst#hpy-project

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
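For readers without a C toolchain at hand, the behaviour of PyObject_CallNoArgs() quoted in this message can be poked at from Python through ctypes. This is only a sketch, assuming CPython 3.9 or later where the function is an exported symbol of the interpreter; the helper name call_no_args is made up for this illustration and is not part of any API.

```python
import ctypes

# Assumption: CPython >= 3.9, where PyObject_CallNoArgs() is an exported
# symbol reachable through ctypes.pythonapi.
_call = ctypes.pythonapi.PyObject_CallNoArgs
_call.argtypes = [ctypes.py_object]   # PyObject *func
_call.restype = ctypes.py_object      # new reference, or NULL with an exception set


def call_no_args(func):
    """Hypothetical helper: call func() through the C API function."""
    return _call(func)


empty_list = call_no_args(list)       # same effect as list()
empty_text = call_no_args(str)        # same effect as str()
print(empty_list, repr(empty_text))
```

Real extension code would call the function directly from C; the ctypes detour here only serves to show that it is an ordinary "call with zero arguments" entry point.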
[Python-Dev] Re: Can we stop adding to the C API, please?
Maybe we can have a two-for-one special? You can add a new function to the API if you deprecate two.
[Python-Dev] Re: Can we stop adding to the C API, please?
Hi Victor,

On 03/06/2020 2:42 pm, Victor Stinner wrote:
> I also *added* a bunch of *new* "getter" or "setter" functions to the
> public C API for my project of hiding implementation details, like
> making structures opaque:
> https://docs.python.org/dev/whatsnew/3.9.html#id1

Adding "setters" is generally a bad idea. "Getters" can be computed if the underlying field disappears, but the same may not be true for setters if the relation is not one-to-one. I don't think there are any new setters in 3.9, so it's not an immediate problem.

> For example, I added PyThreadState_GetInterpreter() which replaces
> "tstate->interp", to prepare C extensions for an opaque PyThreadState
> structure.

`PyThreadState_GetInterpreter()` can't replace `tstate->interp` for two reasons.

1. There is no way to stop third party C code accessing the internals of data structures. We can warn them not to, but that's all.
2. The internal layout of C structures has never been part of the API, with arguably two exceptions: the PyTypeObject struct and the `ob_refcnt` field of PyObject.

> The other 4 new Python 3.9 functions:
>
> * PyObject_CallNoArgs(): "most efficient way to call a callable Python
> object without any argument"
> * PyModule_AddType(): "adding a type to a module". I hate the
> PyModule_AddObject() function which steals a reference on success.
> * PyObject_GC_IsTracked() and PyObject_GC_IsFinalized(): "query if
> Python objects are being currently tracked or have been already
> finalized by the garbage collector respectively": functions requested
> in bpo-40241.
>
> Would you mind to elaborate why you consider that these functions must
> not be added to Python 3.9?

I'm not saying that no C functions should be added to the API. I am saying that none should be added without a PEP or proper review.

Addressing the four functions you list:

PyObject_CallNoArgs() seems harmless. Rationalizing the call API has merit, but PyObject_CallNoArgs() leads to PyObject_CallOneArg(), PyObject_CallTwoArgs(), etc. and an even larger API.

PyModule_AddType(). This seems perfectly reasonable, although if it is a straight replacement for another function, that other function should be deprecated.

PyObject_GC_IsTracked(). I don't like this. Shouldn't GC track *all* objects? Even if it were named PyObject_Cycle_GC_IsTracked() it would be exposing internal implementation details for no good reason. A cycle GC that doesn't "track" individual objects, but treats all objects the same could be more efficient. In which case, what would this mean?

What is the purpose of PyObject_GC_IsFinalized()? Third party objects can easily tell if they have been finalized. Why they would ever need this information is a mystery to me.

>> Every one of these functions represents a maintenance burden.
>> Removing them is painful and takes a lot of effort, but adding them is
>> done casually, without a PEP or, in many cases, even a review.
>
> For the new functions related to hiding implementation details, I have
> a draft PEP:
> https://github.com/vstinner/misc/blob/master/cpython/pep-opaque-c-api.rst
>
> But it seems like this PEP is trying to solve too many problems in a
> single document, and that I have to split it into multiple PEPs.

It does need splitting up, I agree.

>> Why are these functions being added? Wasn't 1000 C functions enough?
>
> My PEP lists flaws of the existing C API functions. Sadly, fixing
> flaws requires adding new functions and deprecating old ones in a slow
> migration.

IMO, at least one function should be deprecated for each new function added. That way the API won't get any bigger.

Cheers,
Mark.
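The getter/setter asymmetry argued in this message can be sketched in a few lines. This is a toy model in Python rather than CPython code: the names OldLayout, NewLayout and get_total are invented for the illustration. The old layout stores a total directly; the new layout stores only the items, so the total has to be derived.

```python
class OldLayout:
    """Hypothetical original structure: the total is a stored field."""

    def __init__(self, total):
        self.total = total


class NewLayout:
    """Hypothetical later structure: the field is gone, only items remain."""

    def __init__(self, items):
        self.items = list(items)


def get_total(obj):
    """Getter that keeps working after the field disappears."""
    if isinstance(obj, OldLayout):
        return obj.total
    return sum(obj.items)  # computed from what is still stored


# Two different states map to the same total, so the relation is
# many-to-one: a set_total(obj, 6) would have no unique items to write.
a = NewLayout([1, 2, 3])
b = NewLayout([6])
print(get_total(OldLayout(6)), get_total(a), get_total(b))
```

The getter survives the layout change because it only reads; a setter would have to pick one of many internal states, which is the case where the relation is not one-to-one.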
[Python-Dev] Re: Can we stop adding to the C API, please?
Just some comments on the GC stuff as I added them myself.

> Shouldn't GC track *all* objects?

No, extension types need to opt in to the garbage collector and, if they do, implement the interface.

> Even if it were named PyObject_Cycle_GC_IsTracked() it would be exposing
> internal implementation details for no good reason.

In Python, gc.is_tracked() has existed since Python 3.1, and the gc module has exposed a lot of GC functionality for many versions. This just allows the same calls that you can do in Python using the C-API.

> What is the purpose of PyObject_GC_IsFinalized()?

Because some objects may have been resurrected and this allows you to know if a given object has already been finalized. This can help to gather advanced GC stats, to control some tricky situations with finalizers and the gc in C extensions, or just to know all objects that are being resurrected. Note that an equivalent gc.is_finalized() was added in 3.9 as well to query this information from Python in the gc module, and this call just allows you to do the same from the C-API.

Cheers,
Pablo
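The Python-level counterparts of these two C functions can be tried directly. A small sketch, assuming a CPython version in which gc.is_finalized() is available; the Lazarus class and the `survivor` global are illustrative names, and the scenario is the resurrection case discussed in the thread.

```python
import gc

survivor = None


class Lazarus:
    """Illustrative class whose finalizer resurrects the instance."""

    def __del__(self):
        global survivor
        survivor = self  # store a new reference: the object comes back to life


# Atomic objects such as ints are not tracked by the cycle GC;
# containers such as lists are.
tracked_int = gc.is_tracked(0)
tracked_list = gc.is_tracked([])

obj = Lazarus()
before = gc.is_finalized(obj)      # finalizer has not run yet
del obj                            # last reference dropped, __del__ runs (CPython refcounting)
after = gc.is_finalized(survivor)  # same object, now already finalized

print(tracked_int, tracked_list, before, after)
```

The resurrected object looks like any other live object from the outside, which is why a query such as gc.is_finalized(), or its C equivalent, is the only cheap way to tell that its finalizer has already run.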
[Python-Dev] Re: Can we stop adding to the C API, please?
On Wed, 3 Jun 2020 at 19:17, Mark Shannon wrote:
> > I also *added* a bunch of *new* "getter" or "setter" functions to the
> > public C API for my project of hiding implementation details, like
> > making structures opaque:
> > https://docs.python.org/dev/whatsnew/3.9.html#id1
>
> Adding "setters" is generally a bad idea.
> "getters" can be computed if the underlying field disappears, but the
> same may not be true for setters if the relation is not one-to-one.
> I don't think there are any new setters in 3.9, so it's not an immediate
> problem.

You're making the assumption that the member can be set directly. But my plan is to make the structure opaque. In that case, you need getters and setters for all fields you would like to access. No member would be accessible directly anymore.

> `PyThreadState_GetInterpreter()` can't replace `tstate->interp` for two
> reasons.
> 1. There is no way to stop third party C code accessing the internals of
> data structures. We can warn them not to, but that's all.
> 2. The internal layout of C structures has never been part of the API,
> with arguably two exceptions; the PyTypeObject struct and the
> `ob_refcnt` field of PyObject.

My long term plan is to make all structures opaque :-) So far, the PyInterpreterState structure was made opaque in Python 3.8. It helped the development of Python 3.8 and 3.9 *a lot*, especially for subinterpreters. And I made PyGC_Head opaque in Python 3.9.

Examples of issues to make structures opaque:

PyGC_Head: https://bugs.python.org/issue40241 (done in Python 3.9)
PyObject: https://bugs.python.org/issue39573
PyTypeObject: https://bugs.python.org/issue40170
PyThreadState: https://bugs.python.org/issue39947
PyInterpreterState: https://bugs.python.org/issue35886 (done in Python 3.8)

For the short term, my plan is to make structures opaque in the limited C API, before breaking more stuff in the public C API :-)

> PyObject_CallNoArgs() seems harmless.
> Rationalizing the call API has merit, but PyObject_CallNoArgs()
> leads to PyObject_CallOneArg(), PyObject_CallTwoArgs(), etc. and an even
> larger API.

PyObject_CallOneArg() also exists:
https://docs.python.org/dev/c-api/call.html#c.PyObject_CallOneArg

It was added as a private function in https://bugs.python.org/issue37483 and made public in commit 3f563cea567fbfed9db539ecbbacfee2d86f7735 "bpo-39245: Make Vectorcall C API public (GH-17893)".

But it's missing from What's New in Python 3.9.

There is no plan for two or more arguments.

> PyObject_GC_IsTracked(). I don't like this.
> Shouldn't GC track *all* objects?
> Even if it were named PyObject_Cycle_GC_IsTracked() it would be exposing
> internal implementation details for no good reason. A cycle GC that
> doesn't "track" individual objects, but treats all objects the same
> could be more efficient. In which case, what would this mean?
>
> What is the purpose of PyObject_GC_IsFinalized()?
> Third party objects can easily tell if they have been finalized.
> Why they would ever need this information is a mystery to me.

Did you read the issues which added these functions to see the rationale? https://bugs.python.org/issue40241

I like the "(Contributed by xxx in bpo-xxx.)" in What's New in Python 3.9: it makes it trivial to find such rationale.

Victor
[Python-Dev] Re: Can we stop adding to the C API, please?
On Wed, Jun 3, 2020 at 2:10 PM Victor Stinner wrote:
> For the short term, my plan is to make structure opaque in the limited
> C API, before breaking more stuff in the public C API :-)

But you're also breaking the public C API:

https://github.com/MagicStack/immutables/issues/46
https://github.com/pycurl/pycurl/pull/636

I'm not saying you're wrong to do so, I'm just confused about whether your plan is to break stuff or not and on which timescale.

-n

--
Nathaniel J. Smith -- https://vorpus.org
[Python-Dev] Re: Can we stop adding to the C API, please?
On Wed, Jun 3, 2020 at 2:13 PM Victor Stinner wrote:
> My long term plan is to make all structures opaque :-)
>
> For the short term, my plan is to make structure opaque in the limited
> C API, before breaking more stuff in the public C API :-)

Indeed, your plan and the work you've been doing and discussing with other core devs about this (including at multiple sprints and summits) over the past 4+ years is the right one. Our reliance on structs and related cpp macros, unfortunately exposed as public, is a burden that freezes reasonable options for evolving the CPython VM implementation. This work moves us away from that into a better place one step at a time without mass disruption.

More prior references related to this work are critical reading and should not be overlooked:

[2017 "Keeping Python Competitive"] https://lwn.net/Articles/723949/
[2018 "Lets change the C API" thread] https://mail.python.org/archives/list/[email protected]/thread/B67MYCAO4H4AJNMLSWVT3UVFTHSDGQRB/#B67MYCAO4H4AJNMLSWVT3UVFTHSDGQRB
[2019 "The C API"] https://pyfound.blogspot.com/2019/06/python-language-summit-lightning-talks-part-2.html
[2020-04 "PEP: Modify the C API to hide implementation details" thread - with a lot of links to much earlier 2017 and such references] https://mail.python.org/archives/list/[email protected]/thread/HKM774XKU7DPJNLUTYHUB5U6VR6EQMJF/#HKM774XKU7DPJNLUTYHUB5U6VR6EQMJF

and Victor's overall https://pythoncapi.readthedocs.io/roadmap.html as referenced in a few places in those.

It is also worth paying attention to the https://mail.python.org/archives/list/[email protected]/latest mailing list for anyone with a CPython C API interest.

-gps
