Re: [Python-Dev] General concerns about C API changes

2018-11-23 Thread Victor Stinner
On Sun, 18 Nov 2018 at 17:54, Stefan Behnel wrote:
> It's also slower to compile, given that function inlining happens at a much
> later point in the compiler pipeline than macro expansion. The C compiler
> won't even get to see macros in fact, whereas whether to inline a function
> or not is a dedicated decision during the optimisation phase based on
> metrics collected in earlier stages. For something as ubiquitous as
> Py_INCREF/Py_DECREF, it might even be visible in the compilation times.

I ran a benchmark: there is no significant slowdown (+4 seconds, 6%
slower, in the worst case).
https://bugs.python.org/issue35059#msg330316


> Now imagine that you have an inline function that executes several
> Py_INCREF/Py_DECREF call cycles, and the C compiler happens to slightly
> overestimate the weights of these two. Then it might end up deciding
> against inlining the function now, whereas it previously might have decided
> for it since it was able to see the exact source code expanded from the
> macros. I think that's what Raymond meant with his concerns regarding
> changing macros into inline functions. C compilers might be smart enough to
> always inline CPython's new inline functions themselves, but the style
> change can still have unexpected transitive impacts on code that uses them.

I ran the performance benchmark suite to compare C macros to static
inline functions: there is no significant impact on performance.
https://bugs.python.org/issue35059#msg330302


> I agree with Raymond that as long as there is no clear gain in this code
> churn, we should not underestimate the risk of degrading code on user side.

I don't understand what you mean by "degrading code on user
side". If you are talking about performance, again, my changes have no
significant impact on performance (neither on compilation time nor on
runtime performance).


> "there is no clear gain in this code churn"

There are multiple advantages:

* Better development and debugging experience: tools understand
inlined functions much better than C macros: gdb, Linux perf, etc.

* Better API: arguments now have a type and the function has a return
type. In practice, some macros still cast their argument to PyObject*
to not introduce new compiler warnings in Python 3.8. For example,
even if Py_INCREF() is documented (*) as a function expecting
PyObject*, it accepts any pointer type (PyTupleObject*,
PyUnicodeObject*, etc.). Technically, it also accepts PyObject** which
is a bug, but that's a different story ;-)

* Much better code, just plain regular C. C macros are ugly: the "do { ...
} while (0)" workaround, additional parentheses around each argument,
the strange "expr1, expr2" syntax of a "macro expression" which returns a
value (an inline function just uses a regular "return" and ";" at the end
of statements), strange indentation, etc.

* No more "macro pitfalls":
https://gcc.gnu.org/onlinedocs/cpp/Macro-Pitfalls.html

* Local variables no longer need a magic name to avoid the risk of a name
conflict, and have a clearly defined scope. Py_DECREF() and
_Py_XINCREF() no longer need a local variable since their argument
already has a clearly defined type: PyObject*. I introduced a new
variable in _Py_Dealloc() to fix a possible race condition.
Previously, the variable was probably avoided because it's tricky to use
variables in macros.

* #ifdef can now be used inside the inline function: it makes the code
easier to understand.

* etc.
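
To illustrate the points about typed arguments and "magic name" local
variables, here is a minimal sketch. It does not use CPython's real
definitions: Obj, OBJ_XINCREF_MACRO, obj_xincref and next_obj are
hypothetical names invented for this example. A naive macro without a
temporary variable evaluates its argument twice, whereas the static inline
function evaluates it exactly once and only accepts an Obj*.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyObject; only the refcount matters here. */
typedef struct { long refcnt; } Obj;

/* Macro style, written naively without a temporary variable:
 * the argument text is pasted at every use of `op`. */
#define OBJ_XINCREF_MACRO(op) \
    do { if ((op) != NULL) (op)->refcnt++; } while (0)

/* Inline-function style: the argument is evaluated once and has a type. */
static inline void obj_xincref(Obj *op)
{
    if (op != NULL) {
        op->refcnt++;
    }
}

/* Counts its own calls so that double evaluation becomes observable. */
static int calls = 0;

static Obj *next_obj(Obj *o)
{
    calls++;
    return o;
}

/* Returns how many times the argument expression ran with the macro. */
static int macro_evaluations(Obj *o)
{
    assert(o != NULL);
    calls = 0;
    OBJ_XINCREF_MACRO(next_obj(o));   /* next_obj(o) is expanded twice */
    return calls;
}

/* Returns how many times the argument expression ran with the function. */
static int inline_evaluations(Obj *o)
{
    assert(o != NULL);
    calls = 0;
    obj_xincref(next_obj(o));         /* next_obj(o) runs once */
    return calls;
}
```

Avoiding the double evaluation in the macro version requires a temporary
variable with a name that is unlikely to clash with the caller's code,
which is the "magic name" problem described above.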


Are you aware that Python had macros like:

#define _Py_REF_DEBUG_COMMA ,
#define _Py_CHECK_REFCNT(OP) /* a semicolon */;

I'll let you judge the quality of this macro:

#define _Py_NewReference(op) (  \
_Py_INC_TPALLOCS(op) _Py_COUNT_ALLOCS_COMMA \
_Py_INC_REFTOTAL  _Py_REF_DEBUG_COMMA   \
Py_REFCNT(op) = 1)

Is it an expression? Can it be used in "if (test)
_Py_NewReference(op);"? It doesn't use the "do { ... } while (0)"
protection against macro pitfalls.
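
For readers who have not met the "do { ... } while (0)" idiom, here is a
generic sketch of the pitfall it protects against. This is not the CPython
macro quoted above: Obj, NEW_REF_UNSAFE, NEW_REF_SAFE and new_ref are
hypothetical names. A macro that expands to two statements is only
partially guarded by an "if" without braces; some compilers warn about
this pattern, but the code still compiles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object header. */
typedef struct { long refcnt; long allocs; } Obj;

/* Unprotected: expands to two statements; an unbraced `if` guards
 * only the first one. */
#define NEW_REF_UNSAFE(op)  (op)->allocs++; (op)->refcnt = 1

/* Protected: expands to exactly one statement. */
#define NEW_REF_SAFE(op) \
    do { (op)->allocs++; (op)->refcnt = 1; } while (0)

/* Inline function: no idiom needed, it is just a call. */
static inline void new_ref(Obj *op)
{
    op->allocs++;
    op->refcnt = 1;
}

static void init_unsafe(Obj *op, bool test)
{
    if (test)
        NEW_REF_UNSAFE(op);   /* refcnt = 1 runs even when test is false */
}

static void init_safe(Obj *op, bool test)
{
    if (test)
        NEW_REF_SAFE(op);     /* nothing runs when test is false */
}

static void init_inline(Obj *op, bool test)
{
    if (test)
        new_ref(op);          /* nothing runs when test is false */
}
```

With test set to false, init_unsafe() still resets refcnt to 1, while
init_safe() and init_inline() leave the object untouched.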

(*) Py_INCREF doc:
https://docs.python.org/dev/c-api/refcounting.html#c.Py_INCREF

Victor
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] C API changes

2018-11-23 Thread Armin Rigo
Hi Hugo, hi all,

On Sun, 18 Nov 2018 at 22:53, Hugh Fisher wrote:
> I suggest that for the language reference, use the license plate
> or registration analogy to introduce "handle" and after that use
> handle throughout. It's short, distinctive, and either will match
> up with what the programmer already knows or won't clash if
> or when they encounter handles elsewhere.

FWIW, a "handle" is typically something that users of an API store and
pass around, and which can be used to do all operations on some
object.  It is whatever a specific implementation needs to describe
references to an object.  In the CPython C API, this is ``PyObject*``.
I think that using "handle" for something more abstract is just going
to create confusion.

Also FWIW, my own 2 cents on the topic of changing the C API: let's
entirely drop ``PyObject *`` and instead use more opaque
handles---like a ``PyHandle`` that is defined as a pointer-sized C
type but is not actually directly a pointer.  The main difference this
would make is that the user of the API cannot dereference anything
from the opaque handle, nor directly compare handles with each other
to learn about object identity.  They would work exactly like Windows
handles or POSIX file descriptors.  These handles would be returned by
C API calls, and would need to be closed when no longer used.  Several
different handles may refer to the same object, which stays alive for
at least as long as there are open handles to it.  Doing it this way
would untangle the notion of objects from their actual implementation.
In CPython objects would internally use reference counting, a handle
is really just a PyObject pointer in disguise, and closing a handle
decreases the reference counter.  In PyPy we'd have a global table of
"open objects", and a handle would be an index in that table; closing
a handle means writing NULL into that table entry.  No emulated
reference counting needed: we simply use the existing GC to keep alive
objects that are referenced from one or more table entries.  The cost
is limited to a single indirection.
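
A rough sketch of the table-based variant described above for PyPy, under
loud assumptions: only the name PyHandle comes from this message, while
handle_open, handle_close, handle_deref, handle_is, HANDLE_INVALID and
TABLE_SIZE are hypothetical, and plain "void *" stands in for whatever
object reference the implementation uses. In a real implementation the
table entries would be roots for the existing GC; here they are ordinary
pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t PyHandle;              /* pointer-sized, but not a pointer */
#define HANDLE_INVALID ((PyHandle)-1)   /* hypothetical error value */
#define TABLE_SIZE 64                   /* hypothetical fixed capacity */

/* Global table of "open objects"; NULL marks a free slot. */
static void *open_objects[TABLE_SIZE];

/* Open a new handle to obj. Each call returns a distinct handle,
 * even when the object already has open handles. */
static PyHandle handle_open(void *obj)
{
    if (obj == NULL) {
        return HANDLE_INVALID;
    }
    for (PyHandle h = 0; h < TABLE_SIZE; h++) {
        if (open_objects[h] == NULL) {
            open_objects[h] = obj;
            return h;
        }
    }
    return HANDLE_INVALID;              /* table full */
}

/* Implementation-side lookup: the single indirection mentioned above.
 * Users of the API would never see the returned pointer. */
static void *handle_deref(PyHandle h)
{
    assert(h >= 0 && h < TABLE_SIZE);
    return open_objects[h];
}

/* Closing a handle means writing NULL into its table entry. */
static void handle_close(PyHandle h)
{
    assert(h >= 0 && h < TABLE_SIZE && open_objects[h] != NULL);
    open_objects[h] = NULL;
}

/* Identity test: comparing the handles themselves would be wrong,
 * since several handles may refer to the same object. */
static int handle_is(PyHandle a, PyHandle b)
{
    return handle_deref(a) == handle_deref(b);
}
```

The linear scan for a free slot keeps the sketch short; a free list would
presumably be used in practice. On CPython the same four functions could
instead cast the handle to and from a PyObject pointer and adjust the
reference count on open and close.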

The C API would change a lot, so it's not reasonable to do that in the
CPython repo.  But it could be a third-party project, attempting to
define an API like this and implement it well on top of both CPython
and PyPy.  IMHO this might be a better idea than just changing the API
of functions defined long ago to make them more regular (e.g. stop
returning borrowed references); by now this would mostly mean creating
more work for the PyPy team to track and adapt to the changes, with no
real benefits.


See you soon,

Armin.


[Python-Dev] Summary of Python tracker Issues

2018-11-23 Thread Python tracker

ACTIVITY SUMMARY (2018-11-16 - 2018-11-23)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    6865 ( +2)
  closed 40187 (+33)
  total  47052 (+35)

Open issues with patches: 2735 


Issues opened (27)
==================

#35148: cannot activate a venv environment on a Swiss German windows
https://bugs.python.org/issue35148  reopened by vinay.sajip

#35267: reproducible deadlock with multiprocessing.Pool
https://bugs.python.org/issue35267  opened by dzhu

#35268: Windows asyncio reading continously stdin and stdout Stockfish
https://bugs.python.org/issue35268  opened by Cezary.Wagner

#35270: Cmd.complete does not handle cmd=None
https://bugs.python.org/issue35270  opened by blueyed

#35271: venv creates pyvenv.cfg with wrong home
https://bugs.python.org/issue35271  opened by wvxvw

#35272: sqlite3 get the connected database url
https://bugs.python.org/issue35272  opened by midnio

#35276: Document thread safety
https://bugs.python.org/issue35276  opened by vstinner

#35277: Upgrade bundled pip/setuptools
https://bugs.python.org/issue35277  opened by dstufft

#35278: [security] directory traversal in tempfile prefix
https://bugs.python.org/issue35278  opened by Yusuke Endoh

#35279: asyncio uses too many threads by default
https://bugs.python.org/issue35279  opened by Vojtěch Boček

#35280: Interactive shell overwrites history
https://bugs.python.org/issue35280  opened by dingens

#35281: Allow access to unittest.TestSuite tests
https://bugs.python.org/issue35281  opened by lbenezriravin

#35282: Add a return value to lib2to3.refactor.refactor_file and refac
https://bugs.python.org/issue35282  opened by martindemello

#35283: "threading._DummyThread" redefines "is_alive" but forgets "isA
https://bugs.python.org/issue35283  opened by dmaurer

#35284: Incomplete error handling in Python/compile.c:compiler_call()
https://bugs.python.org/issue35284  opened by ZackerySpytz

#35285: Make Proactor api extensible for reasonably any file handle
https://bugs.python.org/issue35285  opened by Ignas Brašiškis

#35286: wrong result for difflib.SequenceMatcher
https://bugs.python.org/issue35286  opened by Boris Yang

#35291: duplicate of memoryview from io.BufferedWriter leaks
https://bugs.python.org/issue35291  opened by jmadden

#35292: Make SimpleHTTPRequestHandler load mimetypes lazily
https://bugs.python.org/issue35292  opened by steve.dower

#35293: make doctest (Sphinx) emits a lot of warnings
https://bugs.python.org/issue35293  opened by vstinner

#35294: Race condition involving SocketServer.TCPServer
https://bugs.python.org/issue35294  opened by Ruslan Dautkhanov

#35295: Please clarify whether PyUnicode_AsUTF8AndSize() or PyUnicode_
https://bugs.python.org/issue35295  opened by Marcin Kowalczyk

#35297: untokenize documentation is not correct
https://bugs.python.org/issue35297  opened by csernazs

#35298: Segfault in _PyObject_GenericGetAttrWithDict
https://bugs.python.org/issue35298  opened by gilado

#35299: LGHT0091: Duplicate symbol 'File:include_pyconfig.h' found
https://bugs.python.org/issue35299  opened by neyuru

#35300: Bug with memoization and mutable objects
https://bugs.python.org/issue35300  opened by bolorsociedad

#35301: python.exe crashes - lzma?
https://bugs.python.org/issue35301  opened by jonathan-lp



Most recent 15 issues with no replies (15)
==========================================

#35301: python.exe crashes - lzma?
https://bugs.python.org/issue35301

#35299: LGHT0091: Duplicate symbol 'File:include_pyconfig.h' found
https://bugs.python.org/issue35299

#35298: Segfault in _PyObject_GenericGetAttrWithDict
https://bugs.python.org/issue35298

#35297: untokenize documentation is not correct
https://bugs.python.org/issue35297

#35295: Please clarify whether PyUnicode_AsUTF8AndSize() or PyUnicode_
https://bugs.python.org/issue35295

#35294: Race condition involving SocketServer.TCPServer
https://bugs.python.org/issue35294

#35291: duplicate of memoryview from io.BufferedWriter leaks
https://bugs.python.org/issue35291

#35285: Make Proactor api extensible for reasonably any file handle
https://bugs.python.org/issue35285

#35284: Incomplete error handling in Python/compile.c:compiler_call()
https://bugs.python.org/issue35284

#35282: Add a return value to lib2to3.refactor.refactor_file and refac
https://bugs.python.org/issue35282

#35280: Interactive shell overwrites history
https://bugs.python.org/issue35280

#35279: asyncio uses too many threads by default
https://bugs.python.org/issue35279

#35270: Cmd.complete does not handle cmd=None
https://bugs.python.org/issue35270

#35264: SSL Module build fails with OpenSSL 1.1.0 for Python 2.7
https://bugs.python.org/issue35264

#35263: Add None handling for get_saved() in IDLE
https://bugs.python.org/issue35263



Most recent 15 issues waiting for review (15)
=============================================

#35

[Python-Dev] C API changes

2018-11-23 Thread Stefan Krah


Armin Rigo wrote:
> The C API would change a lot, so it's not reasonable to do that in the
> CPython repo.  But it could be a third-party project, attempting to
> define an API like this and implement it well on top of both CPython
> and PyPy.  IMHO this might be a better idea than just changing the API
> of functions defined long ago to make them more regular (e.g. stop
> returning borrowed references); by now this would mostly mean creating
> more work for the PyPy team to track and adapt to the changes, with no
> real benefits.

I like this idea.  For example, when writing two versions of a C module,
one that uses CPython internals indiscriminately and another that uses
a "clean" API, such a third-party project would help.

I'd also be more motivated to write two versions if I know that the
project is supported by PyPy devs.


Do you think that such an API might be faster than CFFI on PyPy?


Stefan Krah


