[Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-10 Thread Stefan Behnel
Hi,

coming back to PEP 489 [1], the multi-phase extension module
initialization. We originally designed it as an "all or nothing" feature,
but as it turns out, the "all" part is so difficult to achieve that most
potential users end up with "nothing". So, my question is: could we split
it up so that projects can get at least the main advantages: module spec
and unicode module naming.

PEP 489 is a great protocol in the sense that it allows extension modules
to set themselves up in the same way that Python modules do: load, create
module, execute module code. Without it, creating the module and executing
its code are a single step that is outside of the control of CPython, which
prevents the module from knowing its metadata and CPython from knowing
up-front what the module will actually be.
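
To illustrate the load / create / execute sequence described above, here is a minimal Python sketch using importlib. The names `DemoLoader` and `demo_mod` are made up for this example and are not part of PEP 489; the only point is that the spec exists before the module object is created and before any module code runs.

```python
import importlib.abc
import importlib.util


class DemoLoader(importlib.abc.Loader):
    """Hypothetical loader that records the order of the import steps."""

    def __init__(self):
        self.steps = []

    def create_module(self, spec):
        # The spec (name, origin, ...) is already known at this point.
        self.steps.append(("create", spec.name))
        return None  # None asks the import system for a default module object

    def exec_module(self, module):
        # Module code runs with __spec__ and __name__ already set.
        self.steps.append(("exec", module.__spec__.name))
        module.answer = 42


loader = DemoLoader()
spec = importlib.util.spec_from_loader("demo_mod", loader)  # build the spec
module = importlib.util.module_from_spec(spec)              # calls create_module
spec.loader.exec_module(module)                             # calls exec_module
```

With single-phase initialization, the "create" and "exec" steps of an extension module collapse into one call that the import system cannot observe separately.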

Now, the problem with PEP 489 is that it requires support for reloading and
subinterpreters at the same time [2]. For this, extension modules must
essentially be free of static global state, which comprises both the module
code itself and any external native libraries that it uses. That is
somewhere between difficult and impossible to achieve. PEP 573 [3] explains
some of the reasons, and lists solutions for some of the issues, but cannot
solve the general problem that some extension modules simply cannot get rid
of their global state, and are therefore inherently incompatible with
reloading and subinterpreters.

I would like the requirement in [2] to be lifted in PEP 489, to make the
main features of the PEP generally available to all extension modules.

The question is then how to opt out of the subinterpreter support. The PEP
explicitly does not allow backporting new init slot functions/features:

"Unknown slot IDs will cause the import to fail with SystemError."

But at least changing this in Py3.8 should be doable and would be really nice.
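
As a rough illustration of the rule quoted above, the following toy model rejects a slot table that contains an ID it does not know. The IDs 1 and 2 correspond to Py_mod_create and Py_mod_exec; the function `check_slots` and the ID 99 are hypothetical and this is not CPython's actual code.

```python
# Toy model only: these two IDs mirror the slots defined by PEP 489.
PY_MOD_CREATE = 1
PY_MOD_EXEC = 2
KNOWN_SLOTS = {PY_MOD_CREATE, PY_MOD_EXEC}


def check_slots(slots):
    """Validate a list of (slot_id, value) pairs; hypothetical helper."""
    for slot_id, _value in slots:
        if slot_id not in KNOWN_SLOTS:
            # An older interpreter cannot tell what an unknown slot asks for,
            # so it refuses the import instead of silently ignoring it.
            raise SystemError(f"module uses unknown slot ID {slot_id}")
    return True


check_slots([(PY_MOD_CREATE, None), (PY_MOD_EXEC, None)])  # accepted
# check_slots([(99, None)]) would raise SystemError
```

This is why a new opt-out slot could not simply be used by modules that also want to load on Python versions that predate it.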

What do you think?

Stefan



[1] https://www.python.org/dev/peps/pep-0489/
[2]
https://www.python.org/dev/peps/pep-0489/#subinterpreters-and-interpreter-reloading
[3] https://www.python.org/dev/peps/pep-0573/

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-10 Thread Petr Viktorin

On 08/10/18 11:21, Stefan Behnel wrote:

> Hi,
>
> coming back to PEP 489 [1], the multi-phase extension module
> initialization. We originally designed it as an "all or nothing" feature,
> but as it turns out, the "all" part is so difficult to achieve that most
> potential users end up with "nothing". So, my question is: could we split
> it up so that projects can get at least the main advantages: module spec
> and unicode module naming.
>
> PEP 489 is a great protocol in the sense that it allows extension modules
> to set themselves up in the same way that Python modules do: load, create
> module, execute module code. Without it, creating the module and executing
> its code are a single step that is outside of the control of CPython, which
> prevents the module from knowing its metadata and CPython from knowing
> up-front what the module will actually be.
>
> Now, the problem with PEP 489 is that it requires support for reloading and
> subinterpreters at the same time [2]. For this, extension modules must
> essentially be free of static global state, which comprises both the module
> code itself and any external native libraries that it uses. That is
> somewhere between difficult and impossible to achieve. PEP 573 [3] explains
> some of the reasons, and lists solutions for some of the issues, but cannot
> solve the general problem that some extension modules simply cannot get rid
> of their global state, and are therefore inherently incompatible with
> reloading and subinterpreters.


Are there any issues that aren't explained in PEP 573?
I don't think Python modules should be *inherently* incompatible with 
subinterpreters. Static global state is perhaps unavoidable in some 
cases, but IMO it should be managed when it's exposed to Python.
If there are issues not in the PEPs, I'd like to collect the concrete 
cases in some document.



> I would like the requirement in [2] to be lifted in PEP 489, to make the
> main features of the PEP generally available to all extension modules.
>
> The question is then how to opt out of the subinterpreter support. The PEP
> explicitly does not allow backporting new init slot functions/features:
>
> "Unknown slot IDs will cause the import to fail with SystemError."
>
> But at least changing this in Py3.8 should be doable and would be really nice.


I don't think we can just silently skip unknown slots -- that would mean 
modules wouldn't be getting features they asked for.
Do you have some more sophisticated model for slots in mind, or is this 
something to be designed?




> What do you think?
>
> Stefan
>
> [1] https://www.python.org/dev/peps/pep-0489/
> [2] https://www.python.org/dev/peps/pep-0489/#subinterpreters-and-interpreter-reloading
> [3] https://www.python.org/dev/peps/pep-0573/




Re: [Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-10 Thread Stefan Behnel
Petr Viktorin wrote on 10.08.2018 at 11:51:
> On 08/10/18 11:21, Stefan Behnel wrote:
>> coming back to PEP 489 [1], the multi-phase extension module
>> initialization. We originally designed it as an "all or nothing" feature,
>> but as it turns out, the "all" part is so difficult to achieve that most
>> potential users end up with "nothing". So, my question is: could we split
>> it up so that projects can get at least the main advantages: module spec
>> and unicode module naming.
>>
>> PEP 489 is a great protocol in the sense that it allows extension modules
>> to set themselves up in the same way that Python modules do: load, create
>> module, execute module code. Without it, creating the module and executing
>> its code are a single step that is outside of the control of CPython, which
>> prevents the module from knowing its metadata and CPython from knowing
>> up-front what the module will actually be.
>>
>> Now, the problem with PEP 489 is that it requires support for reloading and
>> subinterpreters at the same time [2]. For this, extension modules must
>> essentially be free of static global state, which comprises both the module
>> code itself and any external native libraries that it uses. That is
>> somewhere between difficult and impossible to achieve. PEP 573 [3] explains
>> some of the reasons, and lists solutions for some of the issues, but cannot
>> solve the general problem that some extension modules simply cannot get rid
>> of their global state, and are therefore inherently incompatible with
>> reloading and subinterpreters.
> 
> Are there any issues that aren't explained in PEP 573?
> I don't think Python modules should be *inherently* incompatible with
> subinterpreters. Static global state is perhaps unavoidable in some cases,
> but IMO it should be managed when it's exposed to Python.
> If there are issues not in the PEPs, I'd like to collect the concrete cases
> in some document.

There's always the case where an external native library simply isn't
re-entrant and/or requires configuration to be global. I know, there's
static linking and there are even ways to load an external shared library
multiple times, but that's just adding to the difficulties. Let's just
accept that some things are not easy enough to make for a good requirement.


>> I would like the requirement in [2] to be lifted in PEP 489, to make the
>> main features of the PEP generally available to all extension modules.
>>
>> The question is then how to opt out of the subinterpreter support. The PEP
>> explicitly does not allow backporting new init slot functions/features:
>>
>> "Unknown slot IDs will cause the import to fail with SystemError."
>>
>> But at least changing this in Py3.8 should be doable and would be really
>> nice.
> 
> I don't think we can just silently skip unknown slots -- that would mean
> modules wouldn't be getting features they asked for.
> Do you have some more sophisticated model for slots in mind, or is this
> something to be designed?

Sorry for not being clear here. I was asking for changing the assumptions
that PEP 489 makes about modules that claim to support the multi-step
initialisation part of the PEP. Adding a new (flag?) slot was just one idea
for opting out of multi-initialisation support.

Stefan



Re: [Python-Dev] A Subtle Bug in Class Initializations

2018-08-10 Thread Erik Bray
On Thu, Aug 9, 2018 at 7:21 PM Steve Dower wrote:
>
> On 09Aug2018 0818, Erik Bray wrote:
> > On Mon, Aug 6, 2018 at 8:11 PM Eddie Elizondo wrote:
> >> 3) Special case the initialization of PyType_Type and PyBaseObject_Type 
> >> within PyType_Ready to now make all calls to PyVarObject_HEAD_INIT use 
> >> NULL. To enable this a small change within PyType_Ready is needed to 
> >> initialize PyType_Type PyBaseObject:
> >
> > Coincidentally, I just wrote a long-ish blog post explaining in
> > technical details why PyVarObject_HEAD_INIT(&PyType_Type) pretty much
> > cannot work, at least for extension modules (it is not a problem in
> > the core library), on Windows (my post was focused on Cygwin but it is
> > a problem for Windows in general):
> > http://iguananaut.net/blog/programming/windows-data-import.html
> >
> > The TL;DR is that it's not possible on Windows to initialize a struct
> > with a pointer to some other data that's found in another DLL (i.e.
> > &PyType_Type), unless it happens to be a function, as a special case.
> > But &PyType_Type obviously is not, so things break.
>
> Great write-up! I think logically it should make sense that you cannot
> initialize a static value from a dynamically-linked library, but you've
> conclusively shown why that's the case. I'm not clear whether it's also
> the case on other OS's, but I don't see why it wouldn't be (unless they
> compile magic load-time resolution).

Thanks!  I'm not sure what you mean by "on other OS's" though.  Do you
mean other OS's that happen to use Windows-style PE/COFF binaries?
Because other than Windows I'm not sure what we care about there.

For ELF binaries, at least on Linux (and probably elsewhere), the
runtime loader can perform more sophisticated relocations when loading
a binary into memory, including relocating pointers in the binary's
.data section.  This allows it to initialize data in one executable
"A" with pointers to data in another library "B" *before* "A" is
considered fully loaded and executable.

So this problem never arises, at least on Linux.

> > So I'm +1 for requiring passing NULL to PyVarObject_HEAD_INIT,
> > requiring PyType_Ready with an explicit base type argument, and maybe
> > (eventually) making PyVarObject_HEAD_INIT argumentless.
>
> Since PyVarObject_HEAD_INIT currently requires PyType_Ready() in
> extension modules already, then don't we just need to fix the built-in
> types?
>
> As far as the "eventually" case, I'd hope that eventually extension
> modules are all using PyType_FromSpec() :)

+1 :)


Re: [Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-10 Thread Petr Viktorin

On 08/10/18 12:21, Stefan Behnel wrote:

> Petr Viktorin wrote on 10.08.2018 at 11:51:
>> On 08/10/18 11:21, Stefan Behnel wrote:
>>> coming back to PEP 489 [1], the multi-phase extension module
>>> initialization. We originally designed it as an "all or nothing" feature,
>>> but as it turns out, the "all" part is so difficult to achieve that most
>>> potential users end up with "nothing". So, my question is: could we split
>>> it up so that projects can get at least the main advantages: module spec
>>> and unicode module naming.
>>>
>>> PEP 489 is a great protocol in the sense that it allows extension modules
>>> to set themselves up in the same way that Python modules do: load, create
>>> module, execute module code. Without it, creating the module and executing
>>> its code are a single step that is outside of the control of CPython, which
>>> prevents the module from knowing its metadata and CPython from knowing
>>> up-front what the module will actually be.
>>>
>>> Now, the problem with PEP 489 is that it requires support for reloading and
>>> subinterpreters at the same time [2]. For this, extension modules must
>>> essentially be free of static global state, which comprises both the module
>>> code itself and any external native libraries that it uses. That is
>>> somewhere between difficult and impossible to achieve. PEP 573 [3] explains
>>> some of the reasons, and lists solutions for some of the issues, but cannot
>>> solve the general problem that some extension modules simply cannot get rid
>>> of their global state, and are therefore inherently incompatible with
>>> reloading and subinterpreters.


>> Are there any issues that aren't explained in PEP 573?
>> I don't think Python modules should be *inherently* incompatible with
>> subinterpreters. Static global state is perhaps unavoidable in some cases,
>> but IMO it should be managed when it's exposed to Python.
>> If there are issues not in the PEPs, I'd like to collect the concrete cases
>> in some document.


> There's always the case where an external native library simply isn't
> re-entrant and/or requires configuration to be global. I know, there's
> static linking and there are even ways to load an external shared library
> multiple times, but that's just adding to the difficulties. Let's just
> accept that some things are not easy enough to make for a good requirement.


For that case, I think the right thing to do is for the module to raise
an exception when it's being initialized for the second time, or when
the underlying library would be initialized for the second time.


"Avoid static global state" is a good rule of thumb for supporting 
subinterpreters nicely, but other strategies are possible.
If an underlying library just expects to be initialized once, and then 
work from several modules, the Python wrapper should ensure that (using 
global state, most likely). Other ways of handling things should be 
possible, depending on the underlying library.



>>> I would like the requirement in [2] to be lifted in PEP 489, to make the
>>> main features of the PEP generally available to all extension modules.
>>>
>>> The question is then how to opt out of the subinterpreter support. The PEP
>>> explicitly does not allow backporting new init slot functions/features:
>>>
>>> "Unknown slot IDs will cause the import to fail with SystemError."
>>>
>>> But at least changing this in Py3.8 should be doable and would be really
>>> nice.


>> I don't think we can just silently skip unknown slots -- that would mean
>> modules wouldn't be getting features they asked for.
>> Do you have some more sophisticated model for slots in mind, or is this
>> something to be designed?


> Sorry for not being clear here. I was asking for changing the assumptions
> that PEP 489 makes about modules that claim to support the multi-step
> initialisation part of the PEP. Adding a new (flag?) slot was just one idea
> for opting out of multi-initialisation support.


Would this be better than a flag + raising an error on init?
One big disadvantage of a big opt-out-of-everything button is that it 
doesn't encourage people to think about what the actual non-reentrant 
piece of code is.



[Python-Dev] Summary of Python tracker Issues

2018-08-10 Thread Python tracker


ACTIVITY SUMMARY (2018-08-03 - 2018-08-10)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    6756 ( -3)
  closed 39367 (+44)
  total  46123 (+41)

Open issues with patches: 2685 


Issues opened (26)
==================

#34333: Path.with_suffix() raises TypeError when doing %-formatting
https://bugs.python.org/issue34333  opened by berker.peksag

#34334: QueueHandler logs exc_info twice
https://bugs.python.org/issue34334  opened by avdd

#34338: abstractmethod can run on classes
https://bugs.python.org/issue34338  opened by Michael Hooreman

#34340: mimetypes libmagic compatibility
https://bugs.python.org/issue34340  opened by bpypy

#34341: Appending to ZIP archive blows up existing Central Directory e
https://bugs.python.org/issue34341  opened by serhiy.storchaka

#34344: Fix the docstring for AbstractEventLoopPolicy.get_event_loop
https://bugs.python.org/issue34344  opened by drtyrsa

#34345: Add tests for PEP 468 and PEP 520
https://bugs.python.org/issue34345  opened by serhiy.storchaka

#34346: dir() hangs interpreter
https://bugs.python.org/issue34346  opened by sfaleron

#34347: AIX: test_utf8_mode.test_cmd_line fails
https://bugs.python.org/issue34347  opened by Michael.Felt

#34349: asyncio.wait should accept generator of tasks as first argumen
https://bugs.python.org/issue34349  opened by jnwatson

#34354: Memory leak on _testCongestion
https://bugs.python.org/issue34354  opened by Vinicius Pacheco

#34355: SIGSEGV (Address boundary error)
https://bugs.python.org/issue34355  opened by ybon

#34356: Add support for args and kwargs in logging.conf
https://bugs.python.org/issue34356  opened by xavier.hardy

#34357: situation where urllib3 works, but urllib does not work
https://bugs.python.org/issue34357  opened by deivid

#34360: urllib.parse doesn't fully comply to RFC 3986
https://bugs.python.org/issue34360  opened by The Compiler

#34362: User-created types with wrong __new__ can be instantiated
https://bugs.python.org/issue34362  opened by ppperry

#34363: dataclasses.asdict() mishandles dataclass instance attributes 
https://bugs.python.org/issue34363  opened by alexdelorenzo

#34364: problem with traceback for syntax error in f-string
https://bugs.python.org/issue34364  opened by bgailer

#34365: datetime's documentation refers to "comparison [...] falling b
https://bugs.python.org/issue34365  opened by Kevin.Norris

#34366: _uuid module fails to compile on FreeBSD when libuuid is insta
https://bugs.python.org/issue34366  opened by mgorny

#34367: AsyncResult.get() only notifies one thread
https://bugs.python.org/issue34367  opened by AlexWithBeard

#34368: ftplib __init__ function can't handle 120 or 4xy reply when co
https://bugs.python.org/issue34368  opened by H-ZeX

#34369: kqueue.control() documentation and implementation mismatch
https://bugs.python.org/issue34369  opened by a.badger

#34370: Tkinter scroll issues on macOS
https://bugs.python.org/issue34370  opened by vtudorache

#34371: File reading gets stuck if you read at eof on macos
https://bugs.python.org/issue34371  opened by sverrirab

#34372: Compiler could output more accurate line numbers
https://bugs.python.org/issue34372  opened by Arusekk



Most recent 15 issues with no replies (15)
==========================================

#34372: Compiler could output more accurate line numbers
https://bugs.python.org/issue34372

#34370: Tkinter scroll issues on macOS
https://bugs.python.org/issue34370

#34368: ftplib __init__ function can't handle 120 or 4xy reply when co
https://bugs.python.org/issue34368

#34367: AsyncResult.get() only notifies one thread
https://bugs.python.org/issue34367

#34366: _uuid module fails to compile on FreeBSD when libuuid is insta
https://bugs.python.org/issue34366

#34365: datetime's documentation refers to "comparison [...] falling b
https://bugs.python.org/issue34365

#34357: situation where urllib3 works, but urllib does not work
https://bugs.python.org/issue34357

#34356: Add support for args and kwargs in logging.conf
https://bugs.python.org/issue34356

#34354: Memory leak on _testCongestion
https://bugs.python.org/issue34354

#34345: Add tests for PEP 468 and PEP 520
https://bugs.python.org/issue34345

#34344: Fix the docstring for AbstractEventLoopPolicy.get_event_loop
https://bugs.python.org/issue34344

#34341: Appending to ZIP archive blows up existing Central Directory e
https://bugs.python.org/issue34341

#34340: mimetypes libmagic compatibility
https://bugs.python.org/issue34340

#34334: QueueHandler logs exc_info twice
https://bugs.python.org/issue34334

#34333: Path.with_suffix() raises TypeError when doing %-formatting
https://bugs.python.org/issue34333



Most recent 15 issues waiting for review (15)
=============================================

#34366: _uuid module fails to compile on FreeBSD when libuuid is insta
https://bugs.python.org/issue34

Re: [Python-Dev] A Subtle Bug in Class Initializations

2018-08-10 Thread Steve Dower

On 10Aug2018 0354, Erik Bray wrote:

> Thanks!  I'm not sure what you mean by "on other OS's" though.  Do you
> mean other OS's that happen to use Windows-style PE/COFF binaries?
> Because other than Windows I'm not sure what we care about there.
>
> For ELF binaries, at least on Linux (and probably elsewhere), the
> runtime loader can perform more sophisticated relocations when loading
> a binary into memory, including relocating pointers in the binary's
> .data section.  This allows it to initialize data in one executable
> "A" with pointers to data in another library "B" *before* "A" is
> considered fully loaded and executable.
>
> So this problem never arises, at least on Linux.


That's exactly what I meant. I simply didn't know how/whether other 
loaders handled this case :) I recognise it's nothing to do with the 
binary format and everything to do with whether the loader knows what to 
do or not.



>>> So I'm +1 for requiring passing NULL to PyVarObject_HEAD_INIT,
>>> requiring PyType_Ready with an explicit base type argument, and maybe
>>> (eventually) making PyVarObject_HEAD_INIT argumentless.
>>
>> Since PyVarObject_HEAD_INIT currently requires PyType_Ready() in
>> extension modules already, then don't we just need to fix the built-in
>> types?
>>
>> As far as the "eventually" case, I'd hope that eventually extension
>> modules are all using PyType_FromSpec() :)
>
> +1 :)


Is that just a +1 for PyType_FromSpec(), or are you agreeing that we 
only need to fix the built-in types?


Cheers,
Steve


Re: [Python-Dev] Let's change to C API!

2018-08-10 Thread Armin Rigo
Hi,

On 31 July 2018 at 13:55, Antoine Pitrou wrote:
> It's just that I disagree that removing the C API will make CPython 2x
> faster.
>
> Actually, important modern optimizations for dynamic languages (such as
> inlining, type specialization, inline caches, object unboxing) don't
> seem to depend on the C API at all.

These are optimizations typically talked about in papers about dynamic
languages in general.  In my opinion, in the specific case of CPython,
they are all secondary to the following: (1) JIT, (2) GC, (3) object
model, (4) multithreading.

Currently, the C API only allows Psyco-style JITting (much slower than
PyPy).  All three other points might not be possible at all without a
seriously modified C API.  Why?  I have no proof, but only
circumstantial evidence.  Each of (2), (3), (4) has been done in at
least one other implementation: PyPy, Jython and IronPython.  Each of
these implementation has also got its share of troubles with emulating
the CPython C API.  You can continue to think that the C API has got
nothing to do with it.  I tend to think the opposite.  The continued
absence of major performance improvements for either CPython itself or
for any alternative Python implementation that *does* support the C
API natively is probably proof enough---I think that enough time has
passed, by now, to make this argument.


A bientôt,

Armin.