[Python-Dev] Re: REPL output bug

2020-06-16 Thread Xavier Morel
> On 16 Jun 2020, at 08:51, Greg Ewing  wrote:
> 
> On 16/06/20 12:20 pm, Steven D'Aprano wrote:
>> The whole point of the REPL is to evaluate an
>> expression and have the result printed. (That's the P in REPL :-)
> 
> Still, it's a bit surprising that it prints results of
> expressions within a compound statement, not just at the
> top level.

For what it’s worth, 2.7 seems to have the same behaviour: every statement 
with a non-None result gets echoed even if it is not the top-level statement, 
e.g. 

Python 2.7.17 (default, Apr 15 2020, 17:20:14)
[GCC 7.5.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> for i in range(3): i
...
0
1
2
>>>
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/HSU7EIKRX37SZ4TZPG6N52YKVLETZRN3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: When can we remove wchar_t* cache from string?

2020-06-16 Thread Inada Naoki
On Tue, Jun 16, 2020 at 12:35 AM Victor Stinner  wrote:
>
> Hi INADA-san,
>
> IMO Python 3.11 is too early because we don't yet emit a
> DeprecationWarning for every single deprecated function.
>
> 1) Emit a DeprecationWarning at runtime (ex: Python 3.10)
> 2) Wait two Python releases: see
> https://discuss.python.org/t/pep-387-backwards-compatibilty-policy/4421
> 3) Remove the deprecated feature (ex: Python 3.12)
>

Hmm, is there any chance to add a DeprecationWarning in Python 3.9?

* They have been deprecated in the documentation since Python 3.3 (2012).
* As far as I can tell from grepping PyPI sdist sources, two years may be
  enough to remove them.
* We can postpone the schedule anyway.

> I don't understand if *all* deprecated functions are causing
> implementation issues, or only a few of them?

Of course not all of them.  I meant only the APIs using PyASCIIObject.wstr.
As far as I know, they are:

* PyUnicode_AS_DATA
* PyUnicode_AS_UNICODE
* PyUnicode_AsUnicode
* PyUnicode_AsUnicodeAndSize
* PyUnicode_FromUnicode(NULL, size)
* PyUnicode_FromStringAndSize(NULL, size)
* PyUnicode_GetSize
* PyUnicode_GET_SIZE
* PyUnicode_GET_DATA_SIZE
* PyUnicode_WSTR_LENGTH
* PyArg_ParseTuple and PyArg_ParseTupleAndKeywords with format 'u' or 'Z'.

>
> PyUnicode_AS_UNICODE() initializes PyASCIIObject.wstr if needed, and
> then return PyASCIIObject.wstr. I don't think that PyASCIIObject.wstr
> can be called "a cache": there are functions relying on this member.
>

OK, I will call it wstr, instead of wchar_t* cache.

> On the other hand, PyUnicode_FromUnicode(str, size) is basically a
> wrapper to PyUnicode_FromWideChar(): it doesn't harm to keep this
> wrapper to ease migration. Only PyUnicode_FromUnicode(NULL, size) is
> causing troubles, right?

You're right.

>
> Is there a list of deprecated functions and is it possible to group
> them in two categories: must be removed and "can be kept for a few
> more releases"?
>
> If the intent is to reduce Python memory footprint, PyASCIIObject.wstr
> can be moved out of PyASCIIObject structure, maybe we can imagine a
> WeakDict. It would map a Python str object to its wstr member (wchar_*
> string). If the Python str object is removed, we can release the wstr
> string. The technical problem is that it is not possible to create a
> weak reference to a Python str. We may insert code in
> unicode_dealloc() to delete manually the wstr in this case. Maybe a
> _Py_hashtable_t of pycore_hashtable.h could be used for that.
>
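(A tiny aside illustrating the weak-reference limitation mentioned above; this
only shows the documented behaviour of the weakref module, not a sketch of the
proposed change itself.)

import weakref

try:
    weakref.ref("some string")   # str instances do not support weak references
except TypeError as exc:
    print(exc)                   # "cannot create weak reference to 'str' object"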

It is an interesting idea, but I think it is too complex.
Fixing all the packages on PyPI would be a better approach.

> Since this discussion is on-going for something like 5 years in
> multiple bugs.python.org issues and email threads, maybe it would help
> to have a short PEP describing issues of the deprecated functions,
> explain the plan to migrate to the new functions, and give a schedule
> of the incompatible changes. INADA-san: would you be a candidate to
> write such PEP?
>

OK, I will try to write it.

-- 
Inada Naoki  
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/XLAXDWG6BQ4GXQKOUCOCSCVSGTAA4GX3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: REPL output bug

2020-06-16 Thread Ivan Pozdeev via Python-Dev


On 16.06.2020 1:40, Joseph Jenne via Python-Dev wrote:

On 2020-06-15 15:26, Ivan Pozdeev via Python-Dev wrote:


On 12.06.2020 11:01, Rob Cliffe via Python-Dev wrote:

If I run the following program (using Python 3.8.3 on a Windows 10 laptop):

import sys, time
for i in range(1,11):
    sys.stdout.write('\r%d' % i)
    time.sleep(1)

As intended, it displays '1', replacing it at 1-second intervals with '2', '3' 
... '10'.

Now run the same code inside the REPL:

Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:20:19) [MSC v.1925 32 bit 
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, time
>>> for i in range(1,11):
... sys.stdout.write('\r%d' % i)
... time.sleep(1)
...
12
22
32
42
52
62
72
82
92
103
>>>

It appears that the requested characters are output, *followed by* the number 
of characters output
(which is the value returned by sys.stdout.write) and a newline.
Surely this is not the intended behaviour.
sys.stderr behaves the same as sys.stdout.
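(A minimal workaround sketch, assuming the stray digits are the REPL echoing
the int returned by sys.stdout.write: bind the return value to a name, since
assignment statements are never echoed.)

import sys, time

for i in range(1, 11):
    _ = sys.stdout.write('\r%d' % i)   # assignment suppresses the REPL echo
    sys.stdout.flush()                 # make sure the overwrite is visible
    time.sleep(1)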



3.7.4 win64 works as expected (i.e. prints and overwrites only the numbers) and I see nothing relevant in 
https://docs.python.org/3/whatsnew/3.8.html


So I'd say this is a bug.



Python 2 does NOT exhibit this behaviour:

Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 20 2020, 13:19:08) [MSC v.1500 32 bit 
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, time
>>> for i in range(1,11):
... sys.stdout.write('\r%d' % i)
... time.sleep(1)
...
10>>>
# displays '1', '2' ... '10' as intended.

Best wishes
Rob Cliffe
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/T3BVZMBLWK3PMLRL2XCQFMLATGRTUPYE/
Code of Conduct: http://python.org/psf/codeofconduct/
--
Regards,
Ivan

___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/6BS3GXCS63PTXZMOMZSC6M5DHNAJO5FX/
Code of Conduct: http://python.org/psf/codeofconduct/


I just tested with 3.7.3 and got the same results as originally described. Is 
there something different about 3.7.4 on win7?



"Something different" is that I ran this in IPython :\


Python 3.7.3 (default, Dec 20 2019, 18:57:59)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> import time
>>> for i in range(1,11):
... sys.stdout.write('\r%d' % i)
... time.sleep(1)
...
12
22
32
42
52
62
72
82
92
103
(I also tested with 3.8.3 with the same results)

Joseph J
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/TOSIE4NBT3LVO2ZNUM7FDLTWIIYNJWSP/
Code of Conduct: http://python.org/psf/codeofconduct/
--
Regards,
Ivan

___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/YJXGEYC25GVGN4PA23DW2L4423KGBIMM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: When can we remove wchar_t* cache from string?

2020-06-16 Thread Antoine Pitrou
On Mon, 15 Jun 2020 11:22:09 +0100
Mark Shannon  wrote:
> 
> I don't like this approach.
> Adding compile time options means we need to test more versions, but is 
> no help to end users as they will end up with the release version anyway.

I agree with Mark.  This sounds like pointless complication and undue
maintenance overhead.

I would like to propose the opposite approach.  Simply remove those
functions and the wchar_t cache now.  They have been deprecated since
3.3.  Yes, there's going to be a bit of pain for a couple downstream
projects (mostly Cython), but it should be minor anyway.

Regards

Antoine.

___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/UIUNG7N6MSNTL32SUHJ3VMC4HSOWJEGD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should we be making so many changes in pursuit of PEP 554?

2020-06-16 Thread Nick Coghlan
Multiprocessing serialisation overheads are abysmal. With enough OS support
you can attempt to mitigate that via shared memory mechanisms (which Davin
added to the standard library), but it's impossible to get the overhead of
doing that as low as actually using the address space of one OS process.

For the rest of the email... multiprocessing isn't going anywhere.

Within-process parallelism is just aiming to provide another trade-off
point in the design space for CPU-bound workloads (one roughly comparable to
the point where JS web workers sit).

Cheers,
Nick.

On Sat., 6 Jun. 2020, 12:39 am Mark Shannon,  wrote:

> Hi,
>
> There have been a lot of changes both to the C API and to internal
> implementations to allow multiple interpreters in a single O/S process.
>
> These changes cause backwards compatibility changes, have a negative
> performance impact, and cause a lot of churn.
>
> While I'm in favour of PEP 554, or some similar model for parallelism in
> Python, I am opposed to the changes we are currently making to support it.
>
>
> What are sub-interpreters?
> --
>
> A sub-interpreter is a logically independent Python process which
> supports inter-interpreter communication built on shared memory and
> channels. Passing of Python objects is supported, but only by copying,
> not by reference. Data can be shared via buffers.
>
>
> How can they be implemented to support parallelism?
> ---
>
> There are two obvious options.
> a) Many sub-interpreters in a single O/S process. I will call this the
> many-to-one model (many interpreters in one O/S process).
> b) One sub-interpreter per O/S process. This is what we currently have
> for multiprocessing. I will call this the one-to-one model (one
> interpreter in one O/S process).
>
> There seems to be an assumption amongst those working on PEP 554 that
> the many-to-one model is the only way to support sub-interpreters that
> can execute in parallel.
> This isn't true. The one-to-one model has many advantages.
>
>
> Advantages of the one-to-one model
> --
>
> 1. It's less bug prone. It is much easier to reason about code working
> in a single address space. Most code assumes
>
> 2. It's more secure. Separate O/S processes provide a much stronger
> boundary between interpreters. This is why some browsers use separate
> processes for browser tabs.
>
> 3. It can be implemented on top of the multiprocessing module, for
> testing. A more efficient implementation can be developed once
> sub-interpreters prove useful.
>
> 4. The required changes should have no negative performance impact.
>
> 5. Third party modules should continue to work as they do now.
>
> 6. It takes much less work :)
>
>
> Performance
> ---
>
> Creating O/S processes is usually considered to be slow. Whilst
> processes are undoubtedly slower to create than threads, the absolute
> time to create a process is small; well under 1ms on linux.
>
> Creating a new sub-interpreter typically requires importing quite a few
> modules before any useful work can be done.
> The time spent doing these imports will dominate the time to create an
> O/S process or thread.
>
> If sub-interpreters are to be used for parallelism, there is no need to
> have many more sub-interpreters than CPU cores, so the overhead should
> be small. For additional concurrency, threads or coroutines can be used.
>
> The one-to-one model is faster as it uses the hardware for interpreter
> separation, whereas the many-to-one model must use software.
> Process separation by the hardware virtual memory system has zero cost.
> Separation done in software needs extra memory reads when doing
> allocation or deallocation.
>
> Overall, for any interpreter that runs for a second or more, it is
> likely that the one-to-one model would be faster.
>
>
> Timings of multiprocessing & threads on my machine (6-core 2019 laptop)
> ---
>
> # Threads
>
> def foo():
>     pass
>
> def spawn_and_join(count):
>     threads = [Thread(target=foo, args=()) for _ in range(count)]
>     for t in threads:
>         t.start()
>     for t in threads:
>         t.join()
>
> spawn_and_join(1000)
>
> # Processes
>
> def spawn_and_join(count):
>     processes = [Process(target=foo, args=()) for _ in range(count)]
>     for p in processes:
>         p.start()
>     for p in processes:
>         p.join()
>
> spawn_and_join(1000)
>
> Wall clock time for threads:
> 86ms. Less than 0.1ms per thread.
>
> Wall clock time for processes:
> 370ms. Less than 0.4ms per process.
>
> Processes are slower, but plenty fast enough.
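(For reference, a self-contained sketch of how such a comparison can be
reproduced outside the quoted snippets above; the 1000-worker count mirrors
them, and the absolute numbers will of course vary by machine.)

import time
from threading import Thread
from multiprocessing import Process

def foo():
    pass

def spawn_and_join(worker_cls, count):
    # Create, start and join `count` workers of the given class.
    workers = [worker_cls(target=foo, args=()) for _ in range(count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

if __name__ == '__main__':
    for worker_cls in (Thread, Process):
        start = time.perf_counter()
        spawn_and_join(worker_cls, 1000)
        print(worker_cls.__name__, time.perf_counter() - start, 'seconds')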
>
>
> Cheers,
> Mark.
>
>
>
>
> ___
> Python-Dev mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message arc

[Python-Dev] Re: When can we remove wchar_t* cache from string?

2020-06-16 Thread Victor Stinner
On Tue, Jun 16, 2020 at 10:42 AM Inada Naoki wrote:
> Hmm,  Is there any chance to add DeprecationWarning in Python 3.9?

In my experience, more and more projects are running their test suite
with -Werror, which is a good thing. Introducing a new warning is
likely to "break" many of these projects. For example, in Fedora, we
run the test suite when we build a package. If a test fails, the
package build fails and we have to decide to either ignore the failing
tests (not good) or find a solution to repair the tests (update the
code base to new C API functions).


> It is an interesting idea, but I think it is too complex.
> Fixing all packages in the PyPI would be a better approach.

It's not the first time that we have had to make such a decision. "Fixing
all PyPI packages" is not possible. Python core developers' time is limited,
so we can only port a very small number of packages. Breaking packages on
purpose forces developers to upgrade their code base; it should work better
than deprecation warnings. But it is likely to make some people unhappy.

Having a separate hash table would avoid breaking many PyPI packages by
continuing to provide the backward compatibility. We can even consider
disabling it by default, but providing a temporary option to opt in to
backward compatibility. For example, "python3.10 -X unicode_compat".

I proposed sys.set_python_compat_version(version) in the rejected PEP
606, but this PEP was too broad:
https://www.python.org/dev/peps/pep-0606/

The question is whether it's worth paying the maintenance burden on the
Python side, or whether to drop backward compatibility if it's "too
expensive".

I understood that your first motivation is to reduce the PyASCIIObject
structure size. Using a hash table, the overhead would only be paid by
users of the deprecated functions. But it requires keeping the code around
and continuing to maintain it. Maybe I missed some drawbacks.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/5QIMWGFHGH2IXXXTCC6OSBNVK3XNR5M4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Are extension types allowed to implement both nb_add and sq_concat?

2020-06-16 Thread Nick Coghlan
On Sat., 13 Jun. 2020, 2:51 am Barry Warsaw,  wrote:

> On Jun 12, 2020, at 04:21, Eric Wieser 
> wrote:
>
> > It seems to me there are at least three stances that could be taken here:
> >
> > * Specifying both is considered invalid: Python should consider emitting
> a warning in `PyType_Ready` if both are filled.
> > * Specifying both is considered an implementation detail specific to
> CPython: the [C API docs for the type slots][2] should indicate this
> > * Specifying both is explicitly allowed and considered a language
> feature. `__concat__` should be added as a slot_wrapper around `sq_concat`
> to allow the language feature to be accessed without writing C extensions.
>
> If you can define the behavior at the C layer and not at the Python layer,
> then I think almost by definition it’s an implementation detail of
> CPython.  Whether that’s intentional or not would help determine whether
> this should be demoted to a bug with an emitted warning, or promoted to a
> full-on feature supported in the language.
>

The one part that has already sort of leaked through to the language level
is that only "nb_add" fully supports operand coercion.

https://bugs.python.org/issue11477 covers some of the oddities that arise
when implementing *only* sq_concat.

I've mostly given up on ever fixing that, as I'm not sure we can do it
without breaking the workarounds that people have in place for the oddities
in the status quo (there were enough other projects relying on the existing
semantics that the PyPy devs found it necessary to mimic CPython's
implementation quirks in this regard).

That said, adding an sq_concat that raises a "don't do that" exception to
an existing type that already implements nb_add sounds fine to me - from
the Python level everything except operator.concat would encounter the
nb_add slot first, so the exception should only trip up genuinely
problematic code.
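(To illustrate the asymmetry from pure Python with a made-up toy class:
defining only __add__ fills nb_add but not sq_concat, so "+" works while
operator.concat refuses the object unless it also implements the sequence
protocol.)

import operator

class Chunk:
    # Toy type: only __add__ (nb_add at the C level), no sequence slots.
    def __init__(self, items):
        self.items = list(items)
    def __add__(self, other):
        return Chunk(self.items + other.items)

a, b = Chunk([1]), Chunk([2])
print((a + b).items)          # [1, 2] -- goes through nb_add
try:
    operator.concat(a, b)     # needs sq_concat or the sequence protocol
except TypeError as exc:
    print(exc)                # "'Chunk' object can't be concatenated"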

Cheers,
Nick.



> Cheers,
> -Barry
>
> ___
> Python-Dev mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/[email protected]/message/DO5J63K3ACFCV63SRD6QAHT6PAIMLNKR/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/VFYEMJH5PXKVI72G33DHTSUW3CLWWFPN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Improving inspect capabilities for classes

2020-06-16 Thread Thomas Viehmann

Hello,

thank you for your feedback!

I could think of a trick that inspect.getsource() might use if the class
contains at least one method: it could look at a method and try its
`__code__.co_filename` attribute (maybe only if the `__file__` attribute
for the module found via the class's `__module__` doesn't exist -- I'm sure
Jupyter can arrange for that to be the case). But I see how that would be a
problem (I can think of plenty of reasons why a class might not have any
methods).


That is a great idea; in addition, getsource would have to filter out 
inherited methods (which should be doable, but would indeed exclude classes 
without methods). Would you prefer such a patch to inspect.getsource over 
adding __filename__?



I do think that your proposal is reasonable, although I wonder what the
Jupyter developers think of it. (How closely are you connected to that
project?)


I am not affiliated with Jupyter at all and I imagine that I'd be prone 
to asking "would it be nice if inspect.getsource worked better?", which 
likely doesn't yield the most interesting answers.


We can chase the related reports:

https://github.com/jupyter/notebook/issues/3802
https://github.com/ipython/ipython/issues/11249

The topic does seem to pop up now and then:

https://stackoverflow.com/questions/51566497/getting-the-source-of-an-object-defined-in-a-jupyter-notebook
https://stackoverflow.com/questions/35854373/python-get-source-code-of-class-using-inspect

Best regards

Thomas
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/UL376F3Y6FKJUZ2HZDA3ERFUDDOX67X4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: When can we remove wchar_t* cache from string?

2020-06-16 Thread Inada Naoki
On Tue, Jun 16, 2020 at 9:30 PM Victor Stinner  wrote:
>
> On Tue, Jun 16, 2020 at 10:42 AM Inada Naoki wrote:
> > Hmm,  Is there any chance to add DeprecationWarning in Python 3.9?
>
> In my experience, more and more projects are running their test suite
> with -Werror, which is a good thing. Introducing a new warning is
> likely to "break" many of these projects. For example, in Fedora, we
> run the test suite when we build a package. If a test fails, the
> package build fails and we have to decide to either ignore the failing
> tests (not good) or find a solution to repair the tests (update the
> code base to new C API functions).
>

But Python 3.9 is still in the beta phase and we have enough time to get feedback.
If the new warning causes unacceptable breakage, we can remove it in the RC phase.

>
> > It is an interesting idea, but I think it is too complex.
> > Fixing all packages in the PyPI would be a better approach.
>
> It's not the first time that we have had to make such a decision. "Fixing
> all PyPI packages" is not possible. Python core developers' time is limited,
> so we can only port a very small number of packages. Breaking packages on
> purpose forces developers to upgrade their code base; it should work better
> than deprecation warnings. But it is likely to make some people unhappy.
>

OK, my terminology was wrong.  Not all, but almost all actively maintained packages.

* This change doesn't affect pure Python packages.
* Most of the rest use Cython.  Since I have already reported an issue to
  Cython, regenerating with a new Cython release will fix them.
* Most of the rest already support PEP 393.

So I expect only a few percent of active packages will be affected.

This is a list of uses of deprecated APIs in the top 4000 packages,
except PyArg_ParseTuple(AndKeywords).
Files generated by Cython are excluded, but most of the remaining hits are
still false positives (e.g. inside `#if PY2`).
https://github.com/methane/notes/blob/master/2020/wchar-cache/deprecated-use

I have already filed some issues and sent some pull requests since I
created this thread.

> Having a separated hash table would prevent to break many PyPI
> packages by continuing to provide the backward compatibility. We can
> even consider to disable it by default, but provide a temporary option
> to opt-in for backward compatibility. For example, "python3.10 -X
> unicode_compat".
>
> I proposed sys.set_python_compat_version(version) in the rejected PEP
> 606, but this PEP was too broad:
> https://www.python.org/dev/peps/pep-0606/
>
> The question is if it's worth it to pay the maintenance burden on the
> Python side, or to drop backward compatibility if it's "too
> expensive".
>
> I understood that your first motivation is to reduce the PyASCIIObject
> structure size. Using a hash table, the overhead would only be paid by
> users of the deprecated functions. But it requires to keep the code
> and so continue to maintain it. Maybe I missed some drawbacks.
>

Memory usage is the most important motivation.  But the runtime cost of
PyUnicode_READY and the maintenance cost of legacy unicode matter too.

I will reconsider your idea.  But I still feel that helping many third
parties is the most constructive way.

Regards,

-- 
Inada Naoki  
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/KBBR2KQPNKSPQIPR5UKW2ALM3QGNDBEU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should we be making so many changes in pursuit of PEP 554?

2020-06-16 Thread Guido van Rossum
Has anybody brought up the problem yet that if one subinterpreter
encounters a hard crash (say, it segfaults due to a bug in a C extension
module), all subinterpreters active at that moment in the same process are
likely to lose all their outstanding work, without a chance of recovery?

(Of course once we have locks in shared memory, a crashed process leaving a
lock behind may also screw up everybody else, though perhaps less severely.)

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/AMX6KO7GGGAAHTVRP34OMUA7ROCDHKSM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should we be making so many changes in pursuit of PEP 554?

2020-06-16 Thread Edwin


That's so, but threads have this problem too.  I don't think this discussion is 
about finding a "perfect" solution or an "ultimate" way of doing things, rather 
it is about the varying opinions on certain design tradeoffs.  If I'm satisfied 
that subinterpreters are the correct solution to my particular need, why 
shouldn't I have the privilege of doing so?



--Edwin

- Original Message -
From: Guido van Rossum ([email protected])
Date: 06/16/20 13:30
To: Python Dev ([email protected])
Subject: [Python-Dev] Re: Should we be making so many changes in pursuit of PEP 
554?


Has anybody brought up the problem yet that if one subinterpreter encounters a 
hard crash (say, it segfaults due to a bug in a C extension module), all 
subinterpreters active at that moment in the same process are likely to lose 
all their outstanding work, without a chance of recovery?

(Of course once we have locks in shared memory, a crashed process leaving a 
lock behind may also screw up everybody else, though perhaps less severely.)
--


--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)

___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/AMX6KO7GGGAAHTVRP34OMUA7ROCDHKSM/
Code of Conduct: http://python.org/psf/codeofconduct/

___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/BN2KKUVRMPBZGHLAGUZK5TCOBYTYBMVV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Improving inspect capabilities for classes

2020-06-16 Thread Guido van Rossum
On Tue, Jun 16, 2020 at 2:00 AM Thomas Viehmann  wrote:

> Hello,
>
> thank you for your feedback!
> > I could think of a trick that inspect.getsource() might use if the class
> > contains at least one method: it could look at a method and try its
> > `__code__.co_filename` attribute (maybe only if the `__file__` attribute
> > for the module found via the class's `__module__` doesn't exist -- I'm
> sure
> > Jupyter can arrange for that to be the case). But I see how that would
> be a
> > problem (I can think of plenty of reasons why a class might not have any
> > methods).
>
> That is a great idea, in addition getsource would have to filter out
> inherited methods (which should be doable, but indeed exclude classes).
>

It would just have to iterate over the class `__dict__`, which doesn't
contain inherited objects anyways.
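(A rough sketch of that trick; guess_class_filename is a hypothetical helper,
not the actual inspect implementation.)

import inspect
import types

def guess_class_filename(cls):
    # Prefer the module's __file__ when it exists.
    module = inspect.getmodule(cls)
    if module is not None and getattr(module, '__file__', None):
        return module.__file__
    # Fall back to the code objects of functions defined in the class's own
    # __dict__ (which does not contain inherited members).
    for obj in vars(cls).values():
        if isinstance(obj, (staticmethod, classmethod)):
            obj = obj.__func__
        if isinstance(obj, types.FunctionType):
            return obj.__code__.co_filename
    return None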


> Would you prefer such a patch to inspect.getsource over the adding
> __filename__?
>

It would certainly be much easier to get through the review process. Adding
a `__filename__` (why not `__file__`?) attribute to classes is a major
surgery, presumably requiring a PEP, and debating the pros and cons and
performance implications and fixing a bunch of tests that don't expect this
attribute, and so on. Adding an imperfect solution to inspect.getsource()
would only require the cooperation of whoever maintains the inspect module.


> > I do think that your proposal is reasonable, although I wonder what the
> > Jupyter developers think of it. (How closely are you connected to that
> > project?)
>
> I am not affiliated with Jupyter at all and I imagine that I'd be prone
> to asking "would it be nice if inspect.getsource worked better?", which
> likely doesn't yield the most interesting answers.
>

I had trouble parsing this sentence; I believe you mean to say that the
Jupyter maintainers would just tell you this should be fixed in
inspect.getsource()?


> We can chase the related reports:
>
> https://github.com/jupyter/notebook/issues/3802
>

This was closed because Jupyter is not to blame here at all, they declared
it an IPython issue.


> https://github.com/ipython/ipython/issues/11249
>

And here there also doesn't seem to be much interest, given that it's been
open and unanswered since 2018.

The topic does seem to pop up now and then:
>
>
> https://stackoverflow.com/questions/51566497/getting-the-source-of-an-object-defined-in-a-jupyter-notebook
>
> https://stackoverflow.com/questions/35854373/python-get-source-code-of-class-using-inspect
>

Very few stars. This suggests not many people care about this problem, and
that in turn might explain the lukewarm response you find everywhere.

Lastly, I have to ask: Why is this so important to you? What does this
prevent you from doing? You have illustrated the problem with toy examples
-- but what is the real-world problem you're encountering (apparently
regularly) that causes you to keep pushing on this? This needs to be
explored especially since so few other people appear to need this to work.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/Y6BZXEVK6QDSWC4ILKR2ZFCCOKZ3GXTM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should we be making so many changes in pursuit of PEP 554?

2020-06-16 Thread Guido van Rossum
On Tue, Jun 16, 2020 at 10:52 AM Edwin  wrote:

>
> That's so, but threads have this problem too.  I don't think this
> discussion is about finding a "perfect" solution or an "ultimate" way of
> doing things, rather it is about the varying opinions on certain design
> tradeoffs.  If I'm satisfied that subinterpreters are the correct solution
> to my particular need, why shouldn't I have the privilege of doing so?
>

Interesting choice of word. This is open source, no feature is free, you
are not entitled to anything in particular.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/5CRKAFLGWUVLX7U2ZSCQYJCWMJR635RX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should we be making so many changes in pursuit of PEP 554?

2020-06-16 Thread Mark Shannon


On 16/06/2020 1:24 pm, Nick Coghlan wrote:
Multiprocessing serialisation overheads are abysmal. With enough OS 
support you can attempt to mitigate that via shared memory mechanisms 
(which Davin added to the standard library), but it's impossible to get 
the overhead of doing that as low as actually using the address space of 
one OS process.


What does "multiprocessing serialisation" even mean? I assume you mean 
the overhead of serializing objects for communication between processes.


The cost of serializing an object has absolutely nothing to do with 
which process the interpreter is running in.


Separate interpreters within a single process will still need to 
serialize objects for communication.


The overhead of passing data through shared memory is the same for 
threads and processes. It's just memory.


Can we please stick to facts and not throw around terms like "abysmal" 
with no data whatsoever to back it up.




For the rest of the email... multiprocessing isn't going anywhere.

Within-process parallelism is just aiming to provide another trade-off 
point in design space for CPU bound workloads (one roughly comparable to 
the point where JS web workers sit).


Cheers,
Nick.

On Sat., 6 Jun. 2020, 12:39 am Mark Shannon wrote:


Hi,

There have been a lot of changes both to the C API and to internal
implementations to allow multiple interpreters in a single O/S process.

These changes cause backwards compatibility changes, have a negative
performance impact, and cause a lot of churn.

While I'm in favour of PEP 554, or some similar model for
parallelism in
Python, I am opposed to the changes we are currently making to
support it.


What are sub-interpreters?
--

A sub-interpreter is a logically independent Python process which
supports inter-interpreter communication built on shared memory and
channels. Passing of Python objects is supported, but only by copying,
not by reference. Data can be shared via buffers.


How can they be implemented to support parallelism?
---

There are two obvious options.
a) Many sub-interpreters in a single O/S process. I will call this the
many-to-one model (many interpreters in one O/S process).
b) One sub-interpreter per O/S process. This is what we currently have
for multiprocessing. I will call this the one-to-one model (one
interpreter in one O/S process).

There seems to be an assumption amongst those working on PEP 554 that
the many-to-one model is the only way to support sub-interpreters that
can execute in parallel.
This isn't true. The one-to-one model has many advantages.


Advantages of the one-to-one model
--

1. It's less bug prone. It is much easier to reason about code working
in a single address space. Most code assumes

2. It's more secure. Separate O/S processes provide a much stronger
boundary between interpreters. This is why some browsers use separate
processes for browser tabs.

3. It can be implemented on top of the multiprocessing module, for
testing. A more efficient implementation can be developed once
sub-interpreters prove useful.

4. The required changes should have no negative performance impact.

5. Third party modules should continue to work as they do now.

6. It takes much less work :)


Performance
---

Creating O/S processes is usually considered to be slow. Whilst
processes are undoubtedly slower to create than threads, the absolute
time to create a process is small; well under 1ms on linux.

Creating a new sub-interpreter typically requires importing quite a few
modules before any useful work can be done.
The time spent doing these imports will dominate the time to create an
O/S process or thread.

If sub-interpreters are to be used for parallelism, there is no need to
have many more sub-interpreters than CPU cores, so the overhead should
be small. For additional concurrency, threads or coroutines can be used.

The one-to-one model is faster as it uses the hardware for interpreter
separation, whereas the many-to-one model must use software.
Process separation by the hardware virtual memory system has zero cost.
Separation done in software needs extra memory reads when doing
allocation or deallocation.

Overall, for any interpreter that runs for a second or more, it is
likely that the one-to-one model would be faster.


Timings of multiprocessing & threads on my machine (6-core 2019 laptop)
---

#Threads

def foo():
      pass

def spawn_and_join(count):
      threads = [ Thread(target=foo, args=()) for _ in range(count) ]
      for t in threads:
       

[Python-Dev] Re: When can we remove wchar_t* cache from string?

2020-06-16 Thread Brett Cannon
Inada Naoki wrote:
> On Tue, Jun 16, 2020 at 9:30 PM Victor Stinner [email protected] wrote:
> >
> > Le mar. 16 juin 2020 à 10:42, Inada Naoki [email protected] a écrit :
> > Hmm,  Is there any chance to add
> > DeprecationWarning in Python 3.9?
> > In my experience, more and more projects are running their test suite
> > with -Werror, which is a good thing. Introducing a new warning is
> > likely to "break" many of these projects. For example, in Fedora, we
> > run the test suite when we build a package. If a test fails, the
> > package build fails and we have to decide to either ignore the failing
> > tests (not good) or find a solution to repair the tests (update the
> > code base to new C API functions).
> > But Python 3.9 is still in beta phase and we have enough time to get 
> > feedback.
> If the new warning is unacceptable breakage, we can remove it in RC phase.

Sure, but it's also a bit disruptive to throw in new warnings in the middle of 
the beta cycle versus removing them. Typically we try to improve compatibility 
for people in betas, not lower it. Is it that important to get it done in 3.9 
versus making the change in the master branch right now and just waiting 12 
extra months?

In the end, though, it's the release manager's decision.

-Brett

> >
> > It is an
> > interesting idea, but I think it is too complex.
> > Fixing all packages in the PyPI would be a better approach.
> > It's not the first time that we have to take such decision. "Fixing
> > all PyPI packages" is not possible. Python core developers are limited
> > are so we can only port a very low number of packages. Breaking
> > packages on purpose force developers to upgrade their code base, it
> > should work better than deprecation warnings. But it is likely to make
> > some people unhappy.
> > OK, My terminology was wrong.  Not all, but almost of living packages.
> 
> This change doesn't affect to pure Python packages.
> Most of the rest uses Cython.  Since I already report an issue to Cython,
> regenerating with new Cython release fixes them.
> Most of the rest support PEP 393 already.
> 
> So I expect only few percents of active packages will be affected.
> This is a list of use of deprecated APIs from the top 4000 packages,
> except PyArg_ParseTuple(AndKeywords).
> Files generated by Cython are excluded.  But most of them are false
> positives yet (e.g. in #if PY2).
> https://github.com/methane/notes/blob/master/2020/wchar-cache/deprecated-use
> I have filed some issues and sent some pull requests already after I
> created this thread.
> > Having a separated hash table would prevent to break
> > many PyPI
> > packages by continuing to provide the backward compatibility. We can
> > even consider to disable it by default, but provide a temporary option
> > to opt-in for backward compatibility. For example, "python3.10 -X
> > unicode_compat".
> > I proposed sys.set_python_compat_version(version) in the rejected PEP
> > 606, but this PEP was too broad:
> > https://www.python.org/dev/peps/pep-0606/
> > The question is if it's worth it to pay the maintenance burden on the
> > Python side, or to drop backward compatibility if it's "too
> > expensive".
> > I understood that your first motivation is to reduce the PyASCIIObject
> > structure size. Using a hash table, the overhead would only be paid by
> > users of the deprecated functions. But it requires to keep the code
> > and so continue to maintain it. Maybe I missed some drawbacks.
> > Memory usage is the most important motivation.  But runtime cost of
> PyUnicode_READY and maintenance cost of legacy unicode matters too.
> I will reconsider your idea.  But I still feel that helping many third
> parties is the most constructive way.
> Regards,
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/RF2VNXBKCMGHBJVPMQ5B7XWGAHBRGL44/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: When can we remove wchar_t* cache from string?

2020-06-16 Thread Steve Dower

On 16Jun2020 1641, Inada Naoki wrote:

* This change doesn't affect to pure Python packages.
* Most of the rest uses Cython.  Since I already report an issue to Cython,
   regenerating with new Cython release fixes them.


The precedent set in our last release with tp_print was that 
regenerating Cython releases was too much to ask.


Unless we're going to overrule that immediately, we should leave 
everything there and give users/developers a full release cycle with an 
updated Cython version to make new releases without causing any breakage.


Cheers,
Steve
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/KJV2FT367LV62WO4A3VXTRCYNMSIF53K/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should we be making so many changes in pursuit of PEP 554?

2020-06-16 Thread Brett Cannon
I wanted to let people know that the four of us on the SC not driving this work 
-- i.e. everyone but Victor -- talked about this at our last meeting, and we 
support the work to isolate interpreter state from being global. There are 
benefits for the situation where you have to integrate CPython with other code 
which does its own thread management (i.e. the embedded scenario). It also 
helps with the organization of the code, and thus we believe it leads to easier 
long-term maintainability. We are okay with the performance trade-off required 
for this work.

I will also say that while this work is a prerequisite for PEP 554 as currently 
proposed, it does not mean the SC believes PEP 554 will ultimately be accepted. 
We view this work as independently motivated from PEP 554.
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/YRDOIQ5UOXLUDK7EXCBZYBBXHJDIXG3W/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: When can we remove wchar_t* cache from string?

2020-06-16 Thread Inada Naoki
On Wed, Jun 17, 2020 at 4:16 AM Steve Dower  wrote:
>
> On 16Jun2020 1641, Inada Naoki wrote:
> > * This change doesn't affect to pure Python packages.
> > * Most of the rest uses Cython.  Since I already report an issue to Cython,
> >regenerating with new Cython release fixes them.
>
> The precedent set in our last release with tp_print was that
> regenerating Cython releases was too much to ask.
>
> Unless we're going to overrule that immediately, we should leave
> everything there and give users/developers a full release cycle with
> updated Cython version to make new releases without causing any breakage.
>

We have one year for 3.10 and two years for 3.11.

Additionally, unlike the case of tp_print, we don't need to wait until all
of them are regenerated.
Cython used deprecated APIs in two cases:

* Cython used PyUnicode_FromUnicode(NULL, 0) to create an empty string.
  Many packages are affected, but we can keep that case working even after
  wstr is removed.
  https://github.com/cython/cython/pull/3677

* Cython used PyUnicode_FromUnicode() in very minor cases.  Only a few
  packages are affected.
  https://github.com/cython/cython/issues/3678

So we only need to ask a few projects to regenerate with Cython >= 0.9.21.

Regards,
-- 
Inada Naoki  
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/XGLIVTXIGDEPE5UH32ZGGUCXXQVYXYQN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Accepting PEP 618: zip(strict=True)

2020-06-16 Thread Guido van Rossum
After taking a break to recuperate from the vigorous debate, Brandt Bucher
has revised PEP 618 and submitted it for review. I volunteered to be
PEP-Delegate (the new term for the outdated BDFL-Delegate) and the SC has
approved me for this role. (Note that Antoine, the PEP's sponsor, declined
to be the lightning rod, er, PEP-Delegate.)

I have now reviewed the PEP and skimmed some of the more recent discussion
about the topic. It is clear that no solution will win everyone over. But
most seem to agree that offering *some* solution for the stated problem is
better than none.

To spare us more heartache, I am hereby accepting PEP 618. I expect that
the implementation will land soon.

I have two very minor editorial remarks, which Brandt may address at his
leisure:

- The "Backward Compatibility" section could be beefed up slightly, e.g. by
pointing out that the default remains strict=False and that zip previously
did not take keyword arguments at all.

- The error messages are somewhat odd: why is the error sometimes that one
iterator is too long, and other times that one iterator is too short? All
we really know is that not all iterators have the same length, but the
current phrasing seems to be assuming that the first iterator is never too
short or too long.
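(A quick illustration of the accepted behaviour; the exact error text is
whatever the final implementation produces.)

list(zip([1, 2, 3], 'abc', strict=True))   # [(1, 'a'), (2, 'b'), (3, 'c')]

try:
    list(zip([1, 2, 3], 'ab', strict=True))
except ValueError as exc:
    print(exc)   # e.g. "zip() argument 2 is shorter than argument 1"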

Congratulations, Brandt!

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/NLWB7FVJGMBBMCF4P3ZKUIE53JPDOWJ3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Accepting PEP 618: zip(strict=True)

2020-06-16 Thread Brandt Bucher
Woo! Many thanks to Ram for the idea, Antoine for sponsoring, Guido for 
PEP-Delegating, and everyone on -Ideas and -Dev for the spirited discussion and 
review.

Brandt
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/F6M3F2HZ5DMBPLERGACGKMUGVNRUMNP6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: type() does not call __prepare__?

2020-06-16 Thread Nick Coghlan
It's not a bug, it's just bypassing steps in the Py3 dynamic class
definition process.

https://docs.python.org/3/library/types.html#types.new_class is the API to
use to invoke the full metaclass machinery, including namespace preparation.

https://docs.python.org/3/reference/datamodel.html#metaclasses goes into
more detail on how that works (calling the metaclass constructor is the
last step, after the namespace has already been prepared and populated).
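(A minimal sketch of the difference, mirroring the example quoted below and
using only the documented types.new_class API.)

import types

class Meta1(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwds):
        print('call prepare')
        return {}

class A(metaclass=Meta1):        # the class statement calls __prepare__
    pass

type('C', (A,), {})              # the three-argument type() call does not
types.new_class('D', (A,))       # runs the full machinery: __prepare__ is called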

Cheers,
Nick.

On Tue., 2 Jun. 2020, 5:22 am Ethan Furman,  wrote:

> From stackoverflow [1]
>
> # metaprepare.py
> class Meta1(type):
>     @classmethod
>     def __prepare__(mcs, name, bases):
>         print('call prepare')
>         return {}
>     def __new__(mcs, name, bases, parameters):
>         return super().__new__(mcs, name, bases, parameters)
>
> class A(metaclass=Meta1):
>     pass
>
> type('C', (A, ), {})
>
> output is:
>
>     call prepare
>
> (just the once, not twice)
>
> The behavior of `type()` not calling `__prepare__()` has been constant
> since 3.3.
>
> Is it a bug?
>
>
> --
> ~Ethan~
>
>
> [1] https://stackoverflow.com/q/62128254/208880
> ___
> Python-Dev mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/[email protected]/message/CQTD7WFOIDBF5PSD77AALBTWJQ67UPM5/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/SMBMNS6H76BJ4SQBDY7XV5YWQFYSCBWK/
Code of Conduct: http://python.org/psf/codeofconduct/