[issue45255] sqlite3.connect() should check if the sqlite file exists and throw a FileNotFoundError if it doesn't

2021-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Such a change would be backwards incompatible and no longer in line with PEP 
249.

I also don't understand what you regard as confusing about the message "unable 
to open database file". The message could be extended to include the path, but 
apart from that, it's as clear as it can get :-)

--
nosy: +lemburg
status: pending -> open




[issue24076] sum() several times slower on Python 3 64-bit

2021-09-22 Thread Stefan Behnel


Stefan Behnel  added the comment:

Sorry for that, Pablo. I knew exactly where the problem was, the second I read 
your notification. Thank you for resolving it so quickly.

--




[issue41203] Replace references to OS X in documentation with macOS

2021-09-22 Thread Serhiy Storchaka


Change by Serhiy Storchaka :


--
pull_requests: +26907
pull_request: https://github.com/python/cpython/pull/28515




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread Tim Holy


New submission from Tim Holy :

This is a lightly-edited reposting of 
https://stackoverflow.com/questions/69164027/unreliable-results-from-timeit-cache-issue

I am interested in timing certain operations in numpy and skimage, but I'm 
occasionally seeing surprising transitions (not entirely reproducible) in the 
reported times. Briefly, sometimes timeit returns results that differ by about 
5-fold from earlier runs. Here's the setup:

import skimage
import numpy as np
import timeit

nrep = 16

def run_complement(img):
    def inner():
        skimage.util.invert(img)
    return inner

img = np.random.randint(0, 65535, (512, 512, 3), dtype='uint16')

and here's an example session:

In [1]: %run python_timing_bug.py

In [2]: t = timeit.Timer(run_complement(img))

In [3]: t.repeat(nrep, number=1)
Out[3]: 
[0.0024439050030196086,
 0.0020311699918238446,
 0.00033007100864779204,
 0.0002889479947043583,
 0.0002851780009223148,
 0.0002851030003512278,
 0.00028487699455581605,
 0.00032116699730977416,
 0.00030912700458429754,
 0.0002877369988709688,
 0.0002840430097421631,
 0.00028515000303741544,
 0.00030791999597568065,
 0.00029302599432412535,
 0.00030723700183443725,
 0.0002916679950430989]

In [4]: t = timeit.Timer(run_complement(img))

In [5]: t.repeat(nrep, number=1)
Out[5]: 
[0.0006320849934127182,
 0.0004014919977635145,
 0.00030359599622897804,
 0.00029224599711596966,
 0.0002907510061049834,
 0.0002920039987657219,
 0.0002918920072261244,
 0.0003095199936069548,
 0.00029789700056426227,
 0.0002885590074583888,
 0.00040198900387622416,
 0.00037131100543774664,
 0.00040271600300911814,
 0.0003492849937174469,
 0.0003378120018169284,
 0.00029762100894004107]

In [6]: t = timeit.Timer(run_complement(img))

In [7]: t.repeat(nrep, number=1)
Out[7]: 
[0.00026428700948599726,
 0.00012682100350502878,
 7.380900206044316e-05,
 6.346100417431444e-05,
 6.29679998382926e-05,
 6.278700311668217e-05,
 6.320899410638958e-05,
 6.25409884378314e-05,
 6.262199894990772e-05,
 6.247499550227076e-05,
 6.293901242315769e-05,
 6.259800284169614e-05,
 6.285199197009206e-05,
 6.293600017670542e-05,
 6.309800664894283e-05,
 6.248900899663568e-05]

Notice that in the final run, the minimum times were on the order of 0.6e-4 vs 
the previous minimum of ~3e-4, about 5x smaller than the times measured in 
previous runs. It's not entirely predictable when this "kicks in."

The faster time corresponds to 0.08ns/element of the array, which given that 
the 2.6GHz clock on my i7-8850H CPU ticks every ~0.4ns, seems to be pushing the 
limits of credibility (though thanks to SIMD on my AVX2 CPU, this cannot be 
entirely ruled out). My understanding is that this operation is implemented as 
a subtraction and most likely gets reduced to a bitwise-not by the compiler. So 
you do indeed expect this to be fast, but it's not entirely certain it should 
be this fast, and in either event the non-reproducibility is problematic.

It may be relevant to note that the total amount of data is

In [15]: img.size * 2
Out[15]: 1572864

and lshw reports that I have 384KiB L1 cache and 1536KiB of L2 cache:

In [16]: 384*1024
Out[16]: 393216

In [17]: 1536*1024
Out[17]: 1572864

So it seems possible that this result is being influenced by just fitting in L2 
cache. (Note that even in the "fast block," the first run was not fast.) If I 
increase the size of the image:

 img = np.random.randint(0, 65535, (2048, 2048, 3), dtype='uint16')

then my results seem more reproducible in the sense that I have not yet seen 
one of these transitions.
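
A minimal sketch of timing many calls per repeat and dividing by the count,
which smooths these transitions somewhat (reusing the setup above):

t = timeit.Timer(run_complement(img))
# total seconds per repeat for 1000 calls; divide to get per-call time
per_call = [total / 1000 for total in t.repeat(repeat=nrep, number=1000)]
print(min(per_call))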

--
components: Library (Lib)
messages: 402411
nosy: timholy
priority: normal
severity: normal
status: open
title: Unreliable (?) results from timeit (cache issue?)
type: behavior
versions: Python 3.8




[issue45262] crash if asyncio is used before and after re-initialization when using Python embedded in an application

2021-09-22 Thread Benjamin Schiller


New submission from Benjamin Schiller :

We have embedded Python in our application and we deinitialize/reinitialize 
the interpreter at some point in time. If a simple script with a thread that 
sleeps with asyncio.sleep is loaded before and after the re-initialization, 
then we get the following assertion in the second run of the Python module:

"Assertion failed: Py_IS_TYPE(rl, &PyRunningLoopHolder_Type), file 
D:\a\1\s\Modules_asynciomodule.c, line 261"

Example to reproduce this crash: 
https://github.com/benjamin-sch/asyncio_crash_in_second_run

--
components: asyncio
messages: 402412
nosy: asvetlov, benjamin-sch, yselivanov
priority: normal
severity: normal
status: open
title: crash if asyncio is used before and after re-initialization when using 
Python embedded in an application
type: crash
versions: Python 3.9




[issue45061] [C API] Detect refcount bugs on True/False in C extensions

2021-09-22 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +26908
pull_request: https://github.com/python/cpython/pull/28516




[issue45263] round displays 2 instead of 3 digits

2021-09-22 Thread Kenneth Fossen


New submission from Kenneth Fossen :

When round is given 3 as argument for number of decimal points, the expected 
behaviour is to return a digit with 3 decimal points

Example:

ig1 = 0.4199730940219749
ig2 = 0.4189730940219749
print(round(ig1, 3)) # 0.42 expected  to be 0.420
print(round(ig2, 3)) # 0.419

--
messages: 402413
nosy: kenfos
priority: normal
severity: normal
status: open
title: round displays 2 instead of 3 digits
type: behavior
versions: Python 3.9




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

Running timeit with number=1 for fast-running code is not likely to be 
reliable. The problem is that modern computers are *incredibly noisy*: there 
are typically hundreds of processes running on them at any one time.

Trying to get deterministic times from something that runs in 0.3 ms or less is 
not easy. You can see for yourself that the first couple of runs are of the order 
of 2-10 times higher before settling down. And as you point out, then there is 
a further drop by a factor of 5.

I agree that a CPU cache kicking in is a likely explanation.

When I use timeit, I generally try to have a large enough number that the time is 
at least 0.1s per loop. To be perfectly honest, I don't know if that is 
actually helpful or if it is just superstition, but using that as a guide, I've 
never seen the sort of large drop in timing results that you are getting.
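
A minimal sketch of letting timeit pick that number automatically, via 
Timer.autorange(), which increases the loop count until one run takes at 
least about 0.2 seconds:

import timeit

t = timeit.Timer("sum(range(1000))")
number, total = t.autorange()   # loop count chosen so total >= ~0.2 s
print(total / number)           # seconds per loop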

I presume you have read the notes in the docs about the default timer?

"default_timer() measurements can be affected by other programs running on the 
same machine"

https://docs.python.org/3/library/timeit.html

There are more comments about timing in the source code and the commentary by 
Tim Peters in the Cookbook.

Another alternative is to try Victor Stinner's pyperf tool. I don't know how it 
works though.

--
nosy: +steven.daprano, tim.peters, vstinner




[issue43760] The DISPATCH() macro is not as efficient as it could be (move PyThreadState.use_tracing)

2021-09-22 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

I'm removing the release blocker as per above; feel free to close if there is 
nothing else to discuss or act on here.

--
priority: release blocker -> 




[issue43760] The DISPATCH() macro is not as efficient as it could be (move PyThreadState.use_tracing)

2021-09-22 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

I discussed this particular instance with the Steering Council and the 
conclusion was that this field (use_tracing) is considered an implementation 
detail and therefore its removal is justified, so we won't be restoring it.

I'm therefore closing PR28498.

Notice that this decision only affects this particular issue and should not be 
generalized to other fields or structures. We will try to determine and open a 
discussion in the future about what is considered public/private in these 
ambiguous cases and what users can expect regarding stability and backwards 
compatibility.

--




[issue45061] [C API] Detect refcount bugs on True/False in C extensions

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 8620be99da930230b18ec05f4d7446ee403531af by Victor Stinner in 
branch 'main':
bpo-45061: Revert unicode_is_singleton() change (GH-28516)
https://github.com/python/cpython/commit/8620be99da930230b18ec05f4d7446ee403531af


--




[issue45263] round displays 2 instead of 3 digits

2021-09-22 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

"the expected behaviour is to return a digit with 3 decimal points"

I presume you mean a float with 3 digits after the decimal point. Three decimal 
points would be a bug :-)

I am afraid that you are misinterpreting what you are seeing. Rounding a number 
does not force it to display with trailing zeroes. Floats display using the 
most natural form, which means trailing zeroes are dropped. 0.42 will display 
as 0.42, not 0.420 or 0.4200 or 0.42000000.

The number has no idea that you want to display three digits, and cannot know. 
There is no way for each number to remember that at some point it came from you 
calling round().

If you need to format the number for display to a fixed size, don't use round, 
use one of the many different string methods:

>>> num = 0.42
>>> print("x = %.4f" % num)
x = 0.4200
>>> print("x = {:.8f}".format(num))
x = 0.42000000
>>> print(f"x = {num:.12f}")
x = 0.420000000000


This tracker is for reporting bugs; it is not a help desk. In the future, it is best to check 
with more experienced programmers before reporting things as a bug. There are 
many useful forums such as the Python Discuss website, Reddit's r/learnpython, 
or the Python-List mailing list where people will be happy to help guide you 
whether you have discovered a bug or not.

https://www.python.org/community/forums/

This is *especially* important when it comes to numeric issues which can be 
tricky even for experienced coders.

--
nosy: +steven.daprano
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:

PyPy emits a warning when the timeit module is used, suggesting to use pyperf.

timeit uses the minimum, whereas pyperf uses the average (arithmetic mean).

timeit uses a single process, pyperf spawns 21 processes: 1 just for the loop 
calibration, 20 to compute values.

timeit computes 5 values, pyperf computes 60 values.

timeit uses all computed values, pyperf ignores the first value considered as a 
"warmup value" (the number of warmup values can be configured).

timeit doesn't compute the standard deviation, pyperf does. The standard 
deviation gives an idea if the benchmark looks reliable or not. IMO results 
without standard deviation should not be trusted.

pyperf also emits a warning when a benchmark doesn't look reliable, for 
example if the user ran various workloads while the benchmark was running.

pyperf also supports storing results in a JSON file, which holds all values as 
well as metadata.

I cannot force people to stop using timeit. But there are reasons why pyperf is 
more reliable than timeit.

Benchmarking is hard. See pyperf documentation giving hints how to get 
reproducible benchmark results:
https://pyperf.readthedocs.io/en/latest/run_benchmark.html#how-to-get-reproducible-benchmark-results
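
A minimal pyperf sketch for a micro-benchmark like the one in this issue 
(assuming pyperf is installed; the statement and setup strings are only 
illustrative):

import pyperf

# pyperf spawns worker processes and reports mean +- standard deviation
runner = pyperf.Runner()
runner.timeit("invert",
              stmt="skimage.util.invert(img)",
              setup="import numpy as np; import skimage.util; "
                    "img = np.random.randint(0, 65535, (512, 512, 3), "
                    "dtype='uint16')")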

Read also this important article ;-)
"Biased Benchmarks (honesty is hard)"
http://matthewrocklin.com/blog/work/2017/03/09/biased-benchmarks

--




[issue45263] round displays 2 instead of 3 digits

2021-09-22 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

Oh, I forgot:

Python mailing lists and IRC:

https://www.python.org/community/lists/
https://www.python.org/community/irc/

--




[issue24076] sum() several times slower on Python 3 64-bit

2021-09-22 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Always happy to help :)

--




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread Steven D'Aprano

Steven D'Aprano  added the comment:

Thanks Victor for the explanation of pyperf's additional features. They 
do sound very useful. Perhaps we should consider adding some of them to 
timeit?

However, in my opinion using the average is statistically wrong. Using 
the mean is good when errors are two-sided, that is, your measured value 
can be either too low or too high compared to the true value:

measurement = true value ± random error

If the random errors are symmetrically distributed, then taking the 
average tends to cancel them out and give you a better estimate of the 
true value. Even if the errors aren't symmetrical, the mean will still 
be a better estimate of the true value. (Or perhaps a trimmed mean, or 
the median, if there are a lot of outliers.)

But timing results are not like that, the measurement errors are 
one-sided, not two:

measurement = true value + random error

So by taking the average, all you are doing is averaging the errors, not 
cancelling them. The result you get is *worse* as an estimate of the 
true value than the minimum.

All those other factors (ignore the warmup, check for a small stdev, 
etc) seem good to me. But the minimum, not the mean, is still going to 
be closer to the true cost of running the code.
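
A toy simulation of that one-sided model (noise is added, never subtracted) 
illustrates the point:

import random

true_value = 1.0
# measurement = true value + non-negative random error
samples = [true_value + abs(random.gauss(0, 0.05)) for _ in range(10_000)]
print(min(samples))                 # close to the true value
print(sum(samples) / len(samples))  # biased upward by the mean error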

--




[issue44958] [sqlite3] only reset statements when needed

2021-09-22 Thread Ken Jin


Ken Jin  added the comment:

Erlend, I suspect that 050d1035957379d70e8601e6f5636637716a264b may have 
introduced a perf regression in pyperformance's sqlite_synth benchmark:
https://speed.python.org/timeline/?exe=12&base=&ben=sqlite_synth&env=1&revs=50&equid=off&quarts=on&extr=on
The benchmark code is here 
https://github.com/python/pyperformance/blob/main/pyperformance/benchmarks/bm_sqlite_synth.py.
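
A quick way to eyeball this locally (a rough sketch only, not the 
pyperformance harness):

import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# a regression in statement reset overhead should show up as a
# higher per-call cost for a trivial query
print(timeit.timeit(lambda: cur.execute("SELECT 1").fetchone(),
                    number=100_000))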

--
nosy: +kj




[issue44958] [sqlite3] only reset statements when needed

2021-09-22 Thread Erlend E. Aasland


Erlend E. Aasland  added the comment:

Ouch, that's quite a regression! Thanks for the heads up! I'll have a look at 
it right away.

--




[issue44958] [sqlite3] only reset statements when needed

2021-09-22 Thread Erlend E. Aasland


Erlend E. Aasland  added the comment:

I'm unable to reproduce this regression on my machine (macOS, debug build, no 
optimisations). Are you able to reproduce, Ken?

--




[issue45256] Remove the usage of the cstack in Python to Python calls

2021-09-22 Thread Dong-hee Na


Change by Dong-hee Na :


--
nosy: +corona10




[issue43760] The DISPATCH() macro is not as efficient as it could be (move PyThreadState.use_tracing)

2021-09-22 Thread Petr Viktorin


Petr Viktorin  added the comment:

> The ABI is not broken, the only thing that this PR change is the size of the 
> struct. All the offsets to the members are the same and therefore will be 
> valid in any compiled code.

I'll just note that a change in struct size does technically break ABI, since 
*arrays* of PyThreadState will break.

So the size shouldn't be changed in RCs or point releases. (But since it's not 
part of stable ABI, it was OK to change it for 3.10.)

> We will try to determine and open a discussion in the future about what is 
> considered public/private in these ambiguous cases and what can users expect 
> regarding stability and backwards compatibility.

Please keep me in the loop; I'm working on summarizing my understanding of this 
(in a form that can be added to the docs if approved).

--




[issue44958] [sqlite3] only reset statements when needed

2021-09-22 Thread Erlend E. Aasland


Erlend E. Aasland  added the comment:

> I'm unable to reproduce this regression on my machine (macOS, debug build, no 
> optimisations) [...]

Correction: I _am_ able to reproduce this.

--




[issue44598] test_constructor (test.test_ssl.ContextTests) ... Fatal Python error: Segmentation fault

2021-09-22 Thread tongxiaoge


tongxiaoge  added the comment:

I installed OpenSSL version 1.1.1l and tested it again, and the problem 
disappeared. The cause was most likely that the OpenSSL version I used before 
was too old. Closing this issue.

--
stage:  -> resolved
status: open -> closed




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread Dong-hee Na


Change by Dong-hee Na :


--
nosy: +corona10




[issue45256] Remove the usage of the cstack in Python to Python calls

2021-09-22 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

AFAIK Mark Shannon proposed this idea, but it was rejected.

--
nosy: +gvanrossum, serhiy.storchaka




[issue45256] Remove the usage of the cstack in Python to Python calls

2021-09-22 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

What was rejected was https://www.python.org/dev/peps/pep-0651/ which included 
this idea but had a lot more stuff in it. In particular, it was rejected 
because it gave semantics to overflow exceptions (two exceptions were 
proposed), added new APIs, and lacked consistent guarantees across different 
platforms, among other considerations.

This version is just for optimization purposes with no changes in semantics.

--




[issue45213] Frozen modules are looked up using a linear search.

2021-09-22 Thread Dong-hee Na


Dong-hee Na  added the comment:

I thought about a trie implementation for this case,
but as Eric said, it would be overkill for now.

--
nosy: +corona10




[issue43760] The DISPATCH() macro is not as efficient as it could be (move PyThreadState.use_tracing)

2021-09-22 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

> I'll just note that a change in struct size does technically break ABI, since 
> *arrays* of PyThreadState will break.

Not that it matters now, because we are not proceeding, but just to clarify 
why I deemed this acceptable: arrays of PyThreadState are extremely unlikely 
in extensions, because we pass it by pointer and it is always manipulated by 
pointer. To place one in an array you would need to create or copy it into the 
array, and I cannot see the point of doing that, because the fields are mainly 
pointers that would become useless as the interpreter will not update anything.

--




[issue45213] Frozen modules are looked up using a linear search.

2021-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Perhaps a frozen dict could be used instead of the linear search.

This could then also be made available as sys.frozen_modules for inspection by 
applications and tools such as debuggers or introspection tools trying to find 
source code (and potentially failing at this).

Not urgent, though.
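
In Python terms, the idea is a name-keyed mapping instead of a scan (a sketch 
with hypothetical stand-in data; the real table is a C array of struct _frozen 
entries):

from types import SimpleNamespace

# hypothetical stand-in for the frozen module table
frozen_entries = [SimpleNamespace(name="_frozen_importlib"),
                  SimpleNamespace(name="zipimport")]

frozen_by_name = {entry.name: entry for entry in frozen_entries}

def find_frozen(name):
    return frozen_by_name.get(name)   # O(1) lookup instead of a linear scan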

--
nosy: +lemburg




[issue43760] The DISPATCH() macro is not as efficient as it could be (move PyThreadState.use_tracing)

2021-09-22 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Also, I checked the DWARF tree of all existing wheels for 3.10 on PyPI (there 
aren't many) and none had anything that uses the size of the struct.

--




[issue41137] pdb uses the locale encoding for .pdbrc

2021-09-22 Thread STINNER Victor


Change by STINNER Victor :


--
nosy: +vstinner
nosy_count: 4.0 -> 5.0
pull_requests: +26909
pull_request: https://github.com/python/cpython/pull/28518




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:

> But timing results are not like that, the measurement errors are 
> one-sided, not two: (..)

I suggest you run tons of benchmarks and look at the distribution. The 
reality is more complex than you may think.


> measurement = true value + random error

In my experience, there is no single "true value"; there are multiple values. 
Here is a concrete example where the Python randomized hash function gives a 
different value each time you spawn a new Python process:
https://vstinner.github.io/journey-to-stable-benchmark-average.html

Each process has its own "true value", but pyperf spawns 20 Python processes :-)

There are multiple sources of randomness, not only the Python randomized hash 
function. On Linux, the process address space is randomized by ASLR. It may give 
different timings at each run.

Code placement, exact memory address, etc. Many things enter into the game when 
you look into functions which take less than 100 ns.

The report here is about a value lower than a single nanosecond: 
"0.08ns/element".

--

I wrote articles about benchmarking:
https://vstinner.readthedocs.io/benchmark.html#my-articles

I gave a talk about it:

* 
https://raw.githubusercontent.com/vstinner/talks/main/2017-FOSDEM-Brussels/howto_run_stable_benchmarks.pdf
* https://archive.fosdem.org/2017/schedule/event/python_stable_benchmark/

Again, good luck with benchmarking, it's a hard problem ;-)

--

Once you think that you know everything about benchmarking, you should 
read the following paper and cry:
https://arxiv.org/abs/1602.00602

See also my analysis of PyPy performance:
https://vstinner.readthedocs.io/pypy_warmups.html

--




[issue28307] Accelerate 'string' % (value, ...) by using formatted string literals

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:

commit a0bd9e9c11f5f52c7ddd19144c8230da016b53c6
Author: Serhiy Storchaka 
Date:   Sat May 8 22:33:10 2021 +0300

bpo-28307: Convert simple C-style formatting with literal format into 
f-string. (GH-5012)

C-style formatting with literal format containing only format codes
%s, %r and %a (with optional width, precision and alignment)
will be converted to an equivalent f-string expression.

It can speed up formatting more than 2 times by eliminating
runtime parsing of the format string and creating temporary tuple.

commit 8b010673185d36d13e69e5bf7d902a0b3fa63051
Author: Serhiy Storchaka 
Date:   Sun May 23 19:06:48 2021 +0300

bpo-28307: Tests and fixes for optimization of C-style formatting (GH-26318)

Fix errors:
* "%10.s" should be equal to "%10.0s", not "%10s".
* Tuples with starred expressions caused a SyntaxError.
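
For illustration, the optimization rewrites literal %-formatting like the 
first line below into the f-string form of the second at compile time (a 
sketch of the equivalence, not the compiler code):

name, obj = "x", [1, 2]
s1 = "%s = %r" % (name, obj)   # parsed at runtime, builds a temporary tuple
s2 = f"{name!s} = {obj!r}"     # equivalent f-string, no runtime parsing
assert s1 == s2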

--
nosy: +vstinner




[issue45256] Remove the usage of the cstack in Python to Python calls

2021-09-22 Thread STINNER Victor


Change by STINNER Victor :


--
nosy: +vstinner




[issue21302] time.sleep (floatsleep()) should use clock_nanosleep() on Linux

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 58f8adfda3c2b42f654a55500e8e3a6433cb95f2 by Victor Stinner in 
branch 'main':
bpo-21302: time.sleep() uses waitable timer on Windows (GH-28483)
https://github.com/python/cpython/commit/58f8adfda3c2b42f654a55500e8e3a6433cb95f2


--




[issue21302] time.sleep (floatsleep()) should use clock_nanosleep() on Linux

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:

Livius: your first PR modified Sleep() in Modules/_tkinter.c to use 
nanosleep(). I don't see the point since this function has a resolution of 1 ms 
(10^-3). Using select() on Unix is enough: resolution of 1 us (10^-6).

--




[issue41266] IDLE call hints and completions confused by ints and floats

2021-09-22 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

What is your point?  Code without explanation is useless.

--




[issue21302] time.sleep (floatsleep()) should use clock_nanosleep() on Linux

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:

bench.py measures the shortest possible sleep, using time.sleep(1e-10): 0.1 
nanoseconds. It should be rounded up to the resolution of the sleep function 
used, like 1 ns on Linux.

On Linux with Fedora 34 Python 3.10 executable, I get:

Mean +- std dev: 60.5 us +- 12.9 us (80783 values)

On Windows with a Python 3.11 debug build, I get:

Mean +- std dev: 21.9 ms +- 7.8 ms (228 values)

Sadly, it seems like on Windows 10, one of the following functions still uses 
the infamous 15.6 ms resolution:

* CreateWaitableTimerW()
* SetWaitableTimer()
* WaitForMultipleObjects()
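
The measurement can be sketched as follows (illustrative, not necessarily the 
attached bench.py):

import time

values = []
for _ in range(10_000):
    t0 = time.perf_counter()
    time.sleep(1e-10)   # 0.1 ns, rounded up to the timer's resolution
    values.append(time.perf_counter() - t0)
print(min(values), sum(values) / len(values))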

--
Added file: https://bugs.python.org/file50294/bench.py




[issue21302] time.sleep (floatsleep()) should use clock_nanosleep() on Linux

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:

> On Windows with a Python 3.11 debug build, I get:
> Mean +- std dev: 21.9 ms +- 7.8 ms (228 values)

I wrote an optimization to cache the Windows timer handle between time.sleep() 
calls (don't close it). I don't think it's needed, because the shortest sleep 
is about 15.6 ms and CreateWaitableTimerW() is likely far faster than that. So 
this optimization is basically useless.

--




[issue21302] time.sleep (floatsleep()) should use clock_nanosleep() on Linux

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:

Livius: do you care about using nanosleep(), or can I close the issue?

--




[issue40116] Regression in memory use of shared key dictionaries for "compact dicts"

2021-09-22 Thread Mark Shannon


Mark Shannon  added the comment:

This can be mitigated, if not entirely fixed, by storing an ordering bit vector 
in the values.

This way all instances of the class SometimesShared in the example above can 
share the keys.

The keys might be ("optional", "attr")

For any instances with only "attr" as an attribute, the values would be (NULL, 
value) and the order would be (1,)

The downsides of this approach are:
1. Each values object, and thus each dict, needs an extra 64-bit value.
2. Shared keys have a maximum size of 16.

Overall, I expect the improved sharing to more than compensate for the 
disadvantages.
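
A sketch of the kind of class this targets (the name SometimesShared is from 
the example referred to above; the body here is illustrative):

class SometimesShared:
    # instances created with shared=False get only "attr"; today that
    # splits the shared keys, with an ordering bit vector they could
    # keep sharing the keys ("optional", "attr")
    def __init__(self, shared=False):
        if shared:
            self.optional = None
        self.attr = 0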

--




[issue45260] Implement superinstruction UNPACK_SEQUENCE_ST

2021-09-22 Thread zcpara


Change by zcpara :


--
keywords: +patch
pull_requests: +26910
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/28519




[issue40116] Regression in memory use of shared key dictionaries for "compact dicts"

2021-09-22 Thread Mark Shannon


Change by Mark Shannon :


--
keywords: +patch
pull_requests: +26911
stage: test needed -> patch review
pull_request: https://github.com/python/cpython/pull/28520




[issue45264] venv: Make activate et al. export custom prompt prefix as an envvar

2021-09-22 Thread John Wodder


New submission from John Wodder :

I use a custom script (and I'm sure many others have similar scripts as well) 
for setting my prompt in Bash.  It shows the name of the current venv (if any) 
by querying the `VIRTUAL_ENV` environment variable, but if the venv was created 
with a custom `--prompt`, it is unable to use this prompt prefix, as the 
`activate` script does not make this information available.

I thus suggest that the `activate` et al. scripts should set and export an 
environment variable named something like `VIRTUAL_ENV_PROMPT_PREFIX` that 
contains the prompt prefix (either custom or default) that venv would prepend 
to the prompt.  Ideally, this should be set even when 
`VIRTUAL_ENV_DISABLE_PROMPT` is set in case the user wants total control over 
their prompt.

(This was originally posted as an issue for virtualenv at 
, and it was suggested to post 
it here for feedback.)

--
components: Library (Lib)
messages: 402444
nosy: jwodder
priority: normal
severity: normal
status: open
title: venv: Make activate et al. export custom prompt prefix as an envvar
type: enhancement
versions: Python 3.11




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

Thank you Victor, I don't know whether I can convince you or you will 
convince me, but this is very interesting one way or the other. And I 
hope that Tim Holy is interested too :-)

On Wed, Sep 22, 2021 at 01:51:35PM +, STINNER Victor wrote:

> > But timing results are not like that, the measurement errors are 
> > one-sided, not two: (..)
> 
> I suggest you to run tons of benchmarks and look at the distribution. 
> The reality is more complex than what you may think.

We are trying to find the ideal time that the code would take with no 
external factors slowing it. No noise, no other processes running, no 
operating system overhead. The baseline time that it would take to run 
the code if the CPU could dedicate every cycle to your code and nothing 
else.

(Obviously we cannot measure that time on a real computer where there is 
always other processes running, but we want the best estimate of that 
time we can reach.)

There may still be conditions that can shift the baseline, e.g. the data 
fits in the CPU cache or it doesn't, but that's not *environmental* 
noise except so far as the impact of other processes might push your 
data out of the cache.

So in a noisy system (a real computer) there is some ideal time that the 
code would take to run if the CPU was dedicated to running your code 
only, and some actual time (ideal time + environmental noise) that we 
can measure.

The environmental noise is never negative. That would mean that running 
more processes on the machine could make it faster. That cannot be.

> > measurement = true value + random error
> 
> In my experience, there is no single "true value", they are multiple 
> values. Concrete example where the Python randomized hash function 
> gives different value each time you spawn a new Python process:
>
> https://vstinner.github.io/journey-to-stable-benchmark-average.html
>
> Each process has its own "true value", but pyperf spawns 20 Python 
> processes :-)

Thank you for the link, I have read it.

Of course the more processes you spawn, the more environmental noise 
there is, which means more noise in your measurements.

If you run with 10 processes, or 100 processes, you probably won't get 
the same results for the two runs. (Sorry, I have not tried this myself. 
Consider this a prediction.)

You say:

"But how can we compare performances if results are random? Take the 
minimum? No! You must never (ever again) use the minimum for 
benchmarking! Compute the average and some statistics like the standard 
deviation"

but you give no reason for why you should not use the minimum. Without a 
reason why the minimum is *wrong*, I cannot accept that prohibition.

> Here is report is about a value lower than a single nanosecond: 
> "0.08ns/element".

An Intel i7 CPU performs at roughly 220,000 MIPS, so you can 
execute about 17 or 18 CPU instructions in 0.08 ns. (Depends on the 
instruction, of course.) AMD Ryzen Threadripper may be ten times as 
fast. Without knowing the CPU and code I cannot tell whether to be 
impressed or shocked at the time.

> Again, good luck with benchmarking, it's a hard problem ;-)

Thank you, and I do appreciate all the work you have put into it, even 
if you have not convinced me to stop using the minimum.

--




[issue45238] Fix debug() in IsolatedAsyncioTestCase

2021-09-22 Thread Łukasz Langa

Łukasz Langa  added the comment:


New changeset ecb6922ff2d56476a6cfb0941ae55aca5e7fae3d by Serhiy Storchaka in 
branch 'main':
bpo-45238: Fix unittest.IsolatedAsyncioTestCase.debug() (GH-28449)
https://github.com/python/cpython/commit/ecb6922ff2d56476a6cfb0941ae55aca5e7fae3d


--
nosy: +lukasz.langa




[issue45213] Frozen modules are looked up using a linear search.

2021-09-22 Thread Eric Snow


Eric Snow  added the comment:

On Wed, Sep 22, 2021 at 7:12 AM Dong-hee Na  wrote:
> I thought about the Trie implementation for this case.

On Wed, Sep 22, 2021 at 7:22 AM Marc-Andre Lemburg
 wrote:
> Perhaps a frozen dict could be used instead of the linear search.
>
> This could then also be made available as sys.frozen_modules for inspection 
> by applications and tools such as debuggers or introspection tools trying to 
> find source code (and potentially failing at this).

Both are worth exploring later.  FWIW, I was also considering
_Py_hashtable_t (from Include/internal/pycore_hashtable.h).

--




[issue21302] time.sleep (floatsleep()) should use clock_nanosleep() on Linux

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:

See also bpo-19007: "precise time.time() under Windows 8: use 
GetSystemTimePreciseAsFileTime".

--




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:

> but you give no reason for why you should not use the minimum.

See https://pyperf.readthedocs.io/en/latest/analyze.html#minimum-vs-average

I'm not really interested in convincing you. Use the minimum if you believe 
that it better fits your needs.

But pyperf will stick to the mean to get more reproducible benchmark results ;-)

There are many beliefs and assumptions made by people running benchmarks. As I 
wrote, do your own experiments ;-) Enjoy multimodal distributions ;-)

--




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread STINNER Victor


STINNER Victor  added the comment:

I suggest to close this issue. timeit works as expected. It has limitations, 
but that's ok. Previous attempts to enhance timeit were rejected.

--




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread Tim Holy


Tim Holy  added the comment:

To make sure it's clear, it's not 0.08ns/function call, it's a loop and it 
comes out to 0.08ns/element. The purpose of quoting that number was to compare 
to the CPU clock interval, which on my machine is ~0.4ns. Any operation that's 
less than 1 clock cycle is suspicious, but not automatically wrong because of 
SIMD (if the compiler generates such instructions for this operation, but I'm 
not sure how one checks that in Python). On my AVX2 processor, as many as 16 
`uint16` values could fit simultaneously, and so you can't entirely rule out 
times well below one clock cycle (although the need for load, manipulation, 
store, and increment means that it's not plausible to be 1/16th of the clock 
cycle).

Interestingly, increasing `number` does seem to make it consistent, without 
obvious transitions. I'm curious why the reported times are not "per number"; I 
find myself making comparisons using

list(map(lambda tm : tm / 1000, t.repeat(repeat=nrep, number=1000)))

Should the documentation mention that the timing of the core operation should 
be divided by `number`?

However, in the bigger picture of things I suspect this should be closed. I'll 
let others chime in first, in case they think documentation or other things 
need to be changed.

--




[issue45256] Remove the usage of the C stack in Python to Python calls

2021-09-22 Thread STINNER Victor


Change by STINNER Victor :


--
title: Remove the usage of the cstack in Python to Python calls -> Remove the 
usage of the C stack in Python to Python calls




[issue45238] Fix debug() in IsolatedAsyncioTestCase

2021-09-22 Thread Łukasz Langa

Change by Łukasz Langa :


--
pull_requests: +26912
pull_request: https://github.com/python/cpython/pull/28521




[issue45026] More compact range iterator

2021-09-22 Thread Łukasz Langa

Łukasz Langa  added the comment:

Since len timings for ranges of 100 items are negligible anyway, I personally 
still favor GH-28176 which is clearly faster during iteration.

--




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On the topic of average vs. minimum, it's interesting to see this pop up
every now and then. When I originally wrote pybench late in 1997, I used
the average, since it gave good results on my PC at the time.

Later on, before pybench was added to Tools/ in Python 2.5 in 2006, people
started complaining about sometimes getting weird results (e.g. negative
times due to the calibration runs not being stable enough). A lot of noise
was made by the minimum fans, so I switched to the minimum, which then made
the results more stable, but I left in the average figures as well.

Then some years later, people complained about pybench not being
good enough for comparing to e.g. PyPy, since those other implementations
optimized away some of the micro-benchmarks which were used in pybench.
It was then eventually removed, to not confuse people not willing
to try to understand what the benchmark suite was meant for, nor
understand the issues around running such benchmarks on more modern
CPUs.

CPUs have advanced a lot since the days pybench was written and so
reliable timings are not easy to get unless you invest in dedicated
hardware, custom OS and CPU settings and lots of time to calibrate
everything. See Victor's research for more details.

What we have here is essentially the same issue. timeit() is mostly
being used for micro-benchmarks, but those need to be run in dedicated
environments. timeit() is good for quick checks, but not really up
to the task of providing reliable timing results.

One of these days, we should ask the PSF or one of its sponsors to
provide funding and devtime to set up such a reliable testing
environment. One which runs not only high end machines, but also
average and lower end machines, and using different OSes,
so that we can detect performance regressions early and easily
on different platforms.

--
nosy: +lemburg




[issue45238] Fix debug() in IsolatedAsyncioTestCase

2021-09-22 Thread Łukasz Langa

Change by Łukasz Langa :


--
pull_requests: +26913
pull_request: https://github.com/python/cpython/pull/28522




[issue45238] Fix debug() in IsolatedAsyncioTestCase

2021-09-22 Thread Łukasz Langa

Łukasz Langa  added the comment:


New changeset 44396aaba9b92b3a38a4b422a000d1a8eb05185a by Łukasz Langa in 
branch '3.10':
[3.10] bpo-45238: Fix unittest.IsolatedAsyncioTestCase.debug() (GH-28449) 
(GH-28521)
https://github.com/python/cpython/commit/44396aaba9b92b3a38a4b422a000d1a8eb05185a


--




[issue45238] Fix debug() in IsolatedAsyncioTestCase

2021-09-22 Thread Łukasz Langa

Łukasz Langa  added the comment:


New changeset e06b0fddf69b933fe82f60d78a0f6248ca36a0a3 by Łukasz Langa in 
branch '3.9':
[3.9] bpo-45238: Fix unittest.IsolatedAsyncioTestCase.debug() (GH-28449) 
(GH-28522)
https://github.com/python/cpython/commit/e06b0fddf69b933fe82f60d78a0f6248ca36a0a3


--




[issue45238] Fix debug() in IsolatedAsyncioTestCase

2021-09-22 Thread Łukasz Langa

Łukasz Langa  added the comment:

Thanks, Serhiy! ✨ 🍰 ✨

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue41203] Replace references to OS X in documentation with macOS

2021-09-22 Thread miss-islington


Change by miss-islington :


--
pull_requests: +26915
pull_request: https://github.com/python/cpython/pull/28524




[issue41203] Replace references to OS X in documentation with macOS

2021-09-22 Thread miss-islington


Change by miss-islington :


--
nosy: +miss-islington
nosy_count: 6.0 -> 7.0
pull_requests: +26914
pull_request: https://github.com/python/cpython/pull/28523




[issue41203] Replace references to OS X in documentation with macOS

2021-09-22 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:


New changeset 36122e18148c5b6c78ebce1d36d514fd7cf250f5 by Serhiy Storchaka in 
branch 'main':
bpo-41203: Replace Mac OS X and OS X with macOS (GH-28515)
https://github.com/python/cpython/commit/36122e18148c5b6c78ebce1d36d514fd7cf250f5


--




[issue45265] Bug in append() method when appending a temporary list to an empty list using a for loop

2021-09-22 Thread Nima


New submission from Nima :

I want to make a list consisting of subsequences of another list,
for example:
--
input: array = [4, 1, 8, 2]
output that I expect: [[4], [4,1], [4, 1, 8], [4, 1, 8, 2]]
--
but output is: [[4, 1, 8, 2], [4, 1, 8, 2], [4, 1, 8, 2],  [4, 1, 8, 2]]
--
my code is:

num = [4, 1, 8, 2]
temp = []
sub = []
for item in num:
    temp.append(item)
    sub.append(temp)
--
I think it's a bug because append() must add an item to the end of the list, 
but here every time it adds temp to sub it changes the previous items as 
well. So if it's not a bug, please help me to write the correct code.
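
(A minimal fix, assuming the goal is the list of prefixes shown above: append 
a copy of temp, since appending temp itself stores a reference to the one list 
that keeps growing.)

num = [4, 1, 8, 2]
temp = []
sub = []
for item in num:
    temp.append(item)
    sub.append(temp.copy())   # snapshot; temp[:] or list(temp) also work
print(sub)   # [[4], [4, 1], [4, 1, 8], [4, 1, 8, 2]]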

--
components: Windows
messages: 402458
nosy: nima_fl, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: Bug in append() method when appending a temporary list to an empty 
list using a for loop
type: behavior




[issue40027] re.sub inconsistency beginning with 3.7

2021-09-22 Thread Brian


Brian  added the comment:

I just ran into this change in behavior myself.

It's worth noting that the new behavior appears to match Perl's behavior:

# perl -e 'print(("he" =~ s/e*\Z/ah/rg), "\n")'
hahah

--
nosy: +bsammon




[issue45265] Bug in append() method when appending a temporary list to an empty list using a for loop

2021-09-22 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

This is expected behavior.

To get a better understanding, see:  
https://docs.python.org/3/faq/programming.html#why-did-changing-list-y-also-change-list-x

To get help writing correct code, please do not use the bug tracker.  Consider 
posting to StackOverflow instead.

--
nosy: +rhettinger
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue40027] re.sub inconsistency beginning with 3.7

2021-09-22 Thread Brian


Brian  added the comment:

txt = ' test'
txt = re.sub(r'^\s*', '^', txt)

substitutes once because the * is greedy.

txt = ' test'
txt = re.sub(r'^\s*?', '^', txt)

substitutes twice, consistent with the \Z behavior.

--




[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread Tim Holy


Tim Holy  added the comment:

> And I hope that Tim Holy is interested too :-)

Sure, I'll bite :-). On the topic of which statistic to show, I am a real fan 
of the histogram. As has been pointed out, timing in the real world can be 
pretty complicated, and I don't think it does anyone good to hide that 
complexity. Even in cases where machines aren't busy doing other things, you 
can get weird multimodal distributions. A great example (maybe not relevant to 
a lot of Python benchmarks...) is in multithreaded algorithms where the main 
thread is both responsible for scheduling other threads and for a portion of 
the work. Even in an almost-idle machine, you can get little race conditions 
where the main thread decides to pick up some work instants before another 
thread starts looking for more work. That can generate peaks in the histogram 
that are separated by the time for one "unit" of work.

But if you have to quote one and only one number, I'm a fan of the minimum (as 
long as you can trust it---which relies on assuming that you've accurately 
calibrated away all the overhead of your timing apparatus, and not any more).

--




[issue45020] Freeze all modules imported during startup.

2021-09-22 Thread Brett Cannon


Brett Cannon  added the comment:

What if there isn't a pre-computed location for __file__? I could imagine 
a self-contained CPython build where there is no concept of a file location on 
disk for anything using this.

--




[issue45020] Freeze all modules imported during startup.

2021-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 22.09.2021 20:47, Brett Cannon wrote:
> What if there isn't a pre-computed location for __file__? I could 
> imagine a self-contained CPython build where there is no concept of a file 
> location on disk for anything using this.

This does work and is enough to make most code out there happy.

I use e.g. "/os.py" in PyRun. There is no os.py file to load,
but tracebacks and inspection tools work just fine with this.

--




[issue40116] Regression in memory use of shared key dictionaries for "compact dicts"

2021-09-22 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> Overall, I expect the improved sharing to more than
> compensate for the disadvantages.

I expect the opposite.  This makes all dicts pay a price (in space, 
initialization time, and complexity) for a micro-optimization of an uncommon 
case (the normal case is for __init__ to run and set all the keys in a 
consistent order).  It is likely that the "benefits" would never be felt in 
real-world applications, but the "disadvantages" would affect every Python 
program.

> The language specification says that the dicts maintain insertion 
> order, but the wording implies that this only to explicit 
> dictionaries, not instance attribute or other namespace dicts.

That is a quite liberal reading of the spec.  I would object to making instance 
and namespace dicts behave differently.  That would be a behavior regression 
and we would forever have to wrestle with the difference.

--




[issue45266] subtype_clear can not be called from derived types

2021-09-22 Thread Victor Milovanov


New submission from Victor Milovanov :

I am trying to define a type in C that derives from PyTypeObject.

I want to override tp_clear. To do so properly, I should call base type's 
tp_clear and have it perform its cleanup steps. PyTypeObject has a tp_clear 
implementation: subtype_clear. Problem is, it assumes the instance it gets is 
of a type, that does not override PyTypeObject's tp_clear, and behaves 
incorrectly in 2 ways:

1) it does not perform the usual cleanup, because in this code
base = type;
while ((baseclear = base->tp_clear) == subtype_clear) {
    if (Py_SIZE(base))
        clear_slots(base, self);  /* loop body as in CPython's typeobject.c */
    base = base->tp_base;
    assert(base);
}

the loop condition is immediately false, as my types overrode tp_clear

2) later on, it calls baseclear on the same object. But because of the loop 
above, baseclear actually points to my type's custom tp_clear implementation, 
which leads to re-entry into that function (effectively a stack overflow, 
unless there is a guard against it).

--
components: C API
messages: 402466
nosy: Victor Milovanov
priority: normal
severity: normal
status: open
title: subtype_clear can not be called from derived types
type: behavior
versions: Python 3.10, Python 3.11, Python 3.6, Python 3.7, Python 3.8, Python 
3.9




[issue45266] subtype_clear can not be called from derived types

2021-09-22 Thread Victor Milovanov


Victor Milovanov  added the comment:

To put it differently, if you think in terms of MRO, my custom type's MRO is

my_type_clear (from my type), subtype_clear (from PyTypeObject), etc

And subtype_clear incorrectly assumes that it is the first entry in the 
object's MRO list for tp_clear.

--




[issue40116] Regression in memory use of shared key dictionaries for "compact dicts"

2021-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 22.09.2021 21:02, Raymond Hettinger wrote:
>> The language specification says that the dicts maintain insertion 
>> order, but the wording implies that this applies only to explicit 
>> dictionaries, not instance attribute or other namespace dicts.
> 
> That is a quite liberal reading of the spec.  I would object to making 
> instance and namespace dicts behave differently.  That would be a behavior 
> regression and we would forever have to wrestle with the difference.

I agree. Keeping the insertion order is essential for many common
use cases, including those where a class or instance dict is used,
e.g. namespaces used for data records, data caches, field
definitions in data records, etc. (and yes, those often can be
dynamically extended as well :-)).

I think for the case you mention, a documentation patch would be
better and more helpful for the programmers. Point them to slots
and the sharing problem should go away in most cases :-)
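For example, a minimal sketch of the __slots__ route, which avoids
per-instance dicts (and with them the key-sharing question) altogether:

class Point:
    __slots__ = ("x", "y")  # no per-instance __dict__ is allocated

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(hasattr(p, "__dict__"))  # False: nothing to share or un-share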

--
nosy: +lemburg




[issue45262] crash if asyncio is used before and after re-initialization if using python embedded in an application

2021-09-22 Thread Andrew Svetlov


Andrew Svetlov  added the comment:

I guess the fix requires switching the _asyncio module's C extension types 
from static to heap types.
It is certainly possible, but it requires a non-trivial amount of work.

We need a champion for the issue.

--




[issue43976] Allow Python distributors to add custom site install schemes

2021-09-22 Thread Jason R. Coombs


Jason R. Coombs  added the comment:

Here's what I propose:

1. In pypa/distutils, add support for honoring the proposed install schemes 
(based on PR 25718). Merge with Setuptools.
2. Add whatever ugly hacks are needed to pypa/distutils to honor other 
Debian-specific behaviors (revive https://github.com/pypa/distutils/pull/4 but 
properly match Debian expectations). Mark these ugly hacks as deprecated.
3. In Debian, Fedora, etc, provide patches that configure the install schemes. 
Test with latest Setuptools and SETUPTOOLS_USE_DISTUTILS=local.
4. Formalize the install schemes support in CPython (as in PR 25718).
5. Ask Debian to propose a cleaner interface for Debian-specific needs.

--




[issue39549] The reprlib.Repr type should permit the “fillvalue” to be set by the user

2021-09-22 Thread Raymond Hettinger

Raymond Hettinger  added the comment:


New changeset 8c21941ddafdf4925170f9cea22e2382dd3b0184 by Alexander Böhn in 
branch 'main':
bpo-39549: reprlib.Repr uses a “fillvalue” attribute (GH-18343)
https://github.com/python/cpython/commit/8c21941ddafdf4925170f9cea22e2382dd3b0184
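For context, a short sketch of the new attribute (available in builds that
include this changeset):

import reprlib

r = reprlib.Repr()
r.maxlist = 3           # truncate list reprs after three elements
r.fillvalue = "*...*"   # the "..." filler used to be hard-coded
print(r.repr(list(range(10))))  # [0, 1, 2, *...*]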


--




[issue39549] The reprlib.Repr type should permit the “fillvalue” to be set by the user

2021-09-22 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-22 Thread Jeremy Maitin-Shepard


Change by Jeremy Maitin-Shepard :


--
keywords: +patch
pull_requests: +26916
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/28525




[issue45267] New install Python 3.9.7 install of Sphinx Document Generator fails

2021-09-22 Thread Paul Broe

New submission from Paul Broe :

Brand new build of Python 3.9.7 on RHEL 7.  Placed in /usr/local/python3

Created new python environment

cd /usr/opt/oracle/
 python3 -m venv py3-sphinx
source /usr/opt/oracle/py3-sphinx/bin/activate
Now verify that python is linked to Python 3; in this virtual environment, 
python points to python3:
python -V
Python 3.9.7

I installed all the prerequisites correctly for Sphinx 4.2.0.
See the output of the command in the attached file. Command:
pip install -vvv --no-index --find-link=/usr/opt/oracle/downloads/python-addons 
sphinx

--
components: Demos and Tools
files: Sphinx install output.txt
messages: 402472
nosy: pcbroe
priority: normal
severity: normal
status: open
title: New install Python 3.9.7 install of Sphinx Document Generator fails
versions: Python 3.9
Added file: https://bugs.python.org/file50295/Sphinx install output.txt




[issue45234] copy_file raises FileNotFoundError when src is a directory

2021-09-22 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Cherry-picked!

--
priority: release blocker -> 
stage: patch review -> resolved
status: open -> closed




[issue21302] time.sleep (floatsleep()) should use clock_nanosleep() on Linux

2021-09-22 Thread Benjamin Szőke

Change by Benjamin Szőke :


--
pull_requests: +26917
pull_request: https://github.com/python/cpython/pull/28526




[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-22 Thread Gregory P. Smith


Gregory P. Smith  added the comment:

I believe jbms is right that pausing the threads is the only right thing to do 
when they see tstate_must_exit.  The PR is likely correct.

--
versions: +Python 3.11, Python 3.9




[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-22 Thread Jeremy Maitin-Shepard


Jeremy Maitin-Shepard  added the comment:

I suppose calling `Py_Initialize`, `Py_FinalizeEx`, then `Py_Initialize` again, 
then `Py_FinalizeEx` again in an embedding application, was already not 
particularly well supported, since it would leak memory.

However, with this change it also leaks threads.  That is a bit unfortunate, 
but I suppose it is just another form of memory leak, and the user can avoid it 
by ensuring there are no daemon threads (of course even previously, the 
presence of any daemon threads meant additional memory leaking).

--




[issue45020] Freeze all modules imported during startup.

2021-09-22 Thread Gregory Szorc


Gregory Szorc  added the comment:

I'll throw out that __file__ is strictly optional, per 
https://docs.python.org/3/reference/datamodel.html. IMO it is more important to 
define importlib.abc.InspectLoader than __file__ (so tracebacks have source 
context).

Given that __file__ is optional, I'll throw out the idea of shipping modules 
without __file__/__cached__ and see what issues/pushback occurs. I anticipate 
this will likely result in having to restore advertising __file__ in frozen 
modules in 3.11 due to popular demand. But we /could/ use this as an 
opportunity to start attempting to ween people off __file__ and onto more 
appropriate APIs, like importlib.resources. (importlib.resources was introduced 
in 3.7 and I'm not sure that's old enough to force people onto without relying 
on a shim like https://pypi.org/project/importlib-resources/.)
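As a sketch of the importlib.resources route (the package name mypkg and the
resource data.txt are hypothetical):

from importlib import resources

# files() is the 3.9+ API; on 3.7/3.8 the read_text() helper or the
# importlib-resources backport covers the same ground; no __file__ is needed.
text = resources.files("mypkg").joinpath("data.txt").read_text(encoding="utf-8")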

Alternatively, defining a path hook referencing the python executable that 
knows how to service frozen modules could also work. zipimporter does this 
(__file__ is the path to the zip archive joined with the module's path inside 
the archive) and it is surprisingly not as brittle as one would think. It 
might be a good middle ground.

--




[issue45026] More compact range iterator

2021-09-22 Thread Dennis Sweeney


Dennis Sweeney  added the comment:

I benchmarked GH-27986 and GH-28176 on "for i in range(1): pass" and found 
that GH-27986 was faster for this (likely most common) case of relatively small 
integers.

Mean +- std dev: [main] 204 us +- 5 us -> [GH-27986] 194 us +- 4 us: 1.05x 
faster
Mean +- std dev: [main] 204 us +- 5 us -> [GH-28176] 223 us +- 6 us: 1.09x 
slower

It's possible to have different implementations for small/large integers, but 
IMO it's probably best to keep consistency and go with GH-27986.

--




[issue45268] use multiple "in" in one expression?

2021-09-22 Thread zeroswan

New submission from zeroswan :

I find it's a valid expression: `1 in [1, 2, 3] in [4, 5, 6]`

`a in b in c` is equivalent to `a in b and b in c` 

But this expression seems useless, and it is easy to confuse with `(a in b) in c`.
In my program, what I originally wanted was `if a in b and a in c`, but it was 
mistakenly written as `a in b in c`.

This expression is similar to chained comparisons like `a < b < c`.
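A quick demonstration of how the chained form actually evaluates:

b = [1, 2, 3]
c = [4, 5, 6]

print(1 in b in c)            # False: (1 in b) and (b in c), and b in c fails
print((1 in b) and (b in c))  # False: the expanded equivalent
print((1 in b) in c)          # also False, but for a different reason:
                              # it asks whether True is an element of c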




[issue45268] use multiple "in" in one expression?

2021-09-22 Thread Dennis Sweeney

Dennis Sweeney  added the comment:

This is the expected behavior, documented here: 
https://docs.python.org/3/reference/expressions.html#comparisons


That page says:

* The comparison operators are  "<" | ">" | "==" | ">=" | "<=" | "!=" | 
"is" ["not"] | ["not"] "in"

* "Comparisons can be chained arbitrarily"

* "Note that a op1 b op2 c doesn’t imply any kind of comparison between a 
and c, so that, e.g., x < y > z is perfectly legal (though perhaps not pretty)."


So I'll close this for now.

I think it would be hard to change this behavior without introducing a needless 
backwards-incompatibility, but if you have a proposal, you could bring it up on 
the Python-Ideas mailing list.

--
nosy: +Dennis Sweeney
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed
