Inada Naoki added the comment:
New changeset 4216dce04b7d3f329beaaafc82a77c4ac6cf4d57 by Inada Naoki in branch
'main':
bpo-47000: Make `io.text_encoding()` respects UTF-8 mode (GH-32003)
https://github.com/python/cpython/commit/4216dce04b7d3f329beaaafc82a77c
Inada Naoki added the comment:
> Please see https://bugs.python.org/issue47000#msg415769 for what Victor
> suggested.
Of course, I read it.
> In particular, the locale module uses the "no underscore" convention.
> Not sure whether it's good to start using snake c
Inada Naoki added the comment:
@vstinner Since UTF-8 mode affects `locale.getpreferredencoding(False)`, I need
to decide on an alternative API in PEP 686.
If there are no objections, I will choose `locale.get_encoding()` for the current locale
encoding (ACP on Windows).
See https://github.com/python/peps
Inada Naoki added the comment:
OK. Cache efficiency is dropped from the motivations list.
Current motivations are:
* Memory saving (currently, 4 bytes objects (= 32 bytes of ob_shash) per code
object).
* Make bytes objects immutable
* Share objects among multiple interpreters.
* CoW efficiency.
I
Inada Naoki added the comment:
> I guess not much difference in benchmarks.
> But if put a bytes object into multiple dicts/sets, and len(bytes_key) is
> large, it will take a long time. (1 GiB 0.40 seconds on i5-11500 DDR4-3200)
> The length of bytes can be arbitrary,so computing
Inada Naoki added the comment:
First of all, this only deprecates direct access to `ob_shash`. Users will
need to use `PyObject_Hash()` instead.
We are not making the final decision about removing it. We are just making it
possible to remove it in Python 3.13.
RAM and CACHE efficiency is not the
Inada Naoki added the comment:
I am not sure whether we really need "locale encoding at Python startup".
For this issue, I don't want to change `encoding="locale"` behavior except
to ignore UTF-8 mode. So what I want is "current locale encoding" or
ANSI code
Inada Naoki added the comment:
New changeset 894d0ea5afa822c23286e9e68ed80bb1122b402d by Inada Naoki in branch
'main':
bpo-46864: Suppress deprecation warnings for ob_shash. (GH-32042)
https://github.com/python/cpython/commit/894d0ea5afa822c23286e9e68ed80b
Inada Naoki added the comment:
Average RAM capacity doesn't grow as the number of CPU cores grows.
Additionally, L1+L2 cache is a really limited resource compared to CPU or RAM.
A bytes object is used for co_code, which is hot. So cache efficiency is important.
Would you give us more realistic (or real
Change by Inada Naoki :
--
pull_requests: +30157
pull_request: https://github.com/python/cpython/pull/32068
___
Python tracker
<https://bugs.python.org/issue47
Inada Naoki added the comment:
> * sys.getfilesystemencoding(): Python filesystem encoding, return "UTF-8" if
> the Python UTF-8 Mode is enabled
Yes, although PYTHONLEGACYWINDOWSFSENCODING takes priority.
> * locale.getencoding(): Get the locale encoding, LC_CTYPE locale
Inada Naoki added the comment:
Since the hash is randomized, using hash(bytes) for such a use case is not
recommended. Users should use a stable hash function instead.
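A minimal sketch of that advice, assuming SHA-256 as the stable hash (the comment does not name one) and using a hypothetical helper `stable_digest`: the built-in `hash()` of a bytes object is salted per process unless `PYTHONHASHSEED` is fixed, while a `hashlib` digest depends only on the bytes.

```python
import hashlib


def stable_digest(data: bytes) -> str:
    """Return a hex digest that is identical across processes and runs.

    Hypothetical helper for illustration; SHA-256 is an assumed choice.
    """
    return hashlib.sha256(data).hexdigest()


key = b"abc"
# hash(key) may differ between interpreter runs, so it should not be persisted.
print(hash(key))
# The digest depends only on the content of the bytes object.
print(stable_digest(key))
```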
I agree that there are a few use cases where this change causes a performance regression.
But it is really few compared to overhead of
Inada Naoki added the comment:
Since Python 3.13, yes. It will be a bit slower.
--
___
Python tracker
<https://bugs.python.org/issue46864>
___
___
Python-bug
Inada Naoki added the comment:
I'm sorry. Maybe ccache hid the warning from me.
--
___
Python tracker
<https://bugs.python.org/issue46864>
___
___
Pytho
Change by Inada Naoki :
--
pull_requests: +30132
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/32042
___
Python tracker
<https://bugs.python.org/issu
Inada Naoki added the comment:
> As you can see, the location of the failing test in the log is masked, and
> instead the description is present.
Could you elaborate?
```
test_index_empty (idlelib.idle_test.test_text.MockTextTest)
Failing test with bad description. ... ERROR
Change by Inada Naoki :
--
keywords: +patch
pull_requests: +30091
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/32003
___
Python tracker
<https://bugs.python.org/issu
Inada Naoki added the comment:
Thank you. I agree that inlining is worthwhile.
But we already inlined too many functions in ceval and there is an issue caused
by it... (bpo-45116)
--
___
Python tracker
<https://bugs.python.org/issue47
Inada Naoki added the comment:
I created another topic related to this issue.
https://discuss.python.org/t/add-legacy-text-encoding-option-to-make-utf-8-default/14281
If we add another option (e.g. legacy_text_encoding), we do not need to change
UTF-8 mode behavior
Inada Naoki added the comment:
I know chm is handy. But Microsoft has already abandoned it.
I think we should stop providing chm.
--
___
Python tracker
<https://bugs.python.org/issue35
Inada Naoki added the comment:
Hmm. Would you measure the benefit of inlining and of skipping incref/decref
separately?
If the benefit of inlining is very small, making _PyList_AppendTakeRef() a regular
internal API looks better to me.
--
nosy: +methane
Inada Naoki added the comment:
I created a related topic on discuss.python.org.
https://discuss.python.org/t/jep-400-utf-8-by-default-and-future-of-python/14246
If we recommend `PYTHONUTF8` as opt-in "UTF-8 by default", `encoding="locale"`
should locale encoding in UTF-
Inada Naoki added the comment:
Thanks.
--
___
Python tracker
<https://bugs.python.org/issue39829>
___
___
Python-bugs-list mailing list
Unsubscribe:
Change by Inada Naoki :
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Inada Naoki added the comment:
New changeset 2153daf0a02a598ed5df93f2f224c1ab2a2cca0d by Crowthebird in branch
'main':
bpo-39829: Fix `__len__()` is called twice in list() constructor (GH-31816)
https://github.com/python/cpython/commit/2153daf0a02a598ed5df93f2f224c1
New submission from Inada Naoki :
Currently, `encoding="locale"` is just a shortcut for
`encoding=locale.getpreferredencoding(False)`.
`encoding="locale"` means that "locale encoding should be used here, even if
Python default encoding is changed to UTF-8".
Inada Naoki added the comment:
> Changes compared here:
> https://github.com/python/cpython/compare/main...thatbirdguythatuknownot:patch-17
Looks good to me. Would you create a pull request?
--
___
Python tracker
<https://bugs.p
Inada Naoki added the comment:
Related issue: https://twitter.com/nedbat/status/1489233208713437190
The current overallocation strategy is rough. We need to make it smoother.
--
versions: +Python 3.11 -Python 3.9
___
Python tracker
<ht
Change by Inada Naoki :
--
nosy: +methane
___
Python tracker
<https://bugs.python.org/issue43574>
___
___
Python-bugs-list mailing list
Unsubscribe:
Change by Inada Naoki :
--
nosy: +methane
___
Python tracker
<https://bugs.python.org/issue39829>
___
___
Python-bugs-list mailing list
Unsubscribe:
Inada Naoki added the comment:
I don't know much about Java, but Java's WeakHashMap is the same as Python's
WeakKeyDictionary.
https://docs.oracle.com/javase/9/docs/api/java/util/WeakHashMap.html
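For the Python side of that comparison, a small sketch of how `weakref.WeakKeyDictionary` behaves; the `Key` class and variable names are made up for illustration. An entry disappears once the last strong reference to its key is gone.

```python
import gc
import weakref


class Key:
    """Plain class; its instances support weak references by default."""


cache = weakref.WeakKeyDictionary()
key = Key()
cache[key] = "value tied to key"
size_while_alive = len(cache)

# Drop the last strong reference; the entry goes away together with the key.
del key
gc.collect()  # CPython frees the key by refcount; collect() covers other cases
size_after_del = len(cache)

print(size_while_alive, size_after_del)
```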
"""
This class is intended primarily for use with key objects whose e
Change by Inada Naoki :
--
resolution: -> not a bug
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Inada Naoki added the comment:
New changeset 2d8b764210c8de10893665aaeec8277b687975cd by Inada Naoki in branch
'main':
bpo-46864: Deprecate PyBytesObject.ob_shash. (GH-31598)
https://github.com/python/cpython/commit/2d8b764210c8de10893665aaeec827
Change by Inada Naoki :
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Change by Inada Naoki :
--
resolution: -> rejected
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Inada Naoki added the comment:
OK. By quick grepping, I found that only msgpack and bitstruct use these APIs.
That is not enough users to make them public.
--
___
Python tracker
<https://bugs.python.org/issue46
Inada Naoki added the comment:
New changeset 4f74052b455a54ac736f38973693aeea2ec14116 by Inada Naoki in branch
'main':
bpo-40116: dict: Add regression test for iteration order. (GH-31550)
https://github.com/python/cpython/commit/4f74052b455a54ac736f38973693ae
Change by Inada Naoki :
--
keywords: +patch
pull_requests: +29769
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/31649
___
Python tracker
<https://bugs.python.org/issu
Change by Inada Naoki :
--
nosy: +methane
___
Python tracker
<https://bugs.python.org/issue46903>
___
___
Python-bugs-list mailing list
Unsubscribe:
New submission from Inada Naoki :
Original issue: https://github.com/msgpack/msgpack-python/issues/497
_PyFloat_(Pack|Unpack)(4|8) is a very nice API for serializers like msgpack.
Converting double and float into char[] is not trivial and these APIs do it in
a very efficient way.
And these APIs
Change by Inada Naoki :
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Inada Naoki added the comment:
New changeset 9833bb91e4d5c2606421d9ec2085f5c2dfb6f72c by Inada Naoki in branch
'main':
bpo-46845: Reduce dict size when all keys are Unicode (GH-31564)
https://github.com/python/cpython/commit/9833bb91e4d5c2606421d9ec2085f5
Inada Naoki added the comment:
Can we use --lto=thin when available?
And can we not use --lto when building profiling python?
--
nosy: +methane
___
Python tracker
<https://bugs.python.org/issue45
Inada Naoki added the comment:
With shash removed:
```
## small key
$ ./python -m pyperf timeit --compare-to ../cpython/python -s 'd={b"foo":1,
b"bar":2, b"buzz":3}' -- 'b"key" in d'
/home/inada-n/work/python/cpython/python: ..
Inada Naoki added the comment:
I added _PyDict_FromItems() to the PR.
It checks whether all keys are Unicode before creating the dict.
_PyDict_NewPresized() just returns a general-purpose dict. But it isn't used in
CPython core. It is just kept for compatibility (for Cython).
```
$ ./p
Inada Naoki added the comment:
> But some programs can still work with encoded bytes instead of strings. In
> particular os.environ and os.environb are implemented as dict of bytes on
> non-Windows.
This change doesn't affect os.environ.
os.environ[key] do
Inada Naoki added the comment:
In most cases, the first PyDict_SetItem decides which format should be used.
But _PyDict_NewPresized() can be a problem. It creates a hash table before
inserting the first key, when 5 < (expected size) < 87382.
In CPython code base, _PyDict_NewPresized() is
Change by Inada Naoki :
--
keywords: +patch
pull_requests: +29721
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/31598
___
Python tracker
<https://bugs.python.org/issu
New submission from Inada Naoki :
Code objects now have more and more bytes attributes.
To reduce the RAM used by code, I want to remove ob_shash (cached hash value) from
the bytes object.
Sets and dicts have their own hash cache.
Unless checking same bytes object against dicts/sets many times, this
Inada Naoki added the comment:
>
>
> Do you propose to
> 1. Only use StringKeyDicts when non-string keys are not possible? (Where
> would this be?)
> 2. Switch to a normal dict when a non-string key is added? (But likely
> not switch back when the last non-string
Change by Inada Naoki :
--
keywords: +patch
pull_requests: +29686
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/31564
___
Python tracker
<https://bugs.python.org/issu
Inada Naoki added the comment:
New changeset ad6c7003e38a9f8bdf8d865fb5fa0f3c03690315 by Inada Naoki in branch
'main':
bpo-46606: Remove redundant +1. (GH-31561)
https://github.com/python/cpython/commit/ad6c7003e38a9f8bdf8d865fb5fa0f
Change by Inada Naoki :
--
resolution: -> rejected
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Change by Inada Naoki :
--
pull_requests: +29684
pull_request: https://github.com/python/cpython/pull/31561
___
Python tracker
<https://bugs.python.org/issue46
Inada Naoki added the comment:
PyDict_Keys(), PyDict_Values(), and PyDict_Items() don't respect insertion
order either.
--
___
Python tracker
<https://bugs.python.org/is
New submission from Inada Naoki :
Currently, PyDictKeyEntry is 24 bytes (hash, key, and value).
We can drop the hash from the entry when all keys are unicode, because unicode
objects already cache their hash.
This will cause some performance regression on microbenchmark because dict need
one more
Change by Inada Naoki :
--
pull_requests: +29671
pull_request: https://github.com/python/cpython/pull/31550
___
Python tracker
<https://bugs.python.org/issue40
Inada Naoki added the comment:
I found a regression caused by GH-28520.
```
class C:
    def __init__(self, n):
        if n:
            self.a = 1
            self.b = 2
            self.c = 3
        else:
            self.c = 1
            self.b = 2
            self.a = 3
o1 = C(True)
o2
Inada Naoki added the comment:
All of these optimizations should be disabled by default.
* They will cause leaks when Python is embedded.
* Even for the python command, they will break __del__ and weakref callbacks.
--
___
Python tracker
<ht
Change by Inada Naoki :
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Inada Naoki added the comment:
New changeset 74127b89a8224d021fc76f679422b76510844ff9 by Inada Naoki in branch
'main':
bpo-46606: Reduce stack usage of getgroups and setgroups (GH-31073)
https://github.com/python/cpython/commit/74127b89a8224d021fc76f679422b7
Inada Naoki added the comment:
As I commented in https://github.com/faster-cpython/ideas/discussions/288, your
benchmark is not fair.
Include `{}` and `{}.resize(len(cases))` into the measured function.
--
nosy: +methane
___
Python tracker
Inada Naoki added the comment:
> Generally speaking, parsing some things as decimal or datetime are schema
> dependent.
Totally agree with this.
> In order to provide maximal flexibility it would be much nicer to have a
> streaming interface available (like SAX for XML parsin
Inada Naoki added the comment:
I think making more objects immortal by default will reduce the gap, although I
am not sure it can be 2%. (I guess 3%, and I think that is an acceptable gap.)
* Code attributes (contents of co_consts, co_names, etc...) in deep frozen
modules.
* only if
Inada Naoki added the comment:
Thank you. I could not find it because it is too old.
--
resolution: -> duplicate
stage: patch review -> resolved
status: open -> closed
superseder: -> Add sys.isinterned()
___
Python tracker
<https://
Inada Naoki added the comment:
I thought sys.is_interned() is needed to implement bpo-46430, but GH-30683
looks nice to me.
I will close this issue after GH-30683 is merged.
--
___
Python tracker
<https://bugs.python.org/issue46
Change by Inada Naoki :
--
keywords: +patch
pull_requests: +29397
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/31227
___
Python tracker
<https://bugs.python.org/issu
New submission from Inada Naoki :
deepfreeze.py needs to know whether a unicode object is interned.
Ref: https://bugs.python.org/issue46430
--
components: Interpreter Core
messages: 412890
nosy: methane
priority: normal
severity: normal
status: open
title: Add sys.is_interned
versions
Inada Naoki added the comment:
I didn't mean _Py_abspath is a problem. I just used it to describe why -O0 and
-Og are so different.
We can reduce its stack usage easily, but it is less of a problem than
_PyEval_EvalFrameDefault.
It is difficult to reduce stack usage of _PyEval_EvalFrameDe
Change by Inada Naoki :
--
keywords: +patch
pull_requests: +29257
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/31073
___
Python tracker
<https://bugs.python.org/issu
New submission from Inada Naoki :
I checked stack usage for bpo-46600 and found these two functions use a lot of
stack.
os_setgroups: 262200 bytes
os_getgroups_impl: 262184 bytes
Both functions have a local variable like this:
gid_t grouplist[MAX_GROUPS];
MAX_GROUPS is defined as
Inada Naoki added the comment:
FWIW, it seems -O0 doesn't merge local variables in different paths or lifetimes.
For example, see _Py_abspath
```
if (path[0] == '\0' || !wcscmp(path, L".")) {
    wchar_t cwd[MAXPATHLEN + 1];
    //(snip)
}
//(snip)
Change by Inada Naoki :
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Inada Naoki added the comment:
No. I am just waiting for Python 3.11 to reach beta.
--
___
Python tracker
<https://bugs.python.org/issue36346>
___
___
Python-bugs-list m
Inada Naoki added the comment:
We have not had *fill* since Python 3.6.
There is a `dk_nentries` instead. But when `insertion_resize()` is called,
`dk_nentries` is equal to `USABLE_FRACTION(dk_size)` (dk_size is `1 <<
dk_log2_size` for now). So it is different from *fill* in the old di
Change by Inada Naoki :
--
nosy: +methane
___
Python tracker
<https://bugs.python.org/issue44723>
___
___
Python-bugs-list mailing list
Unsubscribe:
Inada Naoki added the comment:
> The only way to safely launch worker processes on demand is to spawn a worker
> launcher process spawned prior to any thread creation that remains idle, with
> a sole job of spawn new worker processes for us. That sounds complicated.
> That
Inada Naoki added the comment:
> If we literally ignore the attribute, any usage of `.mapping` will be an
> error, which basically makes the whole `.mapping` feature useless for
> statically typed code. It also wouldn't appear in IDE autocompletions.
`.mapping` is not exist
Inada Naoki added the comment:
In other words,
a. If `.keys()` in all dict subclasses must return subclass of `dict_keys`:
`dict.keys() -> dict_keys`.
b. If `.keys().mapping` must be accessible for all dict subclasses: Add
`.mapping` to `KeysView`.
c. If `.keys().mapping` is optional
Inada Naoki added the comment:
> I agree with Inada that not every internal type should be exposed, but I
> would make an exception for the dict views classes due to the fact that dict
> subclasses are much more common than subclasses of other mappings, such as
> OrderedDict. I
Inada Naoki added the comment:
I am not happy about exposing every internal type. I prefer duck typing.
Like OrderedDict, not all dict subtypes use `dict_keys`, `dict_values`, and
`dict_items`.
If typeshed annotate dict.keys() returns `dict_keys`, "incompatible override"
c
Change by Inada Naoki :
--
nosy: +methane
nosy_count: 4.0 -> 5.0
pull_requests: +28860
pull_request: https://github.com/python/cpython/pull/30659
___
Python tracker
<https://bugs.python.org/issu
Inada Naoki added the comment:
Mercurial still uses it.
https://www.mercurial-scm.org/repo/hg-stable/file/tip/mercurial/pycompat.py#l113
Mercurial has a plan to move filesystem names from ANSI Code Page to UTF-8, but I
don't know about its progress.
https://www.mercurial-scm.org
Inada Naoki added the comment:
collections.abc.Mapping is fixed by https://bugs.python.org/issue43977
We can do the same thing if backward compatibility allows it.
--
nosy: +methane
___
Python tracker
<https://bugs.python.org/issue46
Inada Naoki added the comment:
New changeset 0b2b9d251374c5ed94265e28039f82b37d039e3e by Inada Naoki in branch
'main':
bpo-23882: unittest: Drop PEP 420 support from discovery. (GH-29745)
https://github.com/python/cpython/commit/0b2b9d251374c5ed94265e28039f82
Inada Naoki added the comment:
I am not against deep freezing functools and contextlib.
But I think we should optimize and utilize zipimport or something similar,
because we can not deep-freeze all stdlib or 3rd party libraries.
See also:
https://github.com/faster-cpython/ideas/discus
Change by Inada Naoki :
--
keywords: +patch
pull_requests: +28615
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/30409
___
Python tracker
<https://bugs.python.org/issu
Inada Naoki added the comment:
UTF-8 mode is not enabled by default. So locale encoding is still the default
encoding.
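To check this on a given interpreter, a short sketch using only standard-library calls: `sys.flags.utf8_mode` (Python 3.7+) is nonzero when UTF-8 mode is on, for example via `-X utf8` or `PYTHONUTF8=1`, and `locale.getpreferredencoding(False)` should report the encoding that text I/O falls back to when no `encoding` argument is given. The variable names are only illustrative.

```python
import locale
import sys

# True only when UTF-8 mode is active for this interpreter.
utf8_mode = bool(sys.flags.utf8_mode)

# The locale encoding normally, or UTF-8 when UTF-8 mode is enabled.
default_text_encoding = locale.getpreferredencoding(False)

print("UTF-8 mode enabled:", utf8_mode)
print("default text encoding:", default_text_encoding)
```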
--
nosy: +methane
___
Python tracker
<https://bugs.python.org/issue46
Inada Naoki added the comment:
Nice catch.
> if ((kind & _odict_ITER_KEYS) && (kind &_odict_ITER_VALUES))
You can reduce one branch by
```
#define _odict_ITER_ITEMS (_odict_ITER_KEYS|_odict_ITER_VALUES)
...
if (kind & _odict_ITER_ITEMS == _odict_ITER_ITEM
Inada Naoki added the comment:
That's too bad.
We can no longer compare two Unicode objects by pointer even if both are interned...
It was a nice optimization.
--
___
Python tracker
<https://bugs.python.org/is
Inada Naoki added the comment:
Should `_PyUnicode_EqualToASCIIId()` support comparing two unicode objects from
different interpreters?
--
nosy: +methane
___
Python tracker
<https://bugs.python.org/issue46
Change by Inada Naoki :
--
versions: +Python 3.11 -Python 3.10, Python 3.8, Python 3.9
___
Python tracker
<https://bugs.python.org/issue23882>
___
___
Python-bug
Change by Inada Naoki :
--
pull_requests: +27982
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/29745
___
Python tracker
<https://bugs.python.org/issu
Inada Naoki added the comment:
The other error I found is already reported as #42868.
--
___
Python tracker
<https://bugs.python.org/issue38625>
___
___
Pytho
Change by Inada Naoki :
--
resolution: -> fixed
stage: -> resolved
status: open -> closed
versions: +Python 3.8, Python 3.9
___
Python tracker
<https://bugs.python.or
Inada Naoki added the comment:
I confirmed that this bug is fixed, but I found another error.
--
___
Python tracker
<https://bugs.python.org/issue38625>
___
___
Inada Naoki added the comment:
Is this bug fixed by #26730?
--
___
Python tracker
<https://bugs.python.org/issue38625>
___
___
Python-bugs-list mailin
Inada Naoki added the comment:
While trying to understand this issue, I saw this segfault.
https://gist.github.com/methane/1b83e2abc6739017e0490c5f70a27b52
I am not sure whether this segfault is caused by this issue. If this is
unrelated, I will create another issue
Change by Inada Naoki :
--
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
___
Python tracker
<https://bugs.python.or
Inada Naoki added the comment:
New changeset 0a4c82ddd34a3578684b45b76f49cd289a08740b by Inada Naoki in branch
'main':
bpo-45475: Revert `__iter__` optimization for GzipFile, BZ2File, and LZMAFile.
(GH-29016)
https://github.com/python/cpython/commit/0a4c82ddd34a3578684b45b76f49cd