[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: For historical reasons. In Python 2, str and unicode consisting of ASCII characters can be equal. Equal values should have the same hash. In Python 3, bytes and str are always different. This can cause subtle bugs in the code ported from Python 2.
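
A minimal Python 3 sketch of the point above, assuming CPython (the matching hashes are an implementation detail, not a language guarantee). An ASCII-only str and the bytes object with the same content hash to the same value within one process, yet they never compare equal, so a dict keeps them as two separate keys. The variable names are illustrative only.

```python
# Assumption: CPython 3.x. The hash values differ between runs because of
# hash randomization, but str and bytes share the same seed within one process.
text_key = 'abc'
bytes_key = b'abc'

# Same hash for the ASCII-only str and the equivalent bytes object.
same_hash = hash(text_key) == hash(bytes_key)

# In Python 3 the two objects are never equal, unlike str/unicode in Python 2.
equal = text_key == bytes_key

# Equal hashes only mean the keys probe the same slot first; because the keys
# are unequal, the dict stores both entries.
table = {text_key: 'str entry', bytes_key: 'bytes entry'}

print(same_hash, equal, len(table))
```

Code ported from Python 2 that relied on `'abc' == u'abc'` to find a single dict entry would silently get two entries here, which is the kind of subtle bug the comment refers to.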

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Ma Lin
Ma Lin added the comment: One scenario is caching regular expressions: b'[a-z]' and '[a-z]' may exist in the same dict. Anyway, it's trivial on the whole.

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Stefan Behnel
Stefan Behnel added the comment: > maybe this can be changed in Python 4.0 Well, if you find a *very* good reason for changing it, as I said. Py4 won't be special in that regard, I suppose.

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Ma Lin
Ma Lin added the comment: > I'd advise against changing the hash function without a very good reason. You never know how much code relies on it in one way or another. OK, maybe this can be changed in Python 4.0.

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Stefan Behnel
Stefan Behnel added the comment: > why bytes and str generates the same hash value for ASCII sequence Probably mostly for historical Py2 reasons. These days, both are somewhat unlikely to appear in the same dict. But still, I'd advise against changing the hash function without a very good re

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Ma Lin
Ma Lin added the comment: Thanks for the review. I don't know why bytes and str generate the same hash value for an ASCII sequence:
>>> hash('abc') == hash(b'abc')
True
This may cause some hash collisions; does it affect performance slightly?

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset a1d14253066f7dd60cfb465c6511fa565f312b42 by Serhiy Storchaka (animalize) in branch 'master': bpo-35636: Remove redundant check in unicode_hash(). (GH-11402) https://github.com/python/cpython/commit/a1d14253066f7dd60cfb465c6511fa565f312b42

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Serhiy Storchaka
Change by Serhiy Storchaka: pull_requests: -10785

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Serhiy Storchaka
Change by Serhiy Storchaka: pull_requests: -10784

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Serhiy Storchaka
Change by Serhiy Storchaka: versions: -Python 3.6, Python 3.7

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Stefan Behnel
Stefan Behnel added the comment: Unlikely to get changed in Py3.4/5 anymore, since this is not even a bug fix. I wouldn't even fight for backporting, although 3.7 seems ok for it. I agree that this code duplication is worth removing. I don't consider hashing the empty string important enough

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Ma Lin
Ma Lin added the comment: This redundancy has existed since Python 3.4 or earlier. title: remove redundant code in unicode_hash(PyObject *self) -> remove redundant check in unicode_hash(PyObject *self); type: enhancement -> performance; versions: +Python 3.4, Python 3.5