New submission from STINNER Victor:

In Python 3.4, str==str is implemented by calling memcmp().

The unicode_eq() function, used by the dict and set types, checks the first
byte before calling memcmp(). bytes==bytes uses the same check.

The Py_UNICODE_MATCH macro has checked the first *and* last character before
calling memcmp() since this commit:
---
changeset:   38242:0de9a789de39
branch:      legacy-trunk
user:        Fredrik Lundh <fred...@pythonware.com>
date:        Tue May 23 10:10:57 2006 +0000
files:       Include/unicodeobject.h
description:
needforspeed: check first *and* last character before doing a full memcmp
---

The attached patch changes str==str to check the first and last character
before calling memcmp(). It may avoid the overhead of a C function call, and it
is much faster when comparing two different strings of the same length that
share a common prefix but differ in their suffix.
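
To illustrate the idea, here is a minimal sketch of a first-and-last check in
front of memcmp(). It is not the code from unicode_eq.patch: the helper name
buffers_equal() is hypothetical, it works on plain byte buffers, and it assumes
the caller has already verified that both buffers have the same length (a real
str comparison would also have to compare the string kind).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: compare two byte buffers
   of the same length. Returns 1 if they are equal, 0 otherwise. */
static int
buffers_equal(const unsigned char *a, const unsigned char *b, size_t len)
{
    assert(a != NULL && b != NULL);
    if (len == 0)
        return 1;                       /* two empty buffers are equal */
    if (a[0] != b[0])
        return 0;                       /* first byte differs */
    if (a[len - 1] != b[len - 1])
        return 0;                       /* last byte differs: this catches
                                           the "common prefix" case without
                                           scanning the whole buffer */
    return memcmp(a, b, len) == 0;      /* full comparison as a fallback */
}
```

With two long buffers that differ only in their final byte, the function
returns after two byte comparisons instead of letting memcmp() walk the whole
common prefix.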

The patch also merges unicode_compare_eq() and unicode_eq() so that str, dict
and set share the same code.

We may use the same optimization on byte strings.

See also #16321.

----------
files: unicode_eq.patch
keywords: patch
messages: 185956
nosy: haypo, pitrou, serhiy.storchaka
priority: normal
severity: normal
status: open
title: str==str: compare the first and last character before calling memcmp()
versions: Python 3.4
Added file: http://bugs.python.org/file29668/unicode_eq.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17628>
_______________________________________