Serhiy Storchaka <storch...@gmail.com> added the comment:
> Here is a new patch using _PyUnicodeWriter directly in longobject.c.
Might it be worth doing that in a separate issue?
> decimal digits) is 17% faster with my patch version 2 compared to tip,
> and 38% faster compared to Python 3.3 before my optimizations on str%
> tuples or str.format(). Creating a temporary PyUnicode is not cheap, at
> least for short strings.
Here is my benchmark script (attached) and the results, in microseconds
per call (lower is better):
Python 3.3 (vanilla):
1.65 str(12345)
2.6 '{}'.format(12345)
2.69 'A{}'.format(12345)
3.16 '\x80{}'.format(12345)
3.23 '\u0100{}'.format(12345)
3.32 '\U00010000{}'.format(12345)
4.6 '{:-10}'.format(12345)
4.89 'A{:-10}'.format(12345)
5.53 '\x80{:-10}'.format(12345)
5.71 '\u0100{:-10}'.format(12345)
5.63 '\U00010000{:-10}'.format(12345)
4.6 '{:,}'.format(12345)
4.71 'A{:,}'.format(12345)
5.28 '\x80{:,}'.format(12345)
5.65 '\u0100{:,}'.format(12345)
5.59 '\U00010000{:,}'.format(12345)
Python 3.3 + format_writer.patch:
1.72 str(12345)
2.74 '{}'.format(12345)
2.99 'A{}'.format(12345)
3.4 '\x80{}'.format(12345)
3.52 '\u0100{}'.format(12345)
3.51 '\U00010000{}'.format(12345)
4.24 '{:-10}'.format(12345)
4.6 'A{:-10}'.format(12345)
5.16 '\x80{:-10}'.format(12345)
6.87 '\u0100{:-10}'.format(12345)
6.83 '\U00010000{:-10}'.format(12345)
4.12 '{:,}'.format(12345)
4.6 'A{:,}'.format(12345)
5.09 '\x80{:,}'.format(12345)
6.63 '\u0100{:,}'.format(12345)
6.42 '\U00010000{:,}'.format(12345)
Python 3.3 + format_writer-2.patch:
1.91 str(12345)
2.44 '{}'.format(12345)
2.61 'A{}'.format(12345)
3.08 '\x80{}'.format(12345)
3.31 '\u0100{}'.format(12345)
3.13 '\U00010000{}'.format(12345)
4.57 '{:-10}'.format(12345)
4.96 'A{:-10}'.format(12345)
5.52 '\x80{:-10}'.format(12345)
7.01 '\u0100{:-10}'.format(12345)
7.34 '\U00010000{:-10}'.format(12345)
4.42 '{:,}'.format(12345)
4.76 'A{:,}'.format(12345)
5.16 '\x80{:,}'.format(12345)
7.2 '\u0100{:,}'.format(12345)
6.74 '\U00010000{:,}'.format(12345)
As you can see, there is a regression, and in some cases it is no smaller
than the improvement.
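To put numbers on that, here is a small sketch that computes the relative change for a few rows, vanilla Python 3.3 against format_writer-2.patch. The timings are copied from the tables above; the selection of rows and the helper name percent_change are mine and are not part of the attached script.

```python
# Timings in microseconds per call, copied from the tables above:
# (vanilla Python 3.3, Python 3.3 + format_writer-2.patch).
# Only a subset of rows is included, chosen for illustration.
timings = {
    "str(12345)": (1.65, 1.91),
    "'{}'.format(12345)": (2.6, 2.44),
    "'{:,}'.format(12345)": (4.6, 4.42),
    "'\\u0100{:-10}'.format(12345)": (5.71, 7.01),
    "'\\u0100{:,}'.format(12345)": (5.65, 7.2),
}


def percent_change(before, after):
    """Return the relative change in percent; positive means slower."""
    return (after - before) / before * 100.0


changes = {stmt: percent_change(*pair) for stmt, pair in timings.items()}
for stmt, change in changes.items():
    print("%+6.1f%%\t%s" % (change, stmt))
```

Negative values are speedups and positive values are slowdowns. With these rows, the gains on the plain '{}' and '{:,}' cases are a few percent, while the '\u0100' cases with a format spec get slower by a larger margin.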
----------
Added file: http://bugs.python.org/file25506/issue14744-bench-1.py
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14744>
_______________________________________
import timeit

def bench(stmt, msg=None, number=100000):
    if msg is None: msg = stmt
    best = min(timeit.repeat(stmt, number=number))
    print("%.3g\t%s" % (best * 1e6 / number, msg))

bench('str(12345)')
for s in ('', 'A', '\u0080', '\u0100', '\U00010000'):
    bench('%a.format(12345)' % (s + '{}',))
for s in ('', 'A', '\u0080', '\u0100', '\U00010000'):
    bench('%a.format(12345)' % (s + '{:-10}',))
for s in ('', 'A', '\u0080', '\u0100', '\U00010000'):
    bench('%a.format(12345)' % (s + '{:,}',))