Re: [Python-Dev] C99
Ned Deily writes: > But the point I was trying to make is that, by changing the > language requirement, we will likely have an effect on what > platforms (across the board) and versions we support and we should > take that into account when making this decision. It may be the > right decision, in balance, to drop support for some of these but > we shouldn't do it by accident. Sure, you were clear enough about that. My point was simply that at least for older Macs it probably is not that big a problem (but I do have a Panther still running, at least as of March it did ;-). Similarly, for platforms where we build with GCC, many of these features have been available for a long time with switches. Adding it all up, we don't want to break anybody inadvertently and we should take care to fix what breakage we can in advance, but I think it's time to allow at least some of these features, and maybe move to C99 wholesale. Steve ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Review request: issue 10910, pyport.h causes trouble for C++ extensions on BSDs
Hi python-dev! I'm a maintainer for Homebrew, a third-party package manager for OS X, where I'm the resident parseltongue. Issue 10910 is related to problems building CPython extension modules with C++ code on BSDs. As I understand it, pyport.h has some code specific to BSD and friends, including OS X, which causes conflicts with the C++ stdlib. We've been carrying the patch Ronald Oussoren wrote in 2011 [2] against Python 2.7 since olden times. We were recently prompted to add the patch to our 3.5 package as well [3] because the bug was causing build problems in the wild. [4] We strive to apply as few patches as possible in Homebrew and we (I) would love to see a fix for this deployed upstream. Can I do anything to help code get checked in? Thanks, Tim [1] https://bugs.python.org/issue10910 [2] https://bugs.python.org/issue10910#msg135414 [3] https://github.com/Homebrew/homebrew-core/pull/3396 [4] https://github.com/IntelPNI/brainiak/pull/82 -- Tim Smith Freenode: tdsmith, #machomebrew https://tds.xyz, https://github.com/tdsmith ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [python-committers] Failed to build select
On Aug 8, 2016, at 02:45, Steven D'Aprano wrote: > On Mon, Aug 08, 2016 at 12:17:21AM -0400, Ned Deily wrote: >> Also, try without setting PYTHONHOME. I'm not sure what you're trying to do >> by setting that but you shouldn't need to. > I didn't think I needed to either, but when I try without it, I get: > > Could not find platform dependent libraries > Consider setting $PYTHONHOME to [:] > Could not find platform dependent libraries > Consider setting $PYTHONHOME to [:] On Aug 8, 2016, at 03:25, Chris Jerdonek wrote: > FWIW, I would be interested in learning more about the above warning > (its significance, how it can be addressed, whether it can be ignored, > etc). I also get this message when installing 3.5.2 from source on > Ubuntu 14.04. Those messages are harmless and are generated by the Makefile steps that update Importlib's bootstrap files, Python/importlib.h and Python/importlib_external.h. See http://bugs.python.org/issue14928 for the origins of this. It should be possible to fix the Makefile to suppress those messages. I suggest you open an issue about it. -- Ned Deily [email protected] -- [] ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Review request: issue 10910, pyport.h causes trouble for C++ extensions on BSDs
On Mon, 8 Aug 2016 at 08:10 tdsmith wrote: > Hi python-dev! I'm a maintainer for Homebrew, a third-party package > manager for OS X, where I'm the resident parseltongue. > > Issue 10910 is related to problems building CPython extension modules > with C++ code on BSDs. As I understand it, pyport.h has some code > specific to BSD and friends, including OS X, which causes conflicts > with the C++ stdlib. > > We've been carrying the patch Ronald Oussoren wrote in 2011 [2] > against Python 2.7 since olden times. We were recently prompted to add > the patch to our 3.5 package as well [3] because the bug was causing > build problems in the wild. [4] > > We strive to apply as few patches as possible in Homebrew and we (I) > would love to see a fix for this deployed upstream. Can I do anything > to help code get checked in? > The trick is someone feeling up to the task of knowing enough C, C++, and what's happening on OS X/BSD to validate the patch and apply it. Usually that's Ronald or Ned and Ronald never applied his patch, so I guess that leaves Ned. :) If Ned doesn't have the time to look then just ping the issue in a week and I will apply it since both you and FreeBSD are already carrying the patch forward. ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [python-committers] Failed to build select
On Mon, Aug 8, 2016 at 8:59 AM, Ned Deily wrote: > On Aug 8, 2016, at 02:45, Steven D'Aprano wrote: >> >> Could not find platform dependent libraries >> Consider setting $PYTHONHOME to [:] >> Could not find platform dependent libraries >> Consider setting $PYTHONHOME to [:] > > On Aug 8, 2016, at 03:25, Chris Jerdonek wrote: >> FWIW, I would be interested in learning more about the above warning >> (its significance, how it can be addressed, whether it can be ignored, >> etc). I also get this message when installing 3.5.2 from source on >> Ubuntu 14.04. > > Those messages are harmless and are generated by the Makefile steps that > update Importlib's bootstrap files, Python/importlib.h and > Python/importlib_external.h. See http://bugs.python.org/issue14928 for the > origins of this. It should be possible to fix the Makefile to suppress those > messages. I suggest you open an issue about it. I created an issue for this here: http://bugs.python.org/issue27713 --Chris ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Rewrite @contextlib.contextmanager in C
import timeit
import contextlib

@contextlib.contextmanager
def ctx1():
    yield

class ctx2:
    def __enter__(self):
        pass
    def __exit__(self, *args):
        pass

t1 = timeit.timeit("with ctx1(): pass", setup="from __main__ import ctx1")
t2 = timeit.timeit("with ctx2(): pass", setup="from __main__ import ctx2")
print("%.3f secs" % t1)
print("%.3f secs" % t2)
print("slowdown: -%.2fx" % (t1 / t2))
...with Python 3.5:
1.938 secs
0.443 secs
slowdown: -4.37x
I wanted to give it a try rewriting this in C but since @contextmanager has
a lot of magic I wanted to ask first whether this 1) is technically
possible 2) is desirable.
Thoughts?
--
Giampaolo - http://grodola.blogspot.com
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Rewrite @contextlib.contextmanager in C
On 2016-08-08 3:33 PM, Giampaolo Rodola' wrote: I wanted to give it a try rewriting this in C but since @contextmanager has a lot of magic I wanted to ask first whether this 1) is technically possible 2) is desirable. It is definitely technologically possible. However, the C implementation will be quite complex, and will require a lot of time to review and later maintain. My question would be how critical is the performance of @contextmanager? I'd say that unless it's used in a tight loop it can't affect the performance too much. Yury ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Rewrite @contextlib.contextmanager in C
I think Nick would be interested in understanding why this is the case. What does the decorator do that could be so expensive? On Mon, Aug 8, 2016 at 1:07 PM, Yury Selivanov wrote: > > > On 2016-08-08 3:33 PM, Giampaolo Rodola' wrote: > >> I wanted to give it a try rewriting this in C but since @contextmanager >> has a lot of magic I wanted to ask first whether this 1) is technically >> possible 2) is desirable. >> > > It is definitely technologically possible. However, the C implementation > will be quite complex, and will require a lot of time to review and later > maintain. My question would be how critical is the performance of > @contextmanager? I'd say that unless it's used in a tight loop it can't > affect the performance too much. > > Yury > ___ > Python-Dev mailing list > [email protected] > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido% > 40python.org > -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Rewrite @contextlib.contextmanager in C
On Mon, Aug 8, 2016 at 10:07 PM, Yury Selivanov wrote: > > > On 2016-08-08 3:33 PM, Giampaolo Rodola' wrote: > >> I wanted to give it a try rewriting this in C but since @contextmanager >> has a lot of magic I wanted to ask first whether this 1) is technically >> possible 2) is desirable. >> > > It is definitely technologically possible. However, the C implementation > will be quite complex, and will require a lot of time to review and later > maintain. My question would be how critical is the performance of > @contextmanager? I'd say that unless it's used in a tight loop it can't > affect the performance too much. > Yeah, the (my) use case is exactly that (tight loops). -- Giampaolo - http://grodola.blogspot.com ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Rewrite @contextlib.contextmanager in C
On 2016-08-08 4:18 PM, Guido van Rossum wrote: I think Nick would be interested in understanding why this is the case. What does the decorator do that could be so expensive? From the looks of it, it doesn't do anything special. Although with @contextlib.contextmanager we have to instantiate a generator (the decorated one) and advance it in __enter__. So it's an extra object instantiation + extra code in __enter__ and __exit__. Anyways, Nick knows much more about that code. Giampaolo, before experimenting with a C implementation, I suggest you try compiling contextlib.py with Cython. I'll be surprised if you can make it more than 30-40% faster. And you won't get much faster than Cython when you code contextmanager in C by hand. Also, we don't have slots for __enter__ and __exit__, so there is no way to avoid the attribute lookup. Yury ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Rewrite @contextlib.contextmanager in C
On 8/8/2016 22:38, Yury Selivanov wrote:
> On 2016-08-08 4:18 PM, Guido van Rossum wrote:
>> I think Nick would be interested in understanding why this is the case.
>> What does the decorator do that could be so expensive?
> From the looks of it it doesn't do anything special. Although with
> @contextlib.contextmanager we have to instantiate a generator (the
> decorated one) and advance it in __enter__. So it's an extra object
> instantiation + extra code in __enter__ and __exit__. Anyways, Nick
> knows much more about that code.

Right, I think a fairer comparison would be to:

class ctx2:
    def __enter__(self):
        self.it = iter(self)
        return next(self.it)

    def __exit__(self, *args):
        try:
            next(self.it)
        except StopIteration:
            pass

    def __iter__(self):
        yield

With this change alone the slowdown diminishes to ~1.7x for me. The rest is probably the extra overhead for being able to pass exceptions raised inside the with block back into the generator and such.
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Rewrite @contextlib.contextmanager in C
On Tue, Aug 9, 2016 at 7:14 AM, Wolfgang Maier
wrote:
> Right, I think a fairer comparison would be to:
>
> class ctx2:
>     def __enter__(self):
>         self.it = iter(self)
>         return next(self.it)
>
>     def __exit__(self, *args):
>         try:
>             next(self.it)
>         except StopIteration:
>             pass
>
>     def __iter__(self):
>         yield
>
> With this change alone the slowdown diminishes to ~ 1.7x for me. The rest is
> probably the extra overhead for being able to pass exceptions raised inside
> the with block back into the generator and such.
I played around with a few other variants to see where the slowdown
is. They all work out pretty much the same as the above; my two
examples are both used the same way as contextlib.contextmanager is,
but are restrictive on what you can do.
import timeit
import contextlib
import functools
class ctx1:
    def __enter__(self):
        pass
    def __exit__(self, *args):
        pass

@contextlib.contextmanager
def ctx2():
    yield
class SimplerContextManager:
    """Like contextlib._GeneratorContextManager but way simpler.

    * Doesn't reinstantiate itself - just reinvokes the generator
    * Doesn't allow yielded objects (returns self)
    * Lacks a lot of error checking. USE ONLY AS DIRECTED.
    """
    def __init__(self, func):
        self.func = func
        functools.update_wrapper(self, func)

    def __call__(self, *a, **kw):
        self.gen = self.func(*a, **kw)
        return self

    def __enter__(self):
        next(self.gen)
        return self

    def __exit__(self, type, value, traceback):
        if type is None:
            try: next(self.gen)
            except StopIteration: return
            else: raise RuntimeError("generator didn't stop")
        try: self.gen.throw(type, value, traceback)
        except StopIteration: return True
        # Assume any instance of the same exception type is a proper reraise.
        # This is way simpler than contextmanager normally does, and costs us
        # the ability to detect exception handlers that coincidentally raise
        # the same type of error (eg "except ZeroDivisionError: print(1/0)").
        except type: return False
# Check that it actually behaves correctly
@SimplerContextManager
def ctxdemo():
    print("Before yield")
    try:
        yield 123
    except ZeroDivisionError:
        print("Absorbing 1/0")
        return
    finally:
        print("Finalizing")
    print("After yield (no exception)")

with ctxdemo() as val:
    print("1/0 =", 1/0)
with ctxdemo() as val:
    print("1/1 =", 1/1)
#with ctxdemo() as val:
#    print("1/q =", 1/q)

@SimplerContextManager
def ctx3():
    yield
class TooSimpleContextManager:
    """Now this time you've gone too far."""
    def __init__(self, func):
        self.func = func

    def __call__(self):
        self.gen = self.func()
        return self

    def __enter__(self):
        next(self.gen)

    def __exit__(self, type, value, traceback):
        try: next(self.gen)
        except StopIteration: pass

@TooSimpleContextManager
def ctx4():
    yield
class ctx5:
    def __enter__(self):
        self.it = iter(self)
        return next(self.it)

    def __exit__(self, *args):
        try:
            next(self.it)
        except StopIteration:
            pass

    def __iter__(self):
        yield
t1 = timeit.timeit("with ctx1(): pass", setup="from __main__ import ctx1")
print("%.3f secs" % t1)
for i in range(2, 6):
    t2 = timeit.timeit("with ctx2(): pass",
                       setup="from __main__ import ctx%d as ctx2" % i)
    print("%.3f secs" % t2)
    print("slowdown: -%.2fx" % (t2 / t1))
My numbers are:
0.320 secs
1.354 secs
slowdown: -4.23x
0.899 secs
slowdown: -2.81x
0.831 secs
slowdown: -2.60x
0.868 secs
slowdown: -2.71x
So compared to the tight custom-written context manager class, all the
"pass it a generator function" varieties look pretty much the same.
The existing contextmanager factory has several levels of indirection,
and that's probably where the rest of the performance difference comes
from, but there is some cost to the simplicity of the gen-func
approach.
My guess is that a C-implemented version could replace piles of
error-handling code with simple pointer comparisons (hence my
elimination of it), and may or may not be able to remove some of the
indirection. I'd say it'd land about in the same range as the other
examples here. Is that worth it?
ChrisA
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] New calling convention to avoid temporary tuples when calling functions
Hi,

tl;dr I found a way to make CPython 3.6 faster and I validated that there is no performance regression. I'm requesting approval of core developers to start pushing changes.

In 2014, during a lunch at PyCon, Larry Hastings told me that he would like to get rid of temporary tuples to call functions in Python. In Python, positional arguments are passed as a tuple to C functions: "PyObject *args". Larry wrote Argument Clinic, which gives more control over how C functions are called. But I guess that Larry didn't have time to finish his implementation, since he didn't publish a patch.

While trying to optimize CPython 3.6, I wrote a proof-of-concept patch and results were promising:
https://bugs.python.org/issue26814#msg264003
https://bugs.python.org/issue26814#msg266359

C functions get a C array "PyObject **args, int nargs". Getting the nth argument becomes "arg = args[n];" at the C level. This format is not new; it's already used internally in Python/ceval.c. A Python function call made from a Python function already avoids a temporary tuple in most cases: we pass the stack of the first function as the list of arguments to the second function. My patch generalizes the idea to C functions. It works in all directions (C=>Python, Python=>C, C=>C, etc.).

Many function calls become not only faster than Python 3.5 with my full patch, but even faster than Python 2.7! For multiple reasons (not interesting here), tested functions are slower in Python 3.4 than Python 2.7. Python 3.5 is better than Python 3.4, but still slower than Python 2.7 in a few cases. Using my "FASTCALL" patch, all tested function calls become faster than or as fast as Python 2.7!

But when I ran the CPython benchmark suite, I found some major performance regressions. In fact, it took me 3 months to understand that I didn't run benchmarks correctly and that most benchmarks of the CPython benchmark suite are very unstable.
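The convention Victor describes (the callee receives a C array "PyObject **args, int nargs" instead of a freshly built tuple) can be illustrated with a toy, self-contained C sketch. `Obj`, `sum_args`, and the two `call_*` helpers below are hypothetical stand-ins for illustration only, not CPython's actual types or functions:

```c
#include <stdlib.h>

/* Toy stand-in for PyObject. */
typedef struct { long value; } Obj;

/* The callee: with an array convention, fetching the nth
 * argument is just "args[i]" -- no tuple unpacking. */
static long sum_args(Obj **args, int nargs) {
    long total = 0;
    for (int i = 0; i < nargs; i++)
        total += args[i]->value;
    return total;
}

/* Tuple-style call: the caller copies its evaluation stack into a
 * temporary heap-allocated pack (standing in for the argument tuple)... */
static long call_tuple_style(Obj **stack, int n) {
    Obj **packed = malloc(n * sizeof(Obj *));
    for (int i = 0; i < n; i++)
        packed[i] = stack[i];
    long result = sum_args(packed, n);
    free(packed);  /* ...and destroys it after the call. */
    return result;
}

/* FASTCALL-style call: the caller's stack is passed directly. */
static long call_fastcall_style(Obj **stack, int n) {
    return sum_args(stack, n);  /* no allocation, no copy, no free */
}
```

The fastcall-style caller skips the allocate/copy/free cycle entirely; in real CPython the work saved per call is the creation and destruction of an actual tuple object.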
I wrote articles explaining how benchmarks should be run (to be stable) and I patched all benchmarks to use my new perf module, which runs benchmarks in multiple processes and computes the average (to make benchmarks more stable). In the end, my minimum FASTCALL patch (issue #27128) doesn't show any major performance regression if you run benchmarks "correctly" :-)
https://bugs.python.org/issue27128#msg272197

Most benchmarks are not significant, 14 are faster, and only 4 are slower. According to benchmarks on the "full" FASTCALL patch, the slowdowns are temporary and should quickly turn into speedups (with further changes).

My question is now: can I push fastcall-2.patch of issue #27128? This patch only adds the infrastructure to start working on more useful optimizations; more patches will come, and I expect more exciting benchmark results.

Overview of the initial FASTCALL patch; see my first message on the issue:
https://bugs.python.org/issue27128#msg266422

--

Note: My full FASTCALL patch changes the C API; this is out of the scope of my first simple FASTCALL patch. I will open a new discussion to decide if it's worth it and, if yes, how it should be done.

Victor
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] New calling convention to avoid temporary tuples when calling functions
On Mon, Aug 8, 2016 at 3:25 PM, Victor Stinner wrote: > tl;dr I found a way to make CPython 3.6 faster and I validated that > there is no performance regression. But is there a performance improvement? > I'm requesting approval of core > developers to start pushing changes. > > In 2014 during a lunch at Pycon, Larry Hasting told me that he would > like to get rid of temporary tuples to call functions in Python. In > Python, positional arguments are passed as a tuple to C functions: > "PyObject *args". Larry wrote Argument Clinic which gives more control > on how C functions are called. But I guess that Larry didn't have time > to finish his implementation, since he didn't publish a patch. > Hm, I agree that those tuples are probably expensive. I recall that IronPython boasted faster Python calls by doing something closer to the platform (in their case I'm guessing C# or the CLR :-). Is this perhaps something that could wait until the Core devs sprint in a few weeks? (I presume you're coming?!) -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] New calling convention to avoid temporary tuples when calling functions
2016-08-09 0:40 GMT+02:00 Guido van Rossum :
>> tl;dr I found a way to make CPython 3.6 faster and I validated that
>> there is no performance regression.
>
> But is there a performance improvement?

Sure. On micro-benchmarks, you can see nice improvements:

* getattr(1, "real") becomes 44% faster
* list(filter(lambda x: x, list(range(1000)))) becomes 31% faster
* namedtuple.attr becomes 23% faster
* etc.

See https://bugs.python.org/issue26814#msg263999 for default => patch, or https://bugs.python.org/issue26814#msg264003 for a comparison of Python 2.7 / 3.4 / 3.5 / 3.6 / 3.6 patched.

On the CPython benchmark suite, I also saw many faster benchmarks. Faster (25):

- pickle_list: 1.29x faster
- etree_generate: 1.22x faster
- pickle_dict: 1.19x faster
- etree_process: 1.16x faster
- mako_v2: 1.13x faster
- telco: 1.09x faster
- raytrace: 1.08x faster
- etree_iterparse: 1.08x faster
(...)

See https://bugs.python.org/issue26814#msg266359

Victor
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] New calling convention to avoid temporary tuples when calling functions
2016-08-09 0:40 GMT+02:00 Guido van Rossum :
> Hm, I agree that those tuples are probably expensive. I recall that
> IronPython boasted faster Python calls by doing something closer to the
> platform (in their case I'm guessing C# or the CLR :-).

To be honest, I didn't expect *any* speedup just by avoiding the temporary tuples. The C structure of tuples is simple and the allocation of tuples is already optimized by a free list. I still don't understand how the cost of tuple creation/destruction can have such a "large" impact on performance.

The discussion with Larry was not really my first motivation to work on FASTCALL. I worked on this topic because CPython already uses some "hidden" tuples to avoid the cost of tuple creation/destruction in various places, but using really ugly code, and this ugly code caused crashes and surprising behaviours... https://bugs.python.org/issue26811 is a recent crash related to the property_descr_get() optimization, whereas the optimization was already "fixed" once: https://hg.python.org/cpython/rev/5dbf3d932a59/

The fix is just another hack on top of the existing hack. Issue #26811 rewrote the optimization to avoid the crash using _PyObject_GC_UNTRACK(): https://hg.python.org/cpython/rev/a98ef122d73d

I tried to make this "optimization" the standard way to call functions, rather than a corner case, and to avoid hacks like PyTuple_SET_ITEM(args, 0, NULL) or _PyObject_GC_UNTRACK().

Victor
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] New calling convention to avoid temporary tuples when calling functions
I just wanted to say I'm excited about this and I'm glad someone is taking advantage of what Argument Clinic allows for and what I know Larry had initially hoped AC would make happen! I should also point out that Serhiy has a patch for faster keyword argument parsing thanks to AC: http://bugs.python.org/issue27574 . Not sure how your two patches would intertwine (if at all). On Mon, 8 Aug 2016 at 15:26 Victor Stinner wrote: > Hi, > > tl;dr I found a way to make CPython 3.6 faster and I validated that > there is no performance regression. I'm requesting approval of core > developers to start pushing changes. > > In 2014 during a lunch at Pycon, Larry Hasting told me that he would > like to get rid of temporary tuples to call functions in Python. In > Python, positional arguments are passed as a tuple to C functions: > "PyObject *args". Larry wrote Argument Clinic which gives more control > on how C functions are called. But I guess that Larry didn't have time > to finish his implementation, since he didn't publish a patch. > > While trying to optimize CPython 3.6, I wrote a proof-of-concept patch > and results were promising: > https://bugs.python.org/issue26814#msg264003 > https://bugs.python.org/issue26814#msg266359 > > C functions get a C array "PyObject **args, int nargs". Getting the > nth argument become "arg = args[n];" at the C level. This format is > not new, it's already used internally in Python/ceval.c. A Python > function call made from a Python function already avoids a temporary > tuple in most cases: we pass the stack of the first function as the > list of arguments to the second function. My patch generalizes the > idea to C functions. It works in all directions (C=>Python, Python=>C, > C=>C, etc.). > > Many function calls become faster than Python 3.5 with my full patch, > but even faster than Python 2.7! For multiple reasons (not interesting > here), tested functions are slower in Python 3.4 than Python 2.7. 
> Python 3.5 is better than Python 3.4, but still slower than Python 2.7 > in a few cases. Using my "FASTCALL" patch, all tested function calls > become faster or as fast as Python 2.7! > > But when I ran the CPython benchmark suite, I found some major > performance regressions. In fact, it took me 3 months to understand > that I didn't run benchmarks correctly and that most benchmarks of the > CPython benchmark suite are very unstable. I wrote articles explaining > how benchmarks should be run (to be stable) and I patched all > benchmarks to use my new perf module which runs benchmarks in multiple > processes and computes the average (to make benchmarks more stable). > > At the end, my minimum FASTCALL patch (issue #27128) doesn't show any > major performance regression if you run "correctly" benchmarks :-) > https://bugs.python.org/issue27128#msg272197 > > Most benchmarks are not significant, 14 are faster, and only 4 are slower. > > According to benchmarks on the "full" FASTCALL patch, the slowdown are > temporary and should become quickly speedup (with further changes). > > My question is now: can I push fastcall-2.patch of the issue #27128? > This patch only adds the infrastructure to start working on more > useful optimizations, more patches will come, I expect more exciting > benchmark results. > > Overview of the initial FASTCALL patch, see my first message on the issue: > https://bugs.python.org/issue27128#msg266422 > > -- > > Note: My full FASTCALL patch changes the C API: this is out of the > scope of my first simple FASTCALL patch. I will open a new discussion > to decide if it's worth it and if yes, how it should be done. 
> > Victor > ___ > Python-Dev mailing list > [email protected] > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] New calling convention to avoid temporary tuples when calling functions
2016-08-09 1:36 GMT+02:00 Brett Cannon :
> I just wanted to say I'm excited about this and I'm glad someone is taking
> advantage of what Argument Clinic allows for and what I know Larry had
> initially hoped AC would make happen!

To make "Python" faster, not only a few specific functions, "all" C code should be updated to use the new "FASTCALL" calling convention. But it's a pain to have to rewrite the code parsing arguments, and we all hate having to put #ifdef in the code (for backward compatibility). This is where the magic happens: if your code is written using Argument Clinic, you get the optimization (FASTCALL) for free: just run Argument Clinic again to regenerate the updated calling convention. It can be a very good motivation to rewrite your code using Argument Clinic: get better inline documentation (docstring, help(func) in the REPL) *and* performance ;-)

> I should also point out that Serhiy has a patch for faster keyword argument
> parsing thanks to AC: http://bugs.python.org/issue27574 . Not sure how your
> two patches would intertwine (if at all).

In a first implementation, I packed *all* arguments in the same C array: positional and keyword arguments. The problem is that all functions expect a dict to parse keyword arguments. A dict has an important property: O(1) lookup. Lookup becomes O(n) if you pass keyword arguments as a list of (key, value) tuples in a C array. So I chose not to touch keyword arguments at all: they continue to be passed as a dict. By the way, it's very rare to call a function using keyword arguments from C.

--

About http://bugs.python.org/issue27574 : it's really nice to see work done on this part! I recall a discussion of the performance of an operator versus a function call. In some cases, the overhead of "parsing" arguments is higher than the cost of the same feature implemented as an operator!
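On the keyword-argument point made earlier in this message: a dict keeps lookup O(1), while a flat C array of (key, value) pairs forces a linear scan. A toy sketch of such a flat layout (the names and layout here are hypothetical, not CPython code):

```c
#include <string.h>

/* Hypothetical flat keyword-argument layout: parallel arrays of
 * names and values. Every lookup is O(n): up to one strcmp per
 * stored keyword, instead of a dict's O(1) hashed lookup. */
static long kw_lookup_flat(const char **names, const long *values,
                           int n, const char *key)
{
    for (int i = 0; i < n; i++)
        if (strcmp(names[i], key) == 0)
            return values[i];
    return -1;  /* "not found" sentinel for this toy */
}
```

Each lookup costs up to n string comparisons, which is why the patch keeps passing keyword arguments as a dict.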
Hum, it was probably this issue: https://bugs.python.org/issue17170 Extract of the issue: """ Some crude C benchmarking on this computer: - calling PyUnicode_Replace is 35 ns (per call) - calling "hundred".replace is 125 ns - calling PyArg_ParseTuple with the same signature as "hundred".replace is 80 ns """ Victor ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] New calling convention to avoid temporary tuples when calling functions
On 2016-08-08 6:53 PM, Victor Stinner wrote: 2016-08-09 0:40 GMT+02:00 Guido van Rossum : tl;dr I found a way to make CPython 3.6 faster and I validated that there is no performance regression. But is there a performance improvement? Sure. On micro-benchmarks, you can see nice improvements: * getattr(1, "real") becomes 44% faster * list(filter(lambda x: x, list(range(1000)))) becomes 31% faster * namedtuple.attr becomes 23% faster * etc. See https://bugs.python.org/issue26814#msg263999 for default => patch, or https://bugs.python.org/issue26814#msg264003 for comparison python 2.7 / 3.4 / 3.5 / 3.6 / 3.6 patched. On the CPython benchmark suite, I also saw many faster benchmarks: Faster (25): - pickle_list: 1.29x faster - etree_generate: 1.22x faster - pickle_dict: 1.19x faster - etree_process: 1.16x faster - mako_v2: 1.13x faster - telco: 1.09x faster - raytrace: 1.08x faster - etree_iterparse: 1.08x faster (...) Exceptional results, congrats Victor. Will be happy to help with code review. Yury ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
