Re: [Python-Dev] tp_finalize vs tp_del semantics

2015-08-24 Thread Valentine Sinitsyn

Hi Armin,

Thanks for replying.

On 23.08.2015 17:14, Armin Rigo wrote:

Hi Valentine,

On 19 August 2015 at 09:53, Valentine Sinitsyn wrote:

why wasn't it possible to
implement the proposed GC disposal scheme on top of tp_del?


I'm replying here as best as I understand the situation, which might
be incomplete or wrong.

From the point of view of someone writing a C extension module, both
tp_del and tp_finalize are called with the same guarantee that the
object is still valid at that point.  The difference is only that the
presence of tp_del prevents the object from being collected at all if
it is part of a cycle.  Maybe the same could have been done without
duplicating the function pointer (tp_del + tp_finalize) with a
Py_TPFLAGS_DEL_EVEN_IN_A_CYCLE.
So you mean that this was to keep things backwards compatible for 
third-party extensions? I haven't thought about it this way, but this 
makes sense. However, the behavior of Python code using objects with 
__del__ has changed nevertheless: they are collectible now, and __del__ 
is always called exactly once, if I understand everything correctly.
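The changed semantics are easy to observe from pure Python (a minimal sketch; the `Node` class here is illustrative, not from the thread):

```python
import gc


class Node:
    """Toy object with a finalizer that participates in a reference cycle."""
    finalized = 0

    def __del__(self):
        # Since PEP 442 (Python 3.4+), tp_finalize guarantees __del__ runs
        # exactly once, even for objects trapped in reference cycles.
        Node.finalized += 1


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a  # unreachable cycle once the function returns


make_cycle()
gc.collect()  # the cycle collector finalizes both objects before freeing them
print(Node.finalized)  # → 2
```

On Python 2.x, by contrast, such a cycle would have ended up uncollected in gc.garbage because of the __del__ methods.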


Thanks,
Valentine

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Profile Guided Optimization active by-default

2015-08-24 Thread Gregory P. Smith
On Sat, Aug 22, 2015 at 9:27 AM Brett Cannon  wrote:

> On Sat, Aug 22, 2015, 09:17 Guido van Rossum  wrote:
>
> How about we first add a new Makefile target that enables PGO, without
> turning it on by default? Then later we can enable it by default.
>
>
There already is one, and it has been there for many years: make profile-opt.

I even setup a buildbot for it last year.

The problem with the existing profile-opt build in our default Makefile.in
is that it uses a horrible profiling workload (pybench, ugh), so it leaves a
lot of improvements behind.

What all Linux distros (Debian/Ubuntu and Redhat at least; nothing else
matters) do for their Python builds is to use profile-opt but they replace
the profiling workload with a stable set of the Python unittest suite
itself. Results are much better all around.  Generally a 20% speedup.

Anyone deploying Python who is *not* using a profile-opt build is wasting
CPU resources.

Whether it should be *the default* or not *is a different question*.  The
Makefile is optimized for CPython developers who certainly do not want to
run two separate builds and a profile-opt workload every time they type
make to test out their changes.

But all binary release builds should use it.

> I agree. Updating the Makefile so it's easier to use PGO is great, but we
> should do a release with it as opt-in and go from there.
>
> Also, I have my doubts about regrtest. How sure are we that it represents
> a typical Python load? Tests are often using a different mix of operations
> than production code.
>
> That was also my question. You said that "it provides the best performance
> improvement", but compared to what; what else was tried? And what
> difference does it make to e.g. a Django app that is trained on their own
> simulated workload compared to using regrtest? IOW is regrtest displaying
> the best across-the-board performance because it stresses the largest swath
> of Python and thus catches generic patterns in the code but individuals
> could get better performance with a simulated workload?
>

This isn't something to argue about.  Just use regrtest and compare the
before and after with the benchmark suite.  It really does exercise things
well.  People like to fear that it'll produce code optimized for the test
suite itself or something.  No.  Python as an interpreter is very
realistically exercised by the test suite, since it simply runs a lot of
code, and a good variety of code, including the extension modules that
benefit most, such as regexes, pickle, json, xml, etc.
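The point about regrtest driving the hot extension modules can be illustrated with a toy workload touching the same C-accelerated modules (a sketch only, not the actual training task):

```python
import json
import pickle
import re

# Miniature stand-in for the kind of work regrtest pushes through the
# C extension modules that benefit most from PGO: json, pickle, re.
data = [{"id": i, "name": "item-%d" % i} for i in range(100)]

blob = json.dumps(data)            # exercises the _json accelerator
assert json.loads(blob) == data

assert pickle.loads(pickle.dumps(data)) == data  # exercises _pickle

matches = re.findall(r"item-(\d+)", blob)        # exercises the regex engine
print(len(matches))  # → 100
```

A real training run does this at vastly larger scale and variety, which is why the profile it produces generalizes well.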

Thomas tried the test suite and a variety of other workloads when looking
at what to use at work.  The test suite generally works out best.  Going
beyond that seems to be a wash.

What we tested and decided to use on our own builds after benchmarking at
work was to build with:

make profile-opt PROFILE_TASK="-m test.regrtest -w -uall,-audio -x test_gdb
test_multiprocessing"

In general if a test is unreliable or takes an extremely long time, exclude
it for your sanity.  (I'd also kick out test_subprocess on 2.7; we replaced
subprocess with subprocess32 in our build so that wasn't an issue)

-gps


Re: [Python-Dev] How are we merging forward from the Bitbucket 3.5 repo?

2015-08-24 Thread Larry Hastings

On 08/16/2015 08:24 AM, R. David Murray wrote:

On Sun, 16 Aug 2015 00:13:10 -0700, Larry Hastings  wrote:

Can we pick one approach and stick with it?  Pretty-please?

Pick one Larry, you are the RM :)


Okay.  Unsurprisingly, I pick what I called option 3 before.  It's 
basically what we do now when checking in work to 
earlier-version-branches, with the added complexity of the Bitbucket 
repo.  I just tried it and it seems fine.



Can you give us a step by
step like you did for creating the pull request?  Including how it
relates to the workflow for the other branches?

Also, on 08/17/2015 08:03 AM, Barry Warsaw wrote:

I agree with the "You're the RM, pick one" sentiment, but just want to add a
plea for *documenting* whatever you choose, preferably under a big red blinky
banner in the devguide. ;)   I can be a good monkey and follow directions, but
I just don't want to have to dig through long threads on semi-public mailing
lists to figure out which buttons to push.


I'll post a message describing the workflow to these two newsgroups 
(hopefully by today) and update the devguide (hopefully by tomorrow).  
There's no rush as I haven't accepted any pull requests recently, though 
I have a couple I should attend to.


(For those waiting on a reply on pull requests, sit tight, I want to get 
these workflow docs done first, that way you'll know what to do if/when 
your pull request is accepted.)


Thanks, everybody,


//arry//

p.s. In case you're wondering, this RC period is way, way less stressful
than 3.4 was.  Part of that is the workflow change, and part of it is 
that there just isn't that much people are trying to get in this time.  
In 3.4 I think I had 70 merge requests just from Victor for asyncio...!


Re: [Python-Dev] Profile Guided Optimization active by-default

2015-08-24 Thread Matthias Klose
The current pgo target just uses a very specific task to train for the feedback.
For my Debian/Ubuntu builds I'm using the test suite minus some problematic tests
to train. OTOH I don't know if this is the best way to do it; however, it gave
better results at some point in the past.  What I would like is a benchmark / a
mixture of benchmarks on which to enable pgo/pdo. Based on that you could enable
pgo based on some static decisions based on autofdo. For that you don't need any
profile runs during your build; it just needs shipping the autofdo outcome
together with a Python release. This doesn't give you the same performance as
for a GCC pgo build, but it would be a first step. And defining the probe
for any pgo build would be welcome too.

  Matthias


On 08/22/2015 06:25 PM, Brett Cannon wrote:
> On Sat, Aug 22, 2015, 09:17 Guido van Rossum  wrote:
> 
> How about we first add a new Makefile target that enables PGO, without
> turning it on by default? Then later we can enable it by default.
> 
> 
> I agree. Updating the Makefile so it's easier to use PGO is great, but we
> should do a release with it as opt-in and go from there.
> 
> Also, I have my doubts about regrtest. How sure are we that it represents a
> typical Python load? Tests are often using a different mix of operations
> than production code.
> 
> That was also my question. You said that "it provides the best performance
> improvement", but compared to what; what else was tried? And what
> difference does it make to e.g. a Django app that is trained on their own
> simulated workload compared to using regrtest? IOW is regrtest displaying
> the best across-the-board performance because it stresses the largest swath
> of Python and thus catches generic patterns in the code but individuals
> could get better performance with a simulated workload?
> 
> -Brett
> 
> 
> On Sat, Aug 22, 2015 at 7:46 AM, Patrascu, Alecsandru <
> [email protected]> wrote:
> 
> Hi All,
> 
> This is Alecsandru from Server Scripting Languages Optimization team at
> Intel Corporation.
> 
> I would like to submit a request to turn on Profile Guided Optimization, or
> PGO as the default build option for Python (both 2.7 and 3.6), given its
> performance benefits on a wide variety of workloads and hardware.  For
> instance, as shown from attached sample performance results from the Grand
> Unified Python Benchmark, >20% speed up was observed.  In addition, we are
> seeing 2-9% performance boost from OpenStack/Swift, where more than 60% of
> the code is in Python 2.7. Our analysis indicates the performance gain
> was mainly due to reduction of icache misses and CPU front-end stalls.
> 
> Attached are the Makefile patches that modify the "all" build target and add
> a new one called "disable-profile-opt". We built and tested this patch for
> Python 2.7 and 3.6 on our Linux machines (CentOS 7/Ubuntu Server 14.04,
> Intel Xeon Haswell/Broadwell with 18/8 cores).  We use "regrtest" suite for
> training as it provides the best performance improvement.  Some of the test
> programs in the suite may fail, which leads to a build failure.  One solution
> is to disable the specific failing tests using the "-x" flag (as shown in the
> patch).
> 
> Steps to apply the patch:
> 1.  hg clone https://hg.python.org/cpython cpython
> 2.  cd cpython
> 3.  hg update 2.7 (needed for 2.7 only)
> 4.  Copy *.patch to the current directory
> 5.  patch < python2.7-pgo.patch (or patch < python3.6-pgo.patch)
> 6.  ./configure
> 7.  make
> 
> To disable PGO
> 7b. make disable-profile-opt
> 
> In the following, please find our sample performance results from latest
> XEON machine, XEON Broadwell EP.
> Hardware (HW):  Intel XEON (Broadwell) 8 Cores
> 
> BIOS settings:  Intel Turbo Boost Technology: false
> Hyper-Threading: false
> 
> Operating System:   Ubuntu 14.04.3 LTS trusty
> 
> OS configuration:   CPU freq set fixed at 2.6GHz by
> echo 2600000 > /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq
> echo 2600000 > /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
> Address Space Layout Randomization (ASLR) disabled (to
> reduce run to run variation) by
> echo 0 > /proc/sys/kernel/randomize_va_space
> 
> GCC version:gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04)
> 
> Benchmark:  Grand Unified Python Benchmark (GUPB)
> GUPB Source: https://hg.python.org/benchmarks/
> 
> Python2.7 results:
> Python source: hg clone https://hg.python.org/cpython cpython
> Python Source: hg update 2.7
> hg id: 0511b1165bb6 (2.7)
> hg id -r 'ancestors(.) and tag()': 15c95b7d81dc (2.7) v2.7.10
> hg --debug id -i: 0511b1165bb6cf40ada0768a7efc7ba89316f6a5
> 
> Benchmarks  Speedup(%)
> simple_logging  20
> raytrace20
> silent_logging  19
> richards19
> cha

Re: [Python-Dev] Profile Guided Optimization active by-default

2015-08-24 Thread Stewart, David C
(Sorry about the format here - I honestly just subscribed to Python-dev so
be gentle ...)

> Date: Sat, 22 Aug 2015 11:25:59 -0600
> From: Eric Snow 

> On Aug 22, 2015 9:02 AM, "Patrascu, Alecsandru" wrote:
> [snip]
> For instance, as shown from attached sample performance results from the
> Grand Unified Python Benchmark, >20% speed up was observed.

Eric - I'm the manager of Intel's server scripting language optimization
team, so I'll answer from that perspective.

> Are you referring to the tests in the benchmarks repo? [1]  How does the
> real-world performance improvement compare with other languages you are
> targeting for optimization?

Yes, we're using [1].

We're seeing up to 10% improvement on Swift (a project in OpenStack) on
some architectures using the ssbench workload, which is as close to
real-world as we can get. Relative to other languages we target, this is
quite good actually. For example, Java's Hotspot JIT is driven by
profiling at its core so it's hard to distinguish the value profiling
alone brings. We have seen a nice boost on PHP running Wordpress using
PGO, but not as impressive as Python and Swift.

By the way, I think letting the compiler optimize the code is a good
strategy. Not the only strategy we want to use, but it seems like one we
could do more of.

> And thanks for working on this!  I have several more questions: What
> sorts of future changes in CPython's code might interfere with
> your optimizations?

We're also looking at other source-level optimizations, like the CGOTO
patch Vamsi submitted in June. Some of these may reduce the value of PGO,
but in general it's nice to let the compiler do some optimization for you.

> What future additions might stand to benefit?
>

It's a good question. Our intent is to continue to evaluate and measure
different training workloads for improvement. In other words, as with any
good open source project, this patch should improve things a lot and
should be accepted upstream, but we will continue to make it better.

> What changes in existing code might improve optimization opportunities?
>
>

We intend to continue to work on source-level optimizations and measuring
them against GUPB and Swift.

> What is the added maintenance burden of the optimizations on CPython,
> if any?

I think the answer is none. Our goal was to introduce performance
improvements without adding to maintenance effort.

> What is the performance impact on non-Intel architectures?  What
> about older Intel architectures?  ...and future ones?

We should modify the patch to make it for Intel only, since we're not
evaluating non-Intel architectures. Unfortunately for us, I suspect that
older Intel CPUs might benefit more than current and future ones. Future
architectures will benefit from other enabling work we're planning.

> What is Intel's commitment to supporting these (or other) optimizations
> in the future?  How is the practical EOL of the optimizations managed?

As with any corporation's budgeting process, it's hard to know exactly
what my managers will let me spend money on. :-) But we're definitely
convinced of the value of dynamic languages for servers and the need to
work on optimization. As far as I have visibility, it appears to be
holding true.

> Finally, +1 on adding an opt-in Makefile target rather than enabling
> the optimizations by default.

Frankly, since Ubuntu has been running this way for the past two years, I think
it's fine to make it opt-in, but eventually I hope it can be the default
once we're happy with it.

> Thanks again! -eric



Re: [Python-Dev] Profile Guided Optimization active by-default

2015-08-24 Thread Nick Coghlan
On 25 August 2015 at 05:52, Gregory P. Smith  wrote:
> What we tested and decided to use on our own builds after benchmarking at
> work was to build with:
>
> make profile-opt PROFILE_TASK="-m test.regrtest -w -uall,-audio -x test_gdb
> test_multiprocessing"
>
> In general if a test is unreliable or takes an extremely long time, exclude
> it for your sanity.  (I'd also kick out test_subprocess on 2.7; we replaced
> subprocess with subprocess32 in our build so that wasn't an issue)

Having the "production ready" make target be "make profile-opt"
doesn't strike me as the most intuitive thing in the world.

I agree we want the "./configure && make" sequence to be oriented
towards local development builds rather than highly optimised
production ones, so perhaps we could provide a "make production"
target that enables PGO with an appropriate training set from
regrtest, and also complains if "--with-pydebug" is configured?

Regards,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia