[Python-Dev] Benchmarking Python and micro-optimizations

2016-10-20 Thread Victor Stinner
Hi,

Over the last few months, I have worked a lot on benchmarks. I ran
benchmarks, analyzed results in depth (down to the hardware and kernel
drivers!), wrote new tools, and enhanced existing ones.

* I wrote a new perf module which runs benchmarks in a reliable way
and contains a LOT of features: it collects metadata, uses a JSON file
format, and provides commands to compare results, render a histogram, etc.

* I rewrote the Python benchmark suite: the old "benchmarks" Mercurial
repository moved to a new "performance" GitHub project which uses my
perf module and contains more benchmarks.

* I also made minor enhancements to timeit in Python 3.7 -- some
developers don't want major changes, to avoid breaking backward
compatibility.

For timeit, I suggest using my perf tool, which includes a reliable
timeit command and has many more features, such as --duplicate (repeat
the statements to reduce the cost of the outer loop) and --compare-to
(compare two versions of Python), as well as all the built-in perf
features (JSON output, statistics, histograms, etc.).
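
For example, comparing a patched interpreter to a reference build with
those options can look roughly like this (the path and the benchmarked
statement are just an illustration; see the perf documentation for the
exact syntax):

    python3 -m perf timeit --duplicate=100 \
        --compare-to=../cpython-patched/python "sorted(range(1000))"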

I added benchmarks from the PyPy and Pyston benchmark suites to
performance: performance 0.3.1 contains 51 benchmark scripts which run
a total of 121 benchmarks. Examples of tested Python modules:

* SQLAlchemy
* Dulwich (full Git implementation in Python)
* Mercurial (currently only the startup time)
* html5lib
* pyaes (AES crypto cipher in pure Python)
* sympy
* Tornado (HTTP client and server)
* Django (sadly, only the template engine right now, Pyston contains
HTTP benchmarks)
* pathlib
* spambayes

More benchmarks will be added later. It would be nice to add
benchmarks for numpy, for example, since numpy is important to a large
part of our community.

All these (new or updated) tools can now be used to make smarter
decisions about optimizations. Please don't push any optimization
anymore without providing reliable benchmark results!
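
In practice, producing such results should boil down to a couple of
commands along these lines (the exact subcommands and options may
differ; see the performance README for the authoritative usage):

    python3 -m performance run --python=python3.6 -o reference.json
    python3 -m performance run --python=../cpython-patched/python -o patched.json
    python3 -m performance compare reference.json patched.json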


My first major action was to close the latest attempt to
micro-optimize int+int in Python/ceval.c,
http://bugs.python.org/issue21955: I closed the issue as rejected,
because there is no significant speedup on any benchmark other than two
(tiny) microbenchmarks. To make sure that no one wastes time trying to
micro-optimize int+int, I even added a comment to Python/ceval.c :-)

   https://hg.python.org/cpython/rev/61fcb12a9873
   "Please don't try to micro-optimize int+int"


The perf and performance projects are now well tested: Travis CI runs
tests on new commits and pull requests, and the "tox" command can be
used locally to test different Python versions, pep8, the docs, etc. in
a single command.
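
For example (the environment names below are illustrative; the real
ones are defined in each project's tox.ini):

    tox            # run the whole matrix: every Python version, pep8, docs
    tox -e py36    # run only the tests for a single interpreter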


Next steps:

* Run performance 0.3.1 on speed.python.org: the benchmark runner is
currently stopped (and still uses the old benchmarks project). The
website part may be updated to allow downloading full JSON files which
include *all* information (all timings, metadata and more).

* I plan to run performance on CPython 2.7, CPython 3.7, PyPy and PyPy
3. Maybe also CPython 3.5 and CPython 3.6 if they don't use too many
resources.

* Later, we can consider adding more implementations of Python:
Jython, IronPython, MicroPython, Pyston, Pyjion, etc. All benchmarks
should be run on the same hardware to be comparable.

* Later, we might also allow other projects to upload their own
benchmark results, but we should find a way to group benchmark results
per benchmark runner (e.g. at least by hostname; the perf JSON contains
the hostname) so that we don't compare results produced on different
hardware (see the sketch after this list).

* We should continue to add more benchmarks to the performance
benchmark suite, especially benchmarks that are more representative of
real applications (we have enough microbenchmarks!).
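
As for grouping uploaded results per benchmark runner, a rough sketch
of what that could look like, assuming each result file exposes the
hostname under a top-level "metadata" mapping (the exact JSON schema
may differ):

    import json, sys
    from collections import defaultdict

    by_host = defaultdict(list)
    for path in sys.argv[1:]:
        with open(path) as f:
            results = json.load(f)
        # assumption: a "metadata" dict with a "hostname" key
        host = results.get("metadata", {}).get("hostname", "unknown")
        by_host[host].append(path)

    for host, files in sorted(by_host.items()):
        print("%s: %s" % (host, ", ".join(files)))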


Links:

* perf: http://perf.readthedocs.io/
* performance: https://github.com/python/performance
* Python Speed mailing list: https://mail.python.org/mailman/listinfo/speed
* https://speed.python.org/ (currently outdated, and doesn't use performance yet)

See https://pypi.python.org/pypi/performance, which contains even more
links to Python benchmarks (PyPy, Pyston, Numba, Pythran, etc.).

Victor


[Python-Dev] Have I got my hg dependencies correct?

2016-10-20 Thread Skip Montanaro
I've recently run into a problem building the math and cmath modules
for 2.7. (I don't rebuild very often, so this problem might have been
around for a while.) My hg repos look like this:

* My cpython repo pulls from https://hg.python.org/cpython

* My 2.7 repo (and other non-tip repos) pulls from my cpython repo

I think this setup was recommended way back when hg was new to the
Python toolchain, to avoid unnecessary network bandwidth usage.

So, if I execute

hg pull
hg update

first in the cpython repo and then in the 2.7 repo, I should be
up-to-date, correct?
However, rebuilding in my 2.7 repo fails to build math and cmath. The
compiler complains that Modules/_math.o doesn't exist. If I manually
execute

make Modules/_math.o
make

after the failure, then the math and cmath modules build.

Looking on bugs.python.org I saw this closed issue:

http://bugs.python.org/issue24421

which seems related. Is it possible that the fix wasn't propagated to
the 2.7 branch? Or perhaps I've fouled up my hg repo relationships? My
other repos which depend on cpython (3.5, 3.4, 3.3, and 3.2) all build
the math module just fine.

I'm running on an ancient MacBook Pro with OS X 10.11.6 (El Capitan)
and Xcode 8.0 installed.

Any suggestions?

Skip


Re: [Python-Dev] Have I got my hg dependencies correct?

2016-10-20 Thread Skip Montanaro
On Thu, Oct 20, 2016 at 6:47 AM, Skip Montanaro wrote:
> Is it possible that the fix wasn't propagated to
> the 2.7 branch? Or perhaps I've fouled up my hg repo relationships?

Either way, I went ahead and opened a ticket:

http://bugs.python.org/issue28487

S


Re: [Python-Dev] Have I got my hg dependencies correct?

2016-10-20 Thread Victor Stinner
Are you on the 2.7 branch or the default branch?

You might try to clean up your checkout:

hg up -C -r 2.7
make distclean
hg purge # WARNING! it removes *all* files not tracked by Mercurial
./configure && make

You should also paste the full error message.

Victor



Re: [Python-Dev] Benchmarking Python and micro-optimizations

2016-10-20 Thread Maciej Fijalkowski
Hi Victor

Despite the fact that I was not able to find time to run your stuff
yet, thanks for all the awesome work!



Re: [Python-Dev] Benchmarking Python and micro-optimizations

2016-10-20 Thread Eric Snow
On Thu, Oct 20, 2016 at 4:56 AM, Victor Stinner wrote:
> Hi,
>
> Last months, I worked a lot on benchmarks. I ran benchmarks, analyzed
> results in depth (up to the hardware and kernel drivers!), I wrote new
> tools and enhanced existing tools.

This is a massive contribution.  Thanks!

> All these (new or updated) tools can now be used to take smarter
> decisions on optimizations. Please don't push any optimization anymore
> without providing reliable benchmark results!

+1

-eric


Re: [Python-Dev] Have I got my hg dependencies correct?

2016-10-20 Thread Skip Montanaro
On Thu, Oct 20, 2016 at 7:35 AM, Victor Stinner wrote:
>
> Are you on the 2.7 branch or the default branch?
>
> You might try to cleanup your checkout:
>
> hg up -C -r 2.7
> make distclean
> hg purge # WARNING! it removes *all* files not tracked by Mercurial
> ./configure && make
>
> You should also paste the full error message.

I am on the 2.7 branch. My tree looks like this:

~/src/hgpython/
   cpython
   3.6
   3.5
   3.4
   3.3
   3.2
   2.7

As I indicated, the cpython repo pulls from the central repo. The 3.6 and
2.7 branches pull from cpython. The other 3.x branches pull from the 3.x+1
branch.

Sorry, I no longer have the error message. I wasn't thinking in the correct
order, and executed the suggested hg purge command before retrieving the
error message from my build.out file.

In any case, there seemed to be something about the hg up command:

% hg up -C -r 2.7
6 files updated, 0 files merged, 0 files removed, 0 files unresolved

A plain hg up command didn't update any files. Looking at 2.7/
Makefile.pre.in, I see it now has the necessary target for _math.o. That
must have been one of the six updated files.

Is there some deficiency in a plain hg update command which suggests I
should always use the more complex command you suggested? It's not a huge
deal typing-wise, as I use a shell script to rebuild my world, doing the
necessary work to update and build each branch.

Skip


Re: [Python-Dev] Benchmarking Python and micro-optimizations

2016-10-20 Thread Yury Selivanov

Thank you Victor!  This is a massive amount of work.


On 2016-10-20 6:56 AM, Victor Stinner wrote:

> * I plan to run performance on CPython 2.7, CPython 3.7, PyPy and PyPy
> 3. Maybe also CPython 3.5 and CPython 3.6 if they don't take too much
> resources.


I think it's important to run 3.5 & 3.6 to see if we introduce any
performance regressions.


Yury


Re: [Python-Dev] Benchmarking Python and micro-optimizations

2016-10-20 Thread Ethan Furman

On 10/20/2016 03:56 AM, Victor Stinner wrote:


> Last months, I worked a lot on benchmarks. I ran benchmarks, analyzed
> results in depth (up to the hardware and kernel drivers!), I wrote new
> tools and enhanced existing tools.


Thank you!

--
~Ethan~


Re: [Python-Dev] Have I got my hg dependencies correct?

2016-10-20 Thread Zachary Ware
On Thu, Oct 20, 2016 at 11:23 AM, Skip Montanaro wrote:
> In any case, there seemed to be something about the hg up command:
>
> % hg up -C -r 2.7
> 6 files updated, 0 files merged, 0 files removed, 0 files unresolved
>
> A plain hg up command didn't update any files. Looking at
> 2.7/Makefile.pre.in, I see it now has the necessary target for _math.o. That
> must have been one of the six updated files.
>
> Is there some deficiency in a plain hg update command which suggests I
> should always use the more complex command you suggested? It's not a huge
> deal typing-wise, as I use a shell script to rebuild my world, doing the
> necessary work to update and build each branch.

Sounds like you had an unexpected patch to Makefile.pre.in (and 5
other files) which broke things.  Plain `hg update` will try to keep
any changes in your working directory intact, whereas `hg update -C`
blows away any changes to tracked files.
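
In practice the difference is easy to see with standard Mercurial
commands:

    hg status          # lists local modifications that a plain "hg update" tries to preserve
    hg update -C 2.7   # --clean: discard those modifications and match the 2.7 branch exactly

If "hg status" reports stray changes to tracked files such as
Makefile.pre.in, "hg update -C" (or reverting them explicitly) is the
way back to a pristine branch.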

-- 
Zach


Re: [Python-Dev] Benchmarking Python and micro-optimizations

2016-10-20 Thread Serhiy Storchaka

On 20.10.16 13:56, Victor Stinner wrote:

> Last months, I worked a lot on benchmarks. I ran benchmarks, analyzed
> results in depth (up to the hardware and kernel drivers!), I wrote new
> tools and enhanced existing tools.


Great work! Thank you Victor.




Re: [Python-Dev] Benchmarking Python and micro-optimizations

2016-10-20 Thread Paul Moore
On 20 October 2016 at 17:38, Ethan Furman wrote:
> On 10/20/2016 03:56 AM, Victor Stinner wrote:
>
>> Last months, I worked a lot on benchmarks. I ran benchmarks, analyzed
>> results in depth (up to the hardware and kernel drivers!), I wrote new
>> tools and enhanced existing tools.
>
>
> Thank you!

Indeed, this is brilliant - and often unappreciated - work, so many
thanks for all the work you've put into this.
Paul


Re: [Python-Dev] Python-Dev Digest, Vol 159, Issue 27

2016-10-20 Thread Wang, Peter Xihong
Hi Victor,

Thanks for the great contribution to the unified benchmark development!
In addition to the Outreachy program that we are currently supporting,
let us know how else we could help out in this effort.

Beyond microbenchmarks and benchmarking ideas, we'd also like to hear
suggestions from the community on workload development around
real-world use cases, especially in the enterprise world, cloud
computing, data analytics, machine learning, high-performance
computing, etc.

Thanks,

Peter
 


Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)

2016-10-20 Thread Nick Coghlan
On 19 October 2016 at 01:28, Chris Barker - NOAA Federal wrote:
>> >>> def get_builtin_methods():
>> ...     return [(name, method_name) for name, obj in get_builtin_types().items()
>> ...             for method_name, method in vars(obj).items()
>> ...             if not method_name.startswith("__")]
>> ...
>> >>> len(get_builtin_methods())
>> 230
>
> So what? No one looks in all the methods of builtins at once.

Yes, Python implementation developers do, which is why it's a useful
part of defining the overall "size" of Python and how that is growing
over time.

When we define a new standard library module (particularly a pure
Python one) rather than new methods on builtin types, we create
substantially less additional work for other implementations, and we
make it easier for educators to decide whether or not they should be
introducing their students to the new capabilities.

That latter aspect is important, as providing functionality as
separate modules means we also gain an enhanced ability to explain
"What is this *for*?", which is something we regularly struggle with
when making changes to the core language to better support relatively
advanced domain specific use cases (see
http://learning-python.com/books/python-changes-2014-plus.html for one
generalist author's perspective on the vast gulf that can arise
between "What professional programmers want" and "What's relevant to
new programmers")

> If we
> have anything like an OO System (and python builtins only sort of
> do...) then folks look for a built in that they need, and only then
> look at its methods.
>
> If you need to work with bytes, you'll look at the bytes object and
> bytarray object. Having to go find some helper function module to know
> to efficiently do something with bytes is VERY non-discoverable!

Which is more comprehensible and discoverable, dict.setdefault(), or
collections.defaultdict()?

Micro-optimisations like dict.setdefault() typically don't make sense
in isolation - they only make sense in the context of a particular
pattern of thought. Now, one approach to such patterns is to say "We
just need to do a better job of teaching people to recognise and use
the pattern!". This approach tends not to work very well - you're
often better off extracting the entire pattern out to a higher level
construct, giving that construct a name, and teaching that, and
letting people worry about how it works internally later.
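
As a minimal illustration of that contrast (counting words, say),
compare the pattern spelled by hand with the named construct:

    words = ["spam", "eggs", "spam"]

    # the micro-optimised spelling of the pattern
    counts = {}
    for word in words:
        counts.setdefault(word, 0)
        counts[word] += 1

    # the named, higher level construct
    from collections import defaultdict
    counts = defaultdict(int)
    for word in words:
        counts[word] += 1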

(For a slightly different example, consider the rationale for adding
the `secrets` module, even though it's mostly just a collection of
relatively thin wrappers around `os.urandom()`)

> bytes and bytarray are already low-level objects -- adding low-level
> functionality to them makes perfect sense.

They're not really that low level. They're *relatively* low level
(especially for Python), but they're still a long way away from the
kind of raw control over memory layout that a language like C or Rust
can give you.

> And no, this is not just for asycio at all -- it's potentially useful
> for any byte manipulation.

Yes, which is why I think the end goal should be a public `iobuffers`
module in the standard library. Doing IO buffer manipulation
efficiently is a complex topic, but it's also one where there are:

- many repeatable patterns for managing IO buffers efficiently that
aren't necessarily applicable to manipulating arbitrary binary data
(ring buffers, ropes, etc)
- many operating system level utilities available to make it even more
efficient that we currently don't use (since we only have general
purpose "bytes" and "bytearray" objects with no "iobuffer" specific
abstraction that could take advantage of those use case specific
features)

> +1 on a frombuffer() method.

Still -1 in the absence of evidence that a good IO buffer abstraction
for asyncio and the standard library can't be written without it
(where the evidence I'll accept is "We already wrote the abstraction
layer, and not having this builtin feature necessarily introduces
inefficiencies or a lack of portability beyond CPython into our
implementation").

>> Putting special purpose functionality behind an import gate helps to
>> provide a more explicit context of use
>
> This is a fine argument for putting bytearray in a separate module --
> but that ship has sailed. The method to construct a bytearray from a
> buffer belongs with the bytearray object.

The bytearray constructor already accepts arbitrary bytes-like
objects. What this proposal is about is a way to *more efficiently*
snapshot a slice of a bytearray object for use in asyncio buffer
manipulation in cases where all of the following constraints apply:

- we don't want to copy the data twice
- we don't want to let a memoryview be cleaned up lazily
- we don't want to incur the readability penalty of explicitly
managing the memoryview
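
Without a dedicated constructor, the closest you can get today is to
manage the memoryview explicitly, something like this sketch (which
avoids the double copy, but pays exactly the readability penalty from
the last point above):

    buf = bytearray(b"data received from a transport")
    with memoryview(buf) as view:    # zero-copy view of the buffer
        chunk = bytes(view[5:13])    # one copy: just the slice we need
    # the view is released deterministically here, so the bytearray can
    # be resized again without raising BufferError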

For a great many use cases, we simply don't care about those
constraints (especially the last one), so adding `bytes.frombuffer` is
just confusing: we can readily predict that after addin