Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-03 Thread Antoine Pitrou
On Fri, 2 Jun 2017 12:31:19 -0700
Larry Hastings  wrote:
> 
> Anyway, I'm not super excited by the prospect of using obmalloc for 
> larger objects.  There's an inverse relation between the size of 
> allocation and the frequency of allocation.  In Python there are lots of 
> tiny allocations, but fewer and fewer as the size increases.  (A 
> similarly-shaped graph to what retailers call the "long tail".)  By no 
> small coincidence, obmalloc is great at small objects, which is where we 
> needed the help most.  Let's leave it at that.

+1 to that and nice explanation.

> A more fruitful endeavor might be to try one of these fancy new 
> third-party allocators in CPython, e.g. tcmalloc, jemalloc.  Try each 
> with both obmalloc turned on and turned off, and see what happens to 
> performance and memory usage.  (I'd try it myself, but I'm already so 
> far behind on watching funny cat videos.)

We should lobby for a ban on funny cat videos so that you spend more
time on CPython.

Regards

Antoine.


___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Extremely slow test modules

2017-06-03 Thread Antoine Pitrou

Hi,

Is there a reason some of our tests are excruciatingly slow in `-uall`
mode?  `test_multiprocessing_spawn` is understandable (after all, it
will spawn a new executable for each subprocess), but other tests leave
me baffled:

- test_tools: 7 min 41 sec
- test_tokenize: 6 min 23 sec
- test_datetime: 6 min 3 sec
- test_lib2to3: 5 min 25 sec
[excerpt from recent Travis CI logs]

Why does datetime, 2to3 or tokenize testing take so long?  And do we
have so many tools that it should take 7 minutes to run all of them?
I must admit, I don't understand how we got to such a point.

Regards

Antoine.




Re: [Python-Dev] Extremely slow test modules

2017-06-03 Thread Serhiy Storchaka

03.06.17 13:31, Antoine Pitrou wrote:

> Is there a reason some of our tests are excruciatingly slow in `-uall`
> mode?  `test_multiprocessing_spawn` is understandable (after all, it
> will spawn a new executable for each subprocess), but other tests leave
> me baffled:
>
> - test_tools: 7 min 41 sec
> - test_tokenize: 6 min 23 sec
> - test_datetime: 6 min 3 sec
> - test_lib2to3: 5 min 25 sec
> [excerpt from recent Travis CI logs]
>
> Why does datetime, 2to3 or tokenize testing take so long?  And do we
> have so many tools that it should take 7 minutes to run all of them?
> I must admit, I don't understand how we got to such a point.


test_tools (in particular the test for the unparse.py script), 
test_tokenize, and test_lib2to3 read and process every Python file in 
the stdlib. This is necessary in a full test run because some syntax 
constructs are very rarely used. This is controlled by the cpu resource. 
I suggested disabling it on the slowest buildbots (-uall,-cpu). In that 
case the tests are run on only a few random files.


test_datetime generates tests for all possible timezones. This is 
controlled by the tzdata resource and also can be disabled on the 
slowest buildbots.
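
The sampling behaviour described above can be sketched roughly as follows. This is an illustration only, not the actual test code: the function name `select_files`, its parameters, and the sample size of 10 are assumptions made for the example.

```python
import random


def select_files(all_files, cpu_enabled, sample_size=10, rng=random):
    """Pick the files a stdlib-wide test would process.

    Hypothetical helper: with the "cpu" resource enabled every file is
    used; otherwise only a small random sample is checked.
    """
    files = sorted(all_files)
    if cpu_enabled or len(files) <= sample_size:
        return files
    # Resource disabled: trade coverage for speed.
    return rng.sample(files, sample_size)


if __name__ == "__main__":
    stdlib = [f"module_{i}.py" for i in range(500)]  # stand-in file names
    print(len(select_files(stdlib, cpu_enabled=True)))
    print(len(select_files(stdlib, cpu_enabled=False)))
```

The trade-off is the one discussed in this thread: a random sample keeps the run short but may miss the rarely used syntax constructs that the full pass is meant to catch.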




Re: [Python-Dev] Extremely slow test modules

2017-06-03 Thread Antoine Pitrou
On Sat, 3 Jun 2017 15:28:18 +0300
Serhiy Storchaka  wrote:
> 
> test_tools (in particular the test for the unparse.py script), 
> test_tokenize, and test_lib2to3 read and proceed every Python file in 
> the stdlib. This is necessary in full test run because some syntax 
> constructs are very rarely used.

There's no need to parse the whole stdlib for that.  Just parse a
couple files with the required syntax constructs (for example the test
suite, which by construction should have all of them).

> This is controlled by the cpy resource. 
> I suggested to disable it on the slowest buildbots (-uall,-cpu). In that 
> case tests are ran only for few random files.

I don't really care about the buildbots, but I care about CI
turnaround.  A Travis-CI test run takes 24 minutes.  Assuming it uses 4
cores and those 4 tests take more than 6 minutes each, that means we
could almost shave 6 minutes (25%) on the duration of the Travis-CI
test run.

Regards

Antoine.




Re: [Python-Dev] Extremely slow test modules

2017-06-03 Thread Antoine Pitrou
On Sat, 3 Jun 2017 15:01:07 +0200
Antoine Pitrou  wrote:
> 
> > This is controlled by the cpy resource. 
> > I suggested to disable it on the slowest buildbots (-uall,-cpu). In that 
> > case tests are ran only for few random files.  
> 
> I don't really care about the buildbots, but I care about CI
> turnaround.  A Travis-CI test run takes 24 minutes.  Assuming it uses 4
> cores and those 4 tests take more than 6 minutes each, that means we
> could almost shave 6 minutes (25%) on the duration of the Travis-CI
> test run.

And if, as it is likely, Travis-CI only exposes 2 CPU cores, we could
actually shave 12 minutes off of each 24 minute CI run...
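
The back-of-the-envelope estimate can be reproduced with the timings quoted at the start of the thread. This sketch assumes the test runner keeps every core perfectly busy and that the four slow tests could be reduced to roughly zero cost, so it gives an upper bound rather than a prediction; the function name is made up for the example.

```python
# Durations reported for the four slow tests (Travis CI, -uall mode).
SLOW_TESTS_SEC = {
    "test_tools": 7 * 60 + 41,
    "test_tokenize": 6 * 60 + 23,
    "test_datetime": 6 * 60 + 3,
    "test_lib2to3": 5 * 60 + 25,
}


def estimated_saving_minutes(durations_sec, cores):
    """Upper bound on wall-clock minutes saved if these tests cost nothing.

    Assumes the work is spread evenly over `cores` parallel workers.
    """
    if cores < 1:
        raise ValueError("cores must be >= 1")
    return sum(durations_sec.values()) / cores / 60


if __name__ == "__main__":
    for cores in (4, 2):
        saved = estimated_saving_minutes(SLOW_TESTS_SEC, cores)
        print(f"{cores} cores: up to {saved:.1f} of 24 minutes")
```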

Regards

Antoine.




Re: [Python-Dev] Extremely slow test modules

2017-06-03 Thread Serhiy Storchaka

03.06.17 16:01, Antoine Pitrou wrote:

> On Sat, 3 Jun 2017 15:28:18 +0300
> Serhiy Storchaka  wrote:
> >
> > test_tools (in particular the test for the unparse.py script),
> > test_tokenize, and test_lib2to3 read and proceed every Python file in
> > the stdlib. This is necessary in full test run because some syntax
> > constructs are very rarely used.
>
> There's no need to parse the whole stdlib for that.  Just parse a
> couple files with the required syntax constructs (for example the test
> suite, which by construction should have all of them).


We don't know which files those are. It may be possible (and even 
likely) that parsing the whole stdlib is not enough. Ideally the tests 
should parse the whole world, but this is impossible for various reasons.



> > This is controlled by the cpy resource.
> > I suggested to disable it on the slowest buildbots (-uall,-cpu). In that
> > case tests are ran only for few random files.
>
> I don't really care about the buildbots, but I care about CI
> turnaround.  A Travis-CI test run takes 24 minutes.  Assuming it uses 4
> cores and those 4 tests take more than 6 minutes each, that means we
> could almost shave 6 minutes (25%) on the duration of the Travis-CI
> test run.


They could be disabled on Travis-CI too.



Re: [Python-Dev] PEP 7 and braces { .... } on if

2017-06-03 Thread Serhiy Storchaka
More about braces. PEP 7 implicitly forbids breaking the line before an 
opening brace. An opening brace should stay at the end of the line of the 
outer compound statement.


    if (mro != NULL) {
        ...
    }
    else {
        ...
    }

    if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
        type->tp_dictoffset == b_size &&
        (size_t)t_size == b_size + sizeof(PyObject *)) {
        return 0; /* "Forgive" adding a __dict__ only */
    }

But in the latter example the continuation lines are indented at the same 
level as the following block of code. I propose to make an exception for 
that case and allow moving the opening brace to the start of the next line.


    if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
        type->tp_dictoffset == b_size &&
        (size_t)t_size == b_size + sizeof(PyObject *))
    {
        return 0; /* "Forgive" adding a __dict__ only */
    }

This adds a visual separation of a multiline condition from the 
following code.




Re: [Python-Dev] PEP 7 and braces { .... } on if

2017-06-03 Thread Barry Warsaw
On Jun 03, 2017, at 07:25 PM, Serhiy Storchaka wrote:

>But the latter example continuation lines are intended at the same level as
>the following block of code. I propose to make exception for that case and
>allow moving an open brace to the start of the next line.
>
>    if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
>        type->tp_dictoffset == b_size &&
>        (size_t)t_size == b_size + sizeof(PyObject *))
>    {
>        return 0; /* "Forgive" adding a __dict__ only */
>    }

Agreed!

https://github.com/python/peps/issues/283
https://github.com/python/peps/pull/284

Cheers,
-Barry


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-03 Thread Tim Peters
For fun, let's multiply everything by 256:

- A "pool" becomes 1 MB.
- An "arena" becomes 64 MB.

As briefly suggested before, then for any given size class a pool
could pass out hundreds of times more objects before needing to fall
back on the slower code creating new pools or new arenas.

As an added bonus, programs would finish much sooner due to the flurry
of segfaults from Py_ADDRESS_IN_RANGE ;-)

But greatly increasing the pool size also makes a different
implementation of that much more attractive:  an obvious one.  That
is, obmalloc could model its address space with a bit vector, one bit
per pool-aligned address.  For a given address, shift it right by 20
bits (divide by 1MB) and use what remains as the bit vector index.  If
the bit is set, obmalloc manages that MB, else (or if the bit address
is out of the vector's domain) it doesn't.  The system page size would
become irrelevant to its operation, and it would play nice with
magical memory debuggers (it would never access memory obmalloc hadn't
first allocated and initialized itself).

A virtual address space span of a terabyte could hold 1M pools, so
would "only" need a 1M/8 = 128KB bit vector.  That's minor compared to
a terabyte (one bit per megabyte).
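
A toy model of that bit vector, written in Python rather than obmalloc's C purely for illustration. The class name `PoolMap` and its methods are hypothetical; the 1 MB pool size and 1 TB span are the figures from this message, and nothing here reflects how obmalloc is actually implemented.

```python
POOL_BITS = 20                       # 1 MB pools: shift addresses right by 20
SPAN_BYTES = 1 << 40                 # model a 1 TB span of virtual addresses
NUM_POOLS = SPAN_BYTES >> POOL_BITS  # 1M pool-aligned slots


class PoolMap:
    """Bit vector with one bit per pool-aligned megabyte (illustrative)."""

    def __init__(self, num_pools=NUM_POOLS):
        self.num_pools = num_pools
        self.bits = bytearray((num_pools + 7) // 8)

    def mark_pool(self, address):
        """Record that the allocator owns the pool containing `address`."""
        index = address >> POOL_BITS
        if not 0 <= index < self.num_pools:
            raise ValueError("address outside the modelled span")
        self.bits[index >> 3] |= 1 << (index & 7)

    def manages(self, address):
        """True if the pool containing `address` was marked as owned."""
        index = address >> POOL_BITS
        if not 0 <= index < self.num_pools:
            return False  # outside the vector's domain: not ours
        return bool(self.bits[index >> 3] & (1 << (index & 7)))


if __name__ == "__main__":
    pools = PoolMap()
    print(len(pools.bits) // 1024, "KB bit vector")
    pools.mark_pool(5 << POOL_BITS)
    print(pools.manages((5 << POOL_BITS) + 12345))
    print(pools.manages(6 << POOL_BITS))
```

The lookup never touches the memory at `address` itself, only the vector, which is the property that would keep memory debuggers happy.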

Of course using a bit per 4KB (the current pool size) is less
attractive - by a factor of 256.  Which is why that wasn't even tried.

Note that trying to play the same trick with arenas instead would be
at best more complicated.  The system calls can't be relied on to
return arena-aligned _or_ pool-aligned addresses.  obmalloc itself
forces pool-alignment of pool base addresses, by (if necessary)
ignoring some number of the leading bytes in an arena.  That makes
useful arithmetic on pool addresses uniform and simple.
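
The alignment step can be sketched as plain integer arithmetic. The sizes are the current ones implied by this message (4 KB pools, 256 KB arenas); the function names are invented for the example and the code is a simplified model, not obmalloc's C implementation.

```python
ARENA_SIZE = 256 * 1024  # current arena size in bytes
POOL_SIZE = 4 * 1024     # current pool size in bytes


def carve_pools(arena_base, arena_size=ARENA_SIZE, pool_size=POOL_SIZE):
    """Return (first_pool_address, usable_pool_count) for one arena.

    If the system hands back an address that is not pool-aligned, the
    leading bytes up to the next pool boundary are ignored, which also
    costs one pool's worth of space overall.
    """
    excess = arena_base & (pool_size - 1)
    first_pool = arena_base
    npools = arena_size // pool_size
    if excess:
        first_pool += pool_size - excess  # skip the unaligned leading bytes
        npools -= 1
    return first_pool, npools


def pool_base(address, pool_size=POOL_SIZE):
    """Round an address down to the base of the pool that contains it."""
    return address & ~(pool_size - 1)


if __name__ == "__main__":
    print(carve_pools(0x10000))
    print(carve_pools(0x10010))
    print(hex(pool_base(0x11ABC)))
```

Because every pool base is a multiple of the pool size, finding the pool for an arbitrary object address is a single mask operation.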