Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE
On Fri, 2 Jun 2017 12:31:19 -0700 Larry Hastings wrote:
>
> Anyway, I'm not super excited by the prospect of using obmalloc for
> larger objects. There's an inverse relation between the size of
> allocation and the frequency of allocation. In Python there are lots of
> tiny allocations, but fewer and fewer as the size increases. (A
> similarly-shaped graph to what retailers call the "long tail".) By no
> small coincidence, obmalloc is great at small objects, which is where we
> needed the help most. Let's leave it at that.

+1 to that and nice explanation.

> A more fruitful endeavor might be to try one of these fancy new
> third-party allocators in CPython, e.g. tcmalloc, jemalloc. Try each
> with both obmalloc turned on and turned off, and see what happens to
> performance and memory usage. (I'd try it myself, but I'm already so
> far behind on watching funny cat videos.)

We should lobby for a ban on funny cat videos so that you spend more
time on CPython.

Regards

Antoine.
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Extremely slow test modules
Hi,

Is there a reason some of our tests are excruciatingly slow in `-uall`
mode? `test_multiprocessing_spawn` is understandable (after all, it
will spawn a new executable for each subprocess), but other tests leave
me baffled:

- test_tools: 7 min 41 sec
- test_tokenize: 6 min 23 sec
- test_datetime: 6 min 3 sec
- test_lib2to3: 5 min 25 sec

[excerpt from recent Travis CI logs]

Why does datetime, 2to3 or tokenize testing take so long? And do we
have so many tools that it should take 7 minutes to run all of them?
I must admit, I don't understand how we got to such a point.

Regards

Antoine.
Re: [Python-Dev] Extremely slow test modules
03.06.17 13:31, Antoine Pitrou wrote:
> Is there a reason some of our tests are excruciatingly slow in `-uall`
> mode? `test_multiprocessing_spawn` is understandable (after all, it
> will spawn a new executable for each subprocess), but other tests leave
> me baffled:
>
> - test_tools: 7 min 41 sec
> - test_tokenize: 6 min 23 sec
> - test_datetime: 6 min 3 sec
> - test_lib2to3: 5 min 25 sec
>
> [excerpt from recent Travis CI logs]
>
> Why does datetime, 2to3 or tokenize testing take so long? And do we
> have so many tools that it should take 7 minutes to run all of them?
> I must admit, I don't understand how we got to such a point.

test_tools (in particular the test for the unparse.py script),
test_tokenize, and test_lib2to3 read and process every Python file in
the stdlib. This is necessary in a full test run because some syntax
constructs are very rarely used. This is controlled by the cpu resource.
I suggested disabling it on the slowest buildbots (-uall,-cpu). In that
case the tests are run only on a few random files.

test_datetime generates tests for all possible timezones. This is
controlled by the tzdata resource and can also be disabled on the
slowest buildbots.
Re: [Python-Dev] Extremely slow test modules
On Sat, 3 Jun 2017 15:28:18 +0300 Serhiy Storchaka wrote:
>
> test_tools (in particular the test for the unparse.py script),
> test_tokenize, and test_lib2to3 read and process every Python file in
> the stdlib. This is necessary in a full test run because some syntax
> constructs are very rarely used.

There's no need to parse the whole stdlib for that. Just parse a couple
of files with the required syntax constructs (for example the test
suite, which by construction should have all of them).

> This is controlled by the cpu resource.
> I suggested disabling it on the slowest buildbots (-uall,-cpu). In that
> case the tests are run only on a few random files.

I don't really care about the buildbots, but I care about CI
turnaround. A Travis-CI test run takes 24 minutes. Assuming it uses 4
cores and those 4 tests take more than 6 minutes each, that means we
could shave almost 6 minutes (25%) off the duration of the Travis-CI
test run.

Regards

Antoine.
Re: [Python-Dev] Extremely slow test modules
On Sat, 3 Jun 2017 15:01:07 +0200 Antoine Pitrou wrote:
>
> > This is controlled by the cpu resource.
> > I suggested disabling it on the slowest buildbots (-uall,-cpu). In that
> > case the tests are run only on a few random files.
>
> I don't really care about the buildbots, but I care about CI
> turnaround. A Travis-CI test run takes 24 minutes. Assuming it uses 4
> cores and those 4 tests take more than 6 minutes each, that means we
> could shave almost 6 minutes (25%) off the duration of the Travis-CI
> test run.

And if, as is likely, Travis-CI only exposes 2 CPU cores, we could
actually shave 12 minutes off each 24-minute CI run...

Regards

Antoine.
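
The estimate above is simple arithmetic and can be sketched in C as
follows. This is only a back-of-the-envelope model, not anything that
regrtest or Travis-CI computes: it assumes the four slow modules could
be balanced perfectly across the workers and that nothing else limits
the run. The durations are the ones from the Travis CI excerpt at the
start of the thread; the names slow_test_seconds, total_slow_seconds
and ideal_saving_minutes are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Durations of the four slow modules, in seconds, as listed in the
   Travis CI excerpt quoted at the start of the thread. */
static const int slow_test_seconds[] = {
    7 * 60 + 41,   /* test_tools */
    6 * 60 + 23,   /* test_tokenize */
    6 * 60 + 3,    /* test_datetime */
    5 * 60 + 25,   /* test_lib2to3 */
};

/* Total time spent in the four modules, in seconds. */
static int
total_slow_seconds(void)
{
    size_t n = sizeof slow_test_seconds / sizeof slow_test_seconds[0];
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += slow_test_seconds[i];
    }
    return total;
}

/* Upper bound on the wall-clock time saved, in minutes, if the four
   modules were skipped on a machine with `cores` workers.
   Assumption: the work is spread perfectly evenly over the workers. */
static double
ideal_saving_minutes(int cores)
{
    assert(cores > 0);
    return total_slow_seconds() / 60.0 / cores;
}
```

With the measured times, ideal_saving_minutes(4) comes out a little over
6 minutes and ideal_saving_minutes(2) a little under 13, in line with
the rounded 6 and 12 minute figures quoted in the thread.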
Re: [Python-Dev] Extremely slow test modules
03.06.17 16:01, Antoine Pitrou wrote:
> On Sat, 3 Jun 2017 15:28:18 +0300 Serhiy Storchaka wrote:
>> test_tools (in particular the test for the unparse.py script),
>> test_tokenize, and test_lib2to3 read and process every Python file in
>> the stdlib. This is necessary in a full test run because some syntax
>> constructs are very rarely used.
>
> There's no need to parse the whole stdlib for that. Just parse a couple
> of files with the required syntax constructs (for example the test
> suite, which by construction should have all of them).

We don't know what these files are. It is possible (and even likely)
that parsing the whole stdlib is not enough. Ideally the tests should
parse the whole world, but this is impossible for some reasons.

>> This is controlled by the cpu resource.
>> I suggested disabling it on the slowest buildbots (-uall,-cpu). In that
>> case the tests are run only on a few random files.
>
> I don't really care about the buildbots, but I care about CI
> turnaround. A Travis-CI test run takes 24 minutes. Assuming it uses 4
> cores and those 4 tests take more than 6 minutes each, that means we
> could shave almost 6 minutes (25%) off the duration of the Travis-CI
> test run.

They could be disabled on Travis-CI too.
Re: [Python-Dev] PEP 7 and braces { .... } on if
One more point about braces. PEP 7 implicitly forbids breaking the line
before an opening brace: the opening brace should stay at the end of the
line of the outer compound statement.

    if (mro != NULL) {
        ...
    }
    else {
        ...
    }

    if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
        type->tp_dictoffset == b_size &&
        (size_t)t_size == b_size + sizeof(PyObject *)) {
        return 0; /* "Forgive" adding a __dict__ only */
    }

But in the latter example the continuation lines are indented at the
same level as the following block of code. I propose making an exception
for that case and allowing the opening brace to move to the start of the
next line.

    if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
        type->tp_dictoffset == b_size &&
        (size_t)t_size == b_size + sizeof(PyObject *))
    {
        return 0; /* "Forgive" adding a __dict__ only */
    }

This adds a visual separation of a multiline condition from the
following code.
Re: [Python-Dev] PEP 7 and braces { .... } on if
On Jun 03, 2017, at 07:25 PM, Serhiy Storchaka wrote:
>But in the latter example the continuation lines are indented at the
>same level as the following block of code. I propose making an exception
>for that case and allowing the opening brace to move to the start of the
>next line.
>
>    if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
>        type->tp_dictoffset == b_size &&
>        (size_t)t_size == b_size + sizeof(PyObject *))
>    {
>        return 0; /* "Forgive" adding a __dict__ only */
>    }

Agreed!
https://github.com/python/peps/issues/283
https://github.com/python/peps/pull/284
Cheers,
-Barry
Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE
For fun, let's multiply everything by 256:

- A "pool" becomes 1 MB.
- An "arena" becomes 64 MB.

As briefly suggested before, then for any given size class a pool could
pass out hundreds of times more objects before needing to fall back on
the slower code creating new pools or new arenas.

As an added bonus, programs would finish much sooner due to the flurry
of segfaults from Py_ADDRESS_IN_RANGE ;-)

But greatly increasing the pool size also makes a different
implementation of that much more attractive: an obvious one. That is,
obmalloc could model its address space with a bit vector, one bit per
pool-aligned address. For a given address, shift it right by 20 bits
(divide by 1MB) and use what remains as the bit vector index. If the
bit is set, obmalloc manages that MB, else (or if the bit address is out
of the vector's domain) it doesn't. The system page size would become
irrelevant to its operation, and it would play nice with magical memory
debuggers (it would never access memory obmalloc hadn't first allocated
and initialized itself).

A virtual address space span of a terabyte could hold 1M pools, so would
"only" need a 1M/8 = 128KB bit vector. That's minor compared to a
terabyte (one bit per megabyte).

Of course using a bit per 4KB (the current pool size) is less
attractive - by a factor of 256. Which is why that wasn't even tried.

Note that trying to play the same trick with arenas instead would be at
best more complicated. The system calls can't be relied on to return
arena-aligned _or_ pool-aligned addresses. obmalloc itself forces
pool-alignment of pool base addresses, by (if necessary) ignoring some
number of the leading bytes in an arena. That makes useful arithmetic
on pool addresses uniform and simple.
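
The bit-vector lookup described above might look roughly like this in C.
This is a sketch of the idea only, not obmalloc's code: it assumes 1 MB
pools, a modelled span of 1 TB starting at address 0, and pool base
addresses that are already 1 MB aligned. POOL_BITS, SPAN_BITS, pool_map,
mark_pool and address_in_range are hypothetical names chosen for the
example.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_BITS 20   /* hypothetical: 1 MB pools, 2**20 bytes each */
#define SPAN_BITS 40   /* hypothetical: model a 1 TB span, 2**40 bytes */
#define NUM_POOLS ((uint64_t)1 << (SPAN_BITS - POOL_BITS))   /* 1M pools */

/* One bit per pool-aligned address: 1M bits = 128 KB. */
static unsigned char pool_map[NUM_POOLS / 8];

/* Record that the allocator manages the pool starting at `base`. */
static void
mark_pool(uint64_t base)
{
    uint64_t index = base >> POOL_BITS;
    assert((base & (((uint64_t)1 << POOL_BITS) - 1)) == 0);  /* aligned */
    assert(index < NUM_POOLS);
    pool_map[index >> 3] |= (unsigned char)(1u << (index & 7));
}

/* Return 1 if `addr` lies inside a managed pool, else 0.
   Only pool_map is read, never the memory at `addr` itself. */
static int
address_in_range(uint64_t addr)
{
    uint64_t index = addr >> POOL_BITS;
    if (index >= NUM_POOLS) {
        return 0;   /* outside the vector's domain */
    }
    return (pool_map[index >> 3] >> (index & 7)) & 1;
}
```

Because the check touches nothing but the bit vector, it does not depend
on the system page size and never reads memory the allocator did not set
up itself, which is the property the message points out.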
