STINNER Victor added the comment:

Hi,
> - How do you anticipate the integers naming domains will be allocated? Is it
> like port numbers, and you'll maintain a registry somewhere ("domains 0-100
> are reserved for the interpreter, pycuda has reserved 100-110, ...")?

I simply have no idea at all :-) I guess the best approach is to collaborate. It's not as if we plan to have 1000 different users of the API. Today, I only know two users: you and Antoine :-D unsigned int is large: at least 32 bits. My understanding is that we will only need something like 10 or maybe 20 domains.

> - You have it configurable at run-time whether the domain gets included in
> the trace key, and if this is set to false (the default) then all the
> different domains just get collapsed together. How is some downstream user
> like pycuda expected to get sensible behavior out of this? Is the idea that
> pycuda should be guarding all trace/untrace calls with if (use_domain) { ...
> }, or...?

In my first implementation, support for domains was always enabled. But I don't like that, since it increases the memory footprint even if you don't use domains, which is the case for *all* users today.

I added a flag, but I'm not happy with the API. For example, it's not possible to set the flag using the PYTHONTRACEMALLOC=nframe environment variable nor the -X tracemalloc[=nframe] command line option, whereas I'm a big fan of the -X tracemalloc=5 command line option.

Instead of requiring the flag to be enabled early and never changed afterwards, I should try to implement a dynamic flag: start without domains, and convert the whole table to support domains as soon as you store at least one trace with domain != 0. I pushed the first part of my patch (support for keys bigger than sizeof(void*)) to be able to focus on this second part. In fact, converting a table is no more complex than the regular "rehash" operation performed when the table becomes too big; that operation is already part of the hashtable design.
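The dynamic-flag idea can be illustrated with a small, hypothetical Python model (illustrative only, not CPython's actual C hashtable; the class and method names are invented): keys start as bare pointers, and the whole table is rebuilt with (domain, pointer) keys the first time a trace with a non-zero domain is stored.

```python
class TraceTable:
    """Toy model of the proposed dynamic key upgrade."""

    def __init__(self):
        self._use_domain = False  # start in the compact, pointer-only mode
        self._traces = {}

    def add(self, domain, ptr, size):
        if domain != 0 and not self._use_domain:
            # Equivalent of a "rehash": rebuild every existing key,
            # assigning the default domain 0 to the old entries.
            self._traces = {(0, p): s for p, s in self._traces.items()}
            self._use_domain = True
        key = (domain, ptr) if self._use_domain else ptr
        self._traces[key] = size

    def remove(self, domain, ptr):
        key = (domain, ptr) if self._use_domain else ptr
        self._traces.pop(key, None)
```

Until a non-zero domain appears, the table pays no per-entry cost for the domain field, which addresses the memory-footprint concern above.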
> (I guess it could be sensible to support disabling domain stuff -- though if
> I were writing it I probably wouldn't bother just because it increases the
> number of configurations that need to be tested etc. -- but if you're doing
> this then IMO it should discard all trace/untrace calls that refer to
> non-default domains, i.e. domain=False shouldn't mean "discard domain
> information", it should mean "trace only the heap domain".)

The design of tracemalloc is more to collect everything, but filter later. That's why I added an optional domain parameter to tracemalloc.Filter and a new tracemalloc.DomainFilter filter.

> - Is all this messing about with possibly-unaligned pointers etc. really
> easier than having one-hashtable-per-domain? You could easily get away with a
> linearly scanned array, even...

Having one hash table per domain requires allocating a new hashtable when we start to use a domain, and then deciding when it's time to release it. We would also need a "registry" for all trace hashtables: a simple short array, or maybe even another hashtable?

But I think we disagree on how traces will be used. It looks like you only care about statistics *per domain*. IMHO it's worth being able to get the *total* memory (ex: CPU+GPU memory) per filename or per line number, but also having the ability to filter domains. For example, exclude CPU to get statistics of all GPU domains, or get CPU+shmem statistics. I know that technically the storage doesn't matter much: we can still get traces of all domains at once. I like the idea of having all traces in the same structure, even though I have to agree that the new C macros to get a key are uglier than the previous simple direct "void* key" pointer.

Domains are quite strange.
They are part of the key in the hashtable, but _tracemalloc._get_traces() now returns them as part of the "value" (with my patch) :-)

----------
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26588>
_______________________________________