Re: [Python-Dev] Investigating Python memory footprint of one real Web application
On Fri, Jan 20, 2017 at 1:40 PM, Christian Heimes wrote:

> On 2017-01-20 13:15, INADA Naoki wrote:
>>
>>> "this script counts static memory usage. It doesn’t care about dynamic
>>> memory usage of processing real request"
>>>
>>> You may be trying to optimize something which is only a very small
>>> fraction of your actual memory footprint. That said, the marshal
>>> module could certainly try to intern some tuples and other immutable
>>> structures.
>>>
>>
>> Yes. I hadn't thought static memory footprint was so important.
>>
>> But Instagram tried to increase the CoW efficiency of a prefork
>> application, and got some success with memory usage and CPU throughput.
>> That surprised me, because prefork only shares the static memory
>> footprint.
>>
>> Maybe sharing some of the tuples that code objects hold could increase
>> cache efficiency. I'll try running pyperformance with the marshal patch.
>
> IIRC Thomas Wouters (?) has been working on a patch to move the ref
> counter out of the PyObject struct and into a dedicated memory area. He
> proposed the idea to improve cache affinity, reduce cache evictions and
> to make CoW more efficient. Especially modern ccNUMA machines with
> multiple processors could benefit from the improvement, but also single
> processor/multi core machines.

FWIW, I have a working patch for that (against trunk a few months back,
even though the original idea was for the gilectomy branch), moving just
the refcount and not PyGC_HEAD.

Performance-wise, in the benchmarks it's a small but consistent loss (2-5%
on a noisy machine, as measured by python-benchmarks, not perf), and it
breaks the ABI as well as any code that dereferences PyObject.ob_refcnt
directly (the field was repurposed and renamed, and exposed as a const* to
avoid direct assignment). It also exposes the API awkwardness that CPython
doesn't *require* objects to go through a specific mechanism for object
initialisation, even though nearly all extension modules do so. (That same
API awkwardness made life a little harder when experimenting with BDW GC
:P.)

I don't believe external refcounts can be made the default without
carefully designing a new set of PyObject API calls and deprecating the
old ones.

--
Thomas Wouters

Hi! I'm an email virus! Think twice before sending your email to help me
spread!
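As a rough illustration of the copy-on-write effect being discussed (not
part of the thread or of the patch; Linux-only, and the numbers are purely
illustrative), merely reading objects in a forked child writes to their
refcount fields and dirties pages that would otherwise stay shared with
the parent:

# Linux-only sketch: measure how much private (unshared) memory a child
# accumulates just by *reading* data built before the fork.
import os

def private_kb():
    """Sum of private pages of this process, in kB (from /proc/self/smaps)."""
    total = 0
    with open("/proc/self/smaps") as f:
        for line in f:
            if line.startswith(("Private_Clean:", "Private_Dirty:")):
                total += int(line.split()[1])
    return total

# Build a large "static" structure in the parent, as a prefork server would.
data = [str(i) for i in range(1_000_000)]

pid = os.fork()
if pid == 0:  # child
    before = private_kb()
    for item in data:          # read-only traversal still updates refcounts
        pass
    after = private_kb()
    print("child private memory grew by ~%d kB" % (after - before))
    os._exit(0)
os.waitpid(pid, 0)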
Re: [Python-Dev] Investigating Python memory footprint of one real Web application
1. It looks like there is still room for performance improvement in typing
w.r.t. how ABCs and issubclass() work. I will try to play with this soon.
(The basic idea is that some steps could be avoided for parameterized
generics.)

2. I am +1 on having three separate options to independently ignore
asserts, docstrings, and annotations.

3. I am -1 on ignoring annotations altogether. Sometimes they can be
helpful at runtime: typing.NamedTuple and mypy_extensions.TypedDict are
two examples (see the sketch after this message). Also, some people use
annotations for runtime checks or even for things unrelated to typing. I
think it would be a pity to lose these functionalities for small
performance gains.

--
Ivan
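To illustrate the runtime use of annotations mentioned in point 3 (a small
sketch, not from the thread), typing.NamedTuple reads the annotations of
the class body to decide which fields the resulting tuple type has:

# Assumes Python 3.6.1+ for the class-based NamedTuple syntax with defaults.
from typing import NamedTuple

class Point(NamedTuple):
    x: int
    y: int = 0

p = Point(3)
print(p)              # Point(x=3, y=0)
print(Point._fields)  # ('x', 'y') - derived from the class annotations

If annotations were compiled away unconditionally, patterns like this
would no longer work as intended.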
Re: [Python-Dev] Investigating Python memory footprint of one real Web application
> 3. I am -1 on ignoring annotations altogether. Sometimes they could be
> helpful at runtime: typing.NamedTuple and mypy_extensions.TypedDict are
> two examples.

Ignoring annotations doesn't mean ignoring typing altogether. You can use
typing.NamedTuple even when functions don't have __annotations__.

> Also some people use annotations for runtime checks or even for things
> unrelated to typing. I think it would be a pity to lose these
> functionalities for small performance gains.

Sure. It should be an option, for backward compatibility.

Regards,
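For context on what the existing optimization options already cover (a
small sketch, not from the thread): -O strips asserts and -OO additionally
strips docstrings, but neither touches annotations, which is the gap the
proposed extra option would fill. The same levels can be tried from within
Python via compile():

ns = {}
src = '''
def f(x: int) -> int:
    "docstring"
    assert x >= 0
    return x
'''
exec(compile(src, "<demo>", "exec", optimize=2), ns)   # optimize=2 ~ -OO
f = ns["f"]
print(f.__doc__)          # None: the docstring was dropped
print(f(-1))              # -1: the assert was stripped, no AssertionError
print(f.__annotations__)  # {'x': <class 'int'>, 'return': <class 'int'>}: kept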
Re: [Python-Dev] Investigating Python memory footprint of one real Web application
FWIW, I tried to skip compiler_visit_annotations() in Python/compile.c:

a) default:              41278060
b) remove annotations:   37140094
c) (b) + const merge:    35933436

(a-b)/a = 10%
(a-c)/a = 13%

And here are the top 3 tracebacks from tracemalloc:

15109615 (/180598)
  File "", line 488
  File "", line 780
  File "", line 675
  File "", line 655

1255632 (/8316)
  File "/home/inada-n/local/cpython/lib/python3.7/_weakrefset.py", line 84
    self.data.add(ref(item, self._remove))
  File "/home/inada-n/local/cpython/lib/python3.7/abc.py", line 230
    cls._abc_negative_cache.add(subclass)
  File "/home/inada-n/local/cpython/lib/python3.7/abc.py", line 226
    if issubclass(subclass, scls):
  File "/home/inada-n/local/cpython/lib/python3.7/abc.py", line 226
    if issubclass(subclass, scls):

1056744 (/4020)
  File "/home/inada-n/local/cpython/lib/python3.7/abc.py", line 133
    cls = super().__new__(mcls, name, bases, namespace)
  File "/home/inada-n/local/cpython/lib/python3.7/typing.py", line 125
    return super().__new__(cls, name, bases, namespace)
  File "/home/inada-n/local/cpython/lib/python3.7/typing.py", line 977
    self = super().__new__(cls, name, bases, namespace, _root=True)
  File "/home/inada-n/local/cpython/lib/python3.7/typing.py", line 1105
    orig_bases=self.__orig_bases__)

Regards,
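For readers who want to reproduce numbers like the ones above, here is a
minimal tracemalloc sketch (not the actual measurement script from this
thread); the frame depth of 10 mirrors the PYTHONTRACEMALLOC=10 setting
mentioned later in the thread:

import tracemalloc

tracemalloc.start(10)      # record up to 10 frames per allocation

import json                # stand-in for the application's real imports

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("traceback")[:3]:
    print("%d bytes in %d blocks" % (stat.size, stat.count))
    for line in stat.traceback.format():
        print(line)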
Re: [Python-Dev] Investigating Python memory footprint of one real Web application
2017-01-24 15:00 GMT+01:00 INADA Naoki:
> And here are top 3 tracebacks from tracemalloc:
>
> 15109615 (/180598)
> File "", line 488
> File "", line 780
> File "", line 675
> File "", line 655
FYI at Python startup, usually the largest memory block comes from the
dictionary used to intern all strings ("interned" in unicodeobject.c).
The traceback is never relevant for this specific object.
Victor
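(A tiny aside, not from the thread: the dictionary Victor refers to backs
the same mechanism exposed to Python code as sys.intern(), which maps
equal strings onto a single shared object.)

import sys

a = "".join(["mod", "ule_name"])         # built at runtime, not interned
b = "module_name"                        # identifier-like literal, interned
print(a == b, a is b)                    # True False (typically)
print(sys.intern(a) is sys.intern(b))    # True: both map to one stored object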
Re: [Python-Dev] Investigating Python memory footprint of one real Web application
On Tue, Jan 24, 2017 at 11:08 PM, Victor Stinner wrote:
> 2017-01-24 15:00 GMT+01:00 INADA Naoki :
>> And here are top 3 tracebacks from tracemalloc:
>>
>> 15109615 (/180598)
>> File "", line 488
>> File "", line 780
>> File "", line 675
>> File "", line 655
>
> FYI at Python startup, usually the largest memory block comes from the
> dictionary used to intern all strings ("interned" in unicodeobject.c).
> The traceback is never relevant for this specific object.
>
> Victor
Yes! It took me a few hours to notice that.

With PYTHONTRACEMALLOC=10, marshal.loads() of a small module (15 KB pyc)
looks like it's eating 1.3 MB.

I think a small stacktrace depth (3-4) is better for showing a summary of
a large application.

BTW, about 1.3 MB of the 15 MB (marshal.loads()) was for the intern dict,
as far as I remember.
Re: [Python-Dev] Investigating Python memory footprint of one real Web application
On Jan 24, 2017 3:35 AM, "Thomas Wouters" wrote:

> FWIW, I have a working patch for that (against trunk a few months back,
> even though the original idea was for the gilectomy branch), moving just
> the refcount and not PyGC_HEAD. [...]

The thing I found most surprising about that blog post was that, contrary
to common wisdom, refcnt updates per se had essentially no effect on the
amount of memory shared between CoW processes, and the problems were all
due to the cycle collector. (Though I guess it's still possible that part
of the problems caused by the cycle collector are due to it touching
ob_refcnt.)

It's promising too, though, because the GC metadata is much less exposed
to extension modules than PyObject_HEAD is, and the access patterns are
presumably (?) much more bursty. It'd be really interesting to see how
things performed with just PyGC_HEAD, but *not* ob_refcnt, packed into a
dedicated region.

-n
Re: [Python-Dev] Investigating Python memory footprint of one real Web application
On Tue, 24 Jan 2017 10:21:45 -0800, Nathaniel Smith wrote:

> The thing I found most surprising about that blog post was that contrary
> to common wisdom, refcnt updates per se had essentially no effect on the
> amount of memory shared between CoW processes, and the problems were all
> due to the cycle collector.

Indeed, it was unexpected, though it can be explained easily: refcount
updates touch only the live working set, while GC passes scan through all
existing objects, even those that are never actually used.

Regards

Antoine.
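(Not proposed in this thread, but a related later addition that follows
from this observation: CPython 3.7 gained gc.freeze(), which moves all
currently tracked objects into a permanent generation that future
collections skip, so the collector stops writing into pages a prefork
parent shares with its children. A rough sketch of the intended usage:)

import gc
import os

# Parent: finish imports and warm-up first, then stop the collector from
# ever scanning (and thereby dirtying) the objects that exist pre-fork.
gc.disable()
gc.collect()     # clean up garbage created during startup
gc.freeze()      # existing objects are ignored by future collections

for _ in range(4):               # illustrative prefork loop
    if os.fork() == 0:
        gc.enable()              # children collect only post-fork garbage
        # ... serve requests ...
        os._exit(0)

for _ in range(4):
    os.wait()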
[Python-Dev] Generator objects and list comprehensions?
Hi,

Glyph pointed this out to me here:
http://twistedmatrix.com/pipermail/twisted-python/2017-January/031106.html

If I do this on Python 3.6:

>>> [(yield 1) for x in range(10)]
<generator object <listcomp> at 0x10cd210f8>

If I understand this:
https://docs.python.org/3/reference/expressions.html#list-displays
then this is a list display and should give a list, not a generator object.

Is there a bug in Python, or does the documentation need to be updated?

--
Craig
Re: [Python-Dev] Generator objects and list comprehensions?
On Wed, Jan 25, 2017 at 4:38 PM, Craig Rodrigues wrote:

> If I do this on Python 3.6:
>
> >>> [(yield 1) for x in range(10)]
> <generator object <listcomp> at 0x10cd210f8>
>
> If I understand this:
> https://docs.python.org/3/reference/expressions.html#list-displays
> then this is a list display and should give a list, not a generator object.
>
> Is there a bug in Python, or does the documentation need to be updated?

That looks like an odd interaction between yield expressions and list
comps. Since a list comprehension is actually implemented as a nested
function, your code actually looks more-or-less like this:

>>> def <listcomp>(iter):
...     result = []
...     for x in iter:
...         result.append((yield 1))
...     return result
...
>>> <listcomp>(iter(range(10)))
<generator object <listcomp> at 0x10cd210f8>

This function is a generator, and calling it returns what you see above.
If you step that iterator, it'll yield 1 ten times, and then raise
StopIteration with the resulting list.

Based on a cursory examination of the issue at hand, I think what you're
working with might be functioning as a coroutine? If so, you may find that
using "await" instead of "yield" dodges the problem, as it won't turn the
list comp into a generator. But I can't be 100% certain of that. (Also,
that would definitely stop you from having single-codebase 2.7/3.x code.)

ChrisA
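(A small sketch of the behaviour ChrisA describes, not from the thread;
it relies on Python 3.6/3.7 semantics, and the construct became a
SyntaxError in 3.8. Stepping the "comprehension" yields 1 ten times, and
the list it built travels out on StopIteration:)

gen = [(yield 1) for x in range(10)]   # a generator object, not a list

values = []
try:
    while True:
        values.append(next(gen))       # collects ten 1s
except StopIteration as exc:
    print(values)      # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
    print(exc.value)   # [None, None, ...]: the list the comprehension built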
