[Twisted-Python] Circular references in TLSMemoryBIOProtocol

2018-01-17 Thread Ilya Skriblovsky
Hello,

I have the Twisted app that serves tons of short-lived TLS connections
using TLSMemoryBIOFactory. I usually set loosened garbage collector
thresholds in production environment for the sake of performance. But I've
noticed that this app's RAM usage quickly grows up to unreasonable values.
Digging into the issue using pdb and objgraph showed that protocol
instances are still living long after they were closed.

I found two circular dependencies which are created for each TLS connection:
1. Between twisted.protocols.policies.ProtocolWrapper and its
self.wrappedProtocol
2. Between twisted.protocols.tls.TLSMemoryBIOProtocol and its
self._tlsConnection

Both of them cause protocol instance to not be deleted when the connection
is closed. So all OpenSSL-related objects and all business-related data
attached to that protocol instance are still living untill the next GC
collection. This affects both RAM usage and performance (due to much more
often GC collections)

I've tried to fix both circular dependencies:

replaced
https://github.com/twisted/twisted/blob/trunk/src/twisted/protocols/policies.py#L75
 by
self.wrappedProtocol.makeConnection(weakref.proxy(self))
and replaced
https://github.com/twisted/twisted/blob/trunk/src/twisted/protocols/tls.py#L199
 by:
self._tlsConnection = self.factory._createConnection(weakref.proxy(self))

Memory usage pattern changed drastically after this change.

I've created demo script that makes 10k TLS loopback connections with GC
disabled and measures the number of objects are still living after the work
is done and total resident RAM consumption:
https://gist.github.com/IlyaSkriblovsky/4dd3abfd5f67c64b13f1c673f56466f9

Output without the fix:
N = 1 , K = 100
objects before 50136
DummyServerProtocols still living 1
objects after 439919
mem 778 mb

Output with the fix:
N = 1 , K = 100
objects before 50133
DummyServerProtocols still living 0
objects after 159919
mem 96 mb

So using weakrefs makes all protocol instances and instances of
TLSMemoryBIOProtocol to be deleted right after a connection is closed. Less
circular-dependent objects → less GC invocations → better performance. And
I see much nicer RAM usage pattern in my app.

Is it possible to fix circular deps in some more clean way? Can this be
solved at all while user's code is able to try to touch both sides of
circular dep after connection is closed? Please advice

Thanks for consideration

Best regards,
Ilya
___
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python


Re: [Twisted-Python] Circular references in TLSMemoryBIOProtocol

2018-01-17 Thread L. Daniel Burr
Hi Ilya,

On January 17, 2018 at 3:09:52 PM, Ilya Skriblovsky (ilyaskriblov...@gmail.com) 
wrote:

[Trimmed for context]

So using weakrefs makes all protocol instances and instances of 
TLSMemoryBIOProtocol to be deleted right after a connection is closed. Less 
circular-dependent objects → less GC invocations → better performance. And I 
see much nicer RAM usage pattern in my app.

Is it possible to fix circular deps in some more clean way? Can this be solved 
at all while user's code is able to try to touch both sides of circular dep 
after connection is closed? Please advice


Personally, I don’t mind the weaker approach, but if you wanted to be 
completely explicit, I’d look at modifying the connectionLost method of both 
the protocol and the protocol wrapper to break circular references.

Thanks for consideration

Best regards,
    Ilya

Hope this helps,

Daniel
--
L. Daniel Burr
ldanielb...@me.com
(312) 656-8387


___
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python


Re: [Twisted-Python] Circular references in TLSMemoryBIOProtocol

2018-01-17 Thread Glyph


> On Jan 17, 2018, at 1:09 PM, Ilya Skriblovsky  
> wrote:
> 
> Hello,
> 
> I have the Twisted app that serves tons of short-lived TLS connections using 
> TLSMemoryBIOFactory. I usually set loosened garbage collector thresholds in 
> production environment for the sake of performance. But I've noticed that 
> this app's RAM usage quickly grows up to unreasonable values. Digging into 
> the issue using pdb and objgraph showed that protocol instances are still 
> living long after they were closed.

This sounds like an issue that should be reported as a bug and fixed!

It would be great if you could come up with a performance regression test or 
benchmark which could validate that this doesn't regress, but, it's quite 
challenging to do this (especially for memory issues) so as long as it's 
adequately behaviorally tested I'm sure we could accept something.

> I found two circular dependencies which are created for each TLS connection:
> 1. Between twisted.protocols.policies.ProtocolWrapper and its 
> self.wrappedProtocol
> 2. Between twisted.protocols.tls.TLSMemoryBIOProtocol and its 
> self._tlsConnection
> 
> Both of them cause protocol instance to not be deleted when the connection is 
> closed. So all OpenSSL-related objects and all business-related data attached 
> to that protocol instance are still living untill the next GC collection. 
> This affects both RAM usage and performance (due to much more often GC 
> collections)
> 
> I've tried to fix both circular dependencies:
> 
> replaced 
> https://github.com/twisted/twisted/blob/trunk/src/twisted/protocols/policies.py#L75
>  
> 
>  by
> self.wrappedProtocol.makeConnection(weakref.proxy(self))
> and replaced 
> https://github.com/twisted/twisted/blob/trunk/src/twisted/protocols/tls.py#L199
>  
> 
>  by:
> self._tlsConnection = self.factory._createConnection(weakref.proxy(self))
> 
> Memory usage pattern changed drastically after this change.
> 
> I've created demo script that makes 10k TLS loopback connections with GC 
> disabled and measures the number of objects are still living after the work 
> is done and total resident RAM consumption:
> https://gist.github.com/IlyaSkriblovsky/4dd3abfd5f67c64b13f1c673f56466f9 
> 
> 
> Output without the fix:
> N = 1 , K = 100
> objects before 50136
> DummyServerProtocols still living 1
> objects after 439919
> mem 778 mb
> 
> Output with the fix:
> N = 1 , K = 100
> objects before 50133
> DummyServerProtocols still living 0
> objects after 159919
> mem 96 mb
> 
> So using weakrefs makes all protocol instances and instances of 
> TLSMemoryBIOProtocol to be deleted right after a connection is closed. Less 
> circular-dependent objects → less GC invocations → better performance. And I 
> see much nicer RAM usage pattern in my app.

Hooray!

> Is it possible to fix circular deps in some more clean way? Can this be 
> solved at all while user's code is able to try to touch both sides of 
> circular dep after connection is closed? Please advice

Protocols and transports have a fairly defined lifecycle, and as L. Daniel Burr 
already pointed out, it would probably be appropriate to explicitly break these 
reference cycles in connectionLost.

-g

> 
> Thanks for consideration
> 
> Best regards,
> Ilya
> ___
> Twisted-Python mailing list
> Twisted-Python@twistedmatrix.com
> https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python

___
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python