Hi Jean-Paul,

 

Thank you very much for the detailed answer, and my apologies for not 
providing OS details.  I’ve tested on CentOS and Red Hat EL variants, not FreeBSD 
as the ticket discussed.  It looks like Red Hat (EL 7.6) is using the epoll reactor, 
and the Windows side is using the select reactor.

 

Thanks for the direction on checking out sys.modules.  To keep the reactor 
from being loaded in the parent process, I can presumably move the Twisted 
imports in the multiprocessing child modules from the top of each module down 
into the run() functions.  I will see how far I need to go (e.g. whether I can 
continue using Twisted’s JSON logging or whether absolutely everything should 
be isolated until after child process startup).  Knowing that I need to head in 
that direction to avoid epoll or other potential reactor conflicts is very helpful.
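
A minimal sketch of that restructuring, assuming a multiprocessing.Process 
subclass (the Worker class and parent_has_reactor helper are hypothetical names, 
not from my actual code).  The only point is that the Twisted import happens 
inside run(), which executes in the child:

```python
import multiprocessing
import sys


class Worker(multiprocessing.Process):
    """Hypothetical child process that owns its own reactor."""

    def run(self):
        # Import Twisted only here, inside the child, so the parent never
        # instantiates a reactor (and, on Linux, never creates its epoll object).
        from twisted.internet import reactor

        # Placeholder for real startup work: stop as soon as the reactor runs.
        reactor.callWhenRunning(reactor.stop)
        reactor.run()


def parent_has_reactor():
    """Report whether the reactor module has been loaded in this process."""
    return "twisted.internet.reactor" in sys.modules


if __name__ == "__main__":
    # The parent should reach this point without having imported the reactor.
    assert not parent_has_reactor()
    worker = Worker()
    worker.start()
    worker.join()
```

Whether other Twisted modules (such as the logging pieces) can stay at module 
level depends on whether they pull in the reactor transitively, which this 
sketch does not answer.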

 

Reminds me of the G.I. Joe cartoon in the early 1980s that would end with, 
“knowing is half the battle.”

 

-Chris

 

 

From: Twisted-Python <twisted-python-boun...@twistedmatrix.com> On Behalf Of 
Jean-Paul Calderone
Sent: Friday, September 11, 2020 1:28 PM
To: Twisted general discussion <twisted-python@twistedmatrix.com>
Subject: Re: [Twisted-Python] doWrite on twisted.internet.tcp.Port

 

On Fri, Sep 11, 2020 at 1:34 PM <ch...@cmsconstruct.com> wrote:

Hey guys,

 

Last year I hit the condition discussed in this ticket, where doWrite is 
called on a twisted.internet.tcp.Port: 
https://twistedmatrix.com/trac/ticket/4759

 

I ignored it at the time since it only happened on Linux, and my main platform 
was Windows.  Now I’m coming back to it.  I’ll add context on the problem below, 
but first I want to ask a high-level design question about multiprocessing 
and Twisted:

 

Referencing Jean-Paul’s comment at the end of ticket 4759, I read that you 
shouldn’t fork a process (via the multiprocessing module) that already has a 
Twisted reactor.  Understood.  But what about a parent process (not doing 
anything Twisted) forking child processes, where each child process starts its 
own Twisted reactor?  Is that intended to work from the Twisted perspective?

 

To answer the asked question, I don't think there is rigorous (or even casual) 
testing of very much of Twisted in the context of "some Twisted code has been 
loaded into memory and then the process forked".  So while it seems like a 
reasonable thing, I wouldn't say there's currently much effort being put 
towards making it a supported usage of Twisted.  Of course this can change at 
almost any moment if someone decides to commit the effort.

 

To dig a bit further into the specific problem, even if you only import the 
reactor in the parent process and then fork a child and try to start the 
reactor in the child, I strongly suspect epollreactor will break.  This is 
because the epoll object is created by reactor instantiation (as opposed to 
being delayed until the reactor is run).  epoll objects have a lot of weird 
behavior.  See the Questions and Answers section of the epoll(7) man page.
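
As a rough illustration of one such behavior, here is a standard-library-only 
sketch (no Twisted involved) showing that an epoll object created before 
fork() is shared with the child: a change the child makes to the interest list 
is visible in the parent.  The function name is made up for this example, and 
this is not claimed to be the exact mechanism behind the ticket.

```python
import os
import select


def child_unregister_affects_parent():
    """Return True if a forked child's unregister() is seen by the parent.

    Returns None on platforms without epoll or fork (e.g. Windows, macOS).
    """
    if not hasattr(select, "epoll") or not hasattr(os, "fork"):
        return None

    read_fd, write_fd = os.pipe()
    ep = select.epoll()
    ep.register(read_fd, select.EPOLLIN)

    pid = os.fork()
    if pid == 0:
        # Child: the inherited epoll fd refers to the same kernel epoll
        # instance as the parent's, so this edits the shared interest list.
        try:
            ep.unregister(read_fd)
        finally:
            os._exit(0)

    os.waitpid(pid, 0)

    # Parent: make the pipe readable, then ask epoll for events.
    os.write(write_fd, b"x")
    events = ep.poll(0.5)

    ep.close()
    os.close(read_fd)
    os.close(write_fd)

    # No events means the child's unregister removed the parent's registration.
    return events == []
```

On Linux the parent would be expected to see no events for the pipe even 
though it never unregistered anything itself.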

 

I don't know if this is the cause of your particular expression of these 
symptoms (it certainly doesn't apply to the original bug report which is on 
FreeBSD where there is no epoll) but it's at least a possible cause.

 

This could probably be fixed in Twisted by only creating the epoll object when 
run is called.  There's nothing particularly difficult about that change but it 
does involve touching a lot of the book-keeping logic since that all assumes it 
can register file descriptors before the reactor is started (think 
reactor.listenTCP(...); reactor.run()).

 

I'm not sure, but it may be that delaying only the creation of the waker until 
the reactor starts would also fix this.  As long as the epoll object remains 
empty, a lot of the weird behavior is avoided, and the waker is probably the 
only thing that actually gets added to it if you're just importing the reactor 
but not running it before forking.

 

Alternatively, your application should be able to fix it by studiously avoiding 
the import of twisted.internet.reactor (directly or transitively, of course).  
You could add some kind of assertion about the state of sys.modules immediately 
before your forking code to develop some confidence you've managed this.
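
A minimal sketch of such an assertion, using only the standard library.  The 
helper name is hypothetical, and the optional modules argument exists only so 
the check can be exercised without touching the real sys.modules:

```python
import sys

REACTOR_MODULE = "twisted.internet.reactor"


def assert_reactor_not_imported(modules=None):
    """Raise AssertionError if the reactor module has already been loaded."""
    modules = sys.modules if modules is None else modules
    if REACTOR_MODULE in modules:
        raise AssertionError(
            REACTOR_MODULE + " was imported before forking; "
            "move the import into the child process"
        )


# Intended use: call this immediately before creating child processes, e.g.
#     assert_reactor_not_imported()
#     process.start()
```

An explicit raise is used rather than a bare assert statement so the check 
still runs under python -O.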

 

And if this is really an epoll problem then switching to poll or select reactor 
would also presumably get rid of the issue.
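
For reference, a sketch of selecting one of those reactors explicitly.  The 
install_reactor wrapper is a hypothetical helper; the selectreactor and 
pollreactor modules and their install() functions are part of Twisted.  The 
call has to happen before anything imports twisted.internet.reactor, otherwise 
install() raises ReactorAlreadyInstalledError.

```python
def install_reactor(kind="select"):
    """Install a non-epoll reactor and return it.

    Must run before any other code imports twisted.internet.reactor.
    """
    if kind == "select":
        from twisted.internet import selectreactor as chosen
    elif kind == "poll":
        from twisted.internet import pollreactor as chosen
    else:
        raise ValueError("unsupported reactor kind: %r" % (kind,))

    chosen.install()

    # After install(), this import returns the reactor chosen above.
    from twisted.internet import reactor
    return reactor
```

In the forking setup discussed here, this would presumably be called at the 
top of each child's run(), not in the parent.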

 

Jean-Paul

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
