Hi Jeremy,

On Nov 19 13:58, Jeremy Drake via Cygwin-patches wrote:
> On Tue, 19 Nov 2024, Corinna Vinschen wrote:
> 
> > On Nov 19 11:06, Jeremy Drake via Cygwin-patches wrote:
> > > (I searched for other callers of terminate_thread after this, and the only
> > > one left without CancelSynchronousIo is in ldap.cc and I'm pretty sure
> > > that's a "can't happen" case.)
> >
> > I'm inclined to push your patch, but I'm not quite sure here...
> >
> > Yes, terminate_thread called from ldap.cc is an unlikely last resort kind
> > of thing.  And there are more places calling terminate_thread().  But none
> > of them are terminating the wait_thread, so they shouldn't matter in this
> > scenario.  I think?
> 
> I grepped in winsup/cygwin for TerminateThread (which only showed up in
> cygthread.cc) and terminate_thread, which shows up in cygthread.cc (where
> it's implemented), flock.cc (where I saw the CancelSynchronousIo trick in
> the first place), ldap.cc, and sigproc.cc (which is patched here).
> 
> I have not seen a hang in places other than at (or near) the termination
> of the wait_thread in proc_terminate.  That doesn't mean there aren't any,
> it is incredibly difficult to get a debugger to tell you where things are
> hung up.  The reason that I think this wait_thread scenario was
> problematic (and the reason that I'm not concerned about the couple of
> times I saw it hung "near" the TerminateThread call rather than at it) is
> that the lock in question was described as being a "code cache lock", and
> this double-fork scenario results in the process exiting almost
> immediately after being started.  The emulator is still translating and
> caching code both for the thread and for the process as a whole, so the
> wait_thread is holding the lock when it's terminated, and the main thread
> needs to acquire the lock to continue translating the rest of Cygwin's
> process shutdown code.

Ok.

> > Assuming they do matter, CancelSynchronousIo() only makes sense
> > if the thread hangs in synchronous IO.  Other internal threads use WFMO,
> > and I don't see that you can avoid a TerminateThread in these cases.
> > But those are covered by the neat SuspendThread/GetThreadContext trick,
> > right?
> 
> Yes, the SuspendThread/GetThreadConext trick makes sure the thread is out
> of emulation before terminating it.  The CancelSyncronousIo was something
> I tried before learning that.  It helped but was not sufficient because
> the thread may not have been in ReadFile at the time, so the thread still
> needed to be terminated in some cases.  I included it in this patch
> because TerminateThread is documented as being unsafe, so any effort to
> avoid using it would be an improvement.
> 
> For threads that are in WFMO, what I do in my own code is include a
> 'shutdown' event that code that wants the thread to shut down can set and
> then wait on the thread object to know when it's done shutting down.

Yeah, we're trying that too, see cygthread::detach().

Usually the code is following the idea of calling TerminateThread() only
in situations where we (quoting MSDN) "know exactly what the target
thread is doing, and you control all of the code that the target thread
could possibly be running at the time of the termination".

Your emulation scenario is beyond that, and there may always lurk YA bug...

Patch pushed.


Thanks,
Corinna

Reply via email to