Hi, Quoting Flávio Cruz (2015-08-27 00:26:13) > Em qua, 26 de ago de 2015 às 12:18, Justus Winter < > 4win...@informatik.uni-hamburg.de> escreveu: > > > > After looking around and using KDB, I've figured out that the following > loop in > > kern/exceptions.c/exception_raise_continue_slow was the culprit: > > Could you elaborate a little on your method? > > > I changed the code to chain all the allocated ports together in a linked list. > Then, I added new fields to ipc_port to track send-only rights and to > understand where those rights were being created and where the port was being > manipulated. Finally, I added a new KDB command to show all the inactive > ports. > It was then easy to track exactly how the port was handled inside > exception_raise_continue_slow.
Neat. Thanks for sharing. Fwiw, I've used Coccinelle for code instrumentations. Also for fixing up code in a mechanical way. It is an awesome tool :) > > while (mr == MACH_RCV_INTERRUPTED) { > > /* > > * Somebody is trying to force this thread > > * to a clean point. We must cooperate > > * and then resume the receive. > > */ > > > > while (thread_should_halt(self)) { > > /* don't terminate while holding a reference */ > > if (self->ast & AST_TERMINATE) > > ipc_port_release(reply_port); > > thread_halt_self(); > > } > > > > ip_lock(reply_port); > > .... > > } > > > > The reply port of the target thread (from CLISP?) is not being released > since > > it enters the while(thread_should_halt(self)) loop and the thread > terminates > > inside thread_halt_self but the previous condition does not hold, which > fails > > to release the port. I also think that once the code enters > thread_halt_self, > > it never comes back since the stack is discarded (for both if cases > inside the > > function). > > While I agree that the function thread_halt_self does not return, it > only terminates the thread if the AST_TERMINATE flag is set. > Otherwise, it returns to userspace. > > > Yes, it returns to userspace using thread_exception_return, but notice that a > reference must be released anyway (check first statement after the while > loop). Right. We have two references, one for the send-once right, and one for the port being saved in ith_port (see exception.c:386). > For the else case in thread_halt_self, the thread may throw away the current > stack and run thread_exception_return in thread_block, which is also executed > in the case of success in exception_raise_continue_slow. And indeed exception_raise_continue_fast releases two references too. > I find the use of `while' dodgy in the first place b/c if > thread_halt_self doesn't return, an if would do the trick as well and > don't imply the chance of that code looping... > > I think there might be some situations where it will loop around depending on > the scheduling order (see line 798 on sched_prim.c). Could be. I keep messing up the semantics of thread_invoke in my head >,< > I've added a continuation argument to thread_halt_self so that the port is > released if the thread resumes using the continuation (followed by > thread_exception_return). Otherwise, it may loop and it will eventually also > release the reference and call thread_exception_return. Yes. This looks like the clean solution. > I've also made some cosmetic changes to the code. Would be best to split the change up nevertheless. Thanks for looking into this :) Justus