Re: The problems for the rootless subhurd

Da Zheng Mon, 08 Jun 2009 23:26:08 -0700

olafbuddenha...@gmx.net wrote:

In order to track all tasks in subhurd, boot works as a proxy for all
RPCs on the task port,

[...]

However, it seems to be the source of the most serious bug in my
modified boot.


BUG: After I added the proxy for all RPCs to 'boot', I find that the
subhurd sometimes fails to boot. For example, it sometimes stops
booting after the system displays "GNU 0.3 (hurd) (console)", and it
sometimes boots successfully and displays "login>" but stops working
after I try to log in. Sometimes, it even prints an error message
like

   getty[47]: /bin/login: No such file or directory
   Use `login USER' to login, or `help' for more information.

Of course, sometimes subhurd can boot and I can login successfully.


Sounds like some kind of race condition... But I don't know where.

You could try tracing all RPCs made to the proxy (using some logging
mechanism in the proxy itself, or perhaps rpctrace), and comparing the
results of various runs...
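A minimal sketch of the comparison step suggested above, assuming each run's trace has already been reduced to an array of message IDs in arrival order. The function name first_divergence and the array representation are illustrative assumptions, not part of boot or rpctrace.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the index of the first entry at which two
   recorded sequences of message IDs differ.  Returns -1 if the traces
   are identical.  If one trace is a strict prefix of the other, returns
   the length of the shorter trace (the point where it stopped).  */
long
first_divergence (const int *a, size_t na, const int *b, size_t nb)
{
  size_t n = na < nb ? na : nb;
  size_t i;

  assert (a != NULL && b != NULL);
  for (i = 0; i < n; i++)
    if (a[i] != b[i])
      return (long) i;
  return na == nb ? -1 : (long) n;
}
```

Comparing a trace from a run that boots with one from a run that hangs this way would point at the first RPC where the two runs part ways, which is a reasonable place to start looking for a race.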

As I mentioned before, the subhurd sometimes hangs. I think I have found
one of the places where subhurd hangs.


The boot now proxies all RPCs that are sent on the task port.

The proxy works in a signal thread. For most RPCs it only forwards the
request, and the reply is sent back to the subhurd by the kernel. But
task_create, vm_set_default_memory_manager, processor_set_tasks and
host_processor_set_priv are handled by the proxy, and their replies are
sent back directly.

One place where the subhurd hangs is when the exec server calls vm_map
at some point. The proxy fails to forward the vm_map request, and
mach_msg blocks.
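The split between RPCs the proxy handles itself and RPCs it only forwards can be sketched roughly as below. This is an illustration under assumptions: the real proxy presumably dispatches on msgh_id rather than on names, and classify_rpc and the proxy_action enum are hypothetical, not taken from boot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum proxy_action
{
  PROXY_HANDLE_LOCALLY,   /* proxy performs the call and replies itself */
  PROXY_FORWARD           /* request is resent; the kernel delivers the reply */
};

/* Hypothetical helper: decide what to do with an RPC, identified here
   by its routine name for readability.  */
enum proxy_action
classify_rpc (const char *name)
{
  /* The four RPCs named in the message as handled by the proxy.  */
  static const char *const handled[] = {
    "task_create",
    "vm_set_default_memory_manager",
    "processor_set_tasks",
    "host_processor_set_priv",
  };
  size_t i;

  assert (name != NULL);
  for (i = 0; i < sizeof handled / sizeof handled[0]; i++)
    if (strcmp (name, handled[i]) == 0)
      return PROXY_HANDLE_LOCALLY;
  return PROXY_FORWARD;
}
```

Under this sketch vm_map falls into the forwarded group, which matches the description of the hang occurring on the forwarding path.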

The code for forwarding messages is as follows:

 debug ("request %d to %d, real target: %d", inp->msgh_id, target,
        task_pi->task_port);

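 /* Resend the message to the tracee.  MACH_SEND_TIMEOUT is set and the
    timeout argument is MACH_MSG_TIMEOUT_NONE (0), so the send is
    expected to give up immediately instead of blocking.  */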
 err = mach_msg (inp, MACH_SEND_MSG | MACH_SEND_TIMEOUT, inp->msgh_size, 0,
                 MACH_PORT_NULL, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
 outp->RetCode = MIG_NO_REPLY;
 if (err)
   {
     info ("mach_msg %d to %d: %s", inp->msgh_id, target, strerror (err));
     debug ("mach_msg %d to %d: %s", inp->msgh_id, target, strerror (err));
     outp->RetCode = err;
   }

out:
 debug ("request %d to %d ends", inp->msgh_id, target);

I have enabled the send timeout, and the time to wait before giving up
is 0 (I tried some other values, and that didn't seem to work either).
I don't understand why mach_msg still blocks even though the send
timeout is enabled.

It is also weird that the subhurd hangs only on the vm_map called by the
exec server (though I sometimes see the subhurd hang on something else,
which is definitely not one of the RPCs forwarded by boot).

I am wondering whether it has something to do with memory management,
e.g., some memory is swapped out but cannot be read back from the disk.
But that should not be possible, because the subhurd doesn't have its
own default memory manager and doesn't have its own swap partitions.


Does anyone have a clue why mach_msg blocks here?

Thank you,
Zheng Da
