Re: The problems for the rootless subhurd

Da Zheng Sun, 26 Apr 2009 14:02:14 -0700

olafbuddenha...@gmx.net wrote:

However, it seems to be the source of the most serious bug in my
modified boot.


BUG: After I added the proxy for all RPCs to 'boot', I find that
subhurd sometimes failed to boot. For example, it sometimes stops
booting after the system displays "GNU 0.3 (hurd) (console)" and it
sometimes boots successfully and displays "login>" but stops working
after I try to login. Sometimes, it even prints the error message
like

   getty[47]: /bin/login: No such file or directory
   Use `login USER' to login, or `help' for more information.

Of course, sometimes subhurd can boot and I can login successfully.


Sounds like some kind of race condition... But I don't know where.

You could try tracing all RPCs made to the proxy (using some logging
mechanism in the proxy itself, or perhaps rpctrace), and comparing the
results of various runs...
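
A minimal sketch of the run-to-run comparison suggested above. It assumes
the proxy writes one line per RPC in the form "<task> <rpc> <result>";
that log format, the function names and the sample lines are all
hypothetical, not what the modified boot actually produces.

```python
from collections import Counter

def tally(log_lines):
    """Count (rpc, result) pairs in one run's log.

    Assumed (hypothetical) line format: "<task> <rpc> <result>".
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        _task, rpc, result = fields
        counts[(rpc, result)] += 1
    return counts

def diff_runs(good, bad):
    """Return {(rpc, result): (count_in_good, count_in_bad)} where counts differ."""
    keys = set(good) | set(bad)
    return {k: (good[k], bad[k]) for k in sorted(keys) if good[k] != bad[k]}

# Made-up sample logs, only to show the shape of the comparison.
good_run = [
    "console mach_port_deallocate KERN_SUCCESS",
    "console mach_port_deallocate KERN_INVALID_NAME",
    "proc task_info KERN_SUCCESS",
]
bad_run = [
    "console mach_port_deallocate KERN_SUCCESS",
    "console mach_port_deallocate KERN_INVALID_NAME",
    "console mach_port_deallocate KERN_INVALID_NAME",
    "proc task_info KERN_INVALID_ARGUMENT",
]

differences = diff_runs(tally(good_run), tally(bad_run))
for key, (in_good, in_bad) in differences.items():
    print(key, "good:", in_good, "bad:", in_bad)
```

Because the total number of RPCs varies a lot between boots, raw count
differences are only a starting point; pairs that appear in one run and
never in the other are probably the more interesting ones.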

I logged all RPCs and tried to analyze them. (antrik, I was wrong. There
aren't 100,000 RPCs. The number of RPCs to Mach during the subhurd
boot is about 20,000 - 60,000.) I found some abnormal things, but I am
not sure whether they should be considered errors. I list all of them
below:

* Lots of mach_port_deallocate calls try to deallocate the port
  with the name 0. A few of them deallocate a port with the name -1.
  Some try to deallocate a port whose name is positive but still fail.
* Some mach_port_request_notification calls return "invalid name". All
  of these failed RPCs are sent from the same task and try to cancel
  a previous dead-name notification request.
* Some vm_region calls return "no space available". I assume that is
  a normal case. It's possible that there is no region at or above the
  given address in the specified task.
* Some task_info calls return an "invalid argument" error, probably
  because the task has already died.

The errors below are a bit rare. They don't always appear and don't seem
to be related to whether the subhurd boots successfully or not.

* Some vm_allocate calls return an "invalid argument" error. The Mach
  reference doesn't mention that vm_allocate can return this type of
  error. I assume the target task has died.
* I also saw mach_port_allocate and mach_port_mod_refs return an
  "invalid task" error once.
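
One way to check the "not related to boot success" impression is to
record, for each run, whether the subhurd booted and which rare errors
showed up. A sketch with made-up sample data; the function name and the
"rpc:result" labels are hypothetical.

```python
from collections import defaultdict

def errors_by_outcome(runs):
    """Map each error kind to [runs_booted_ok, runs_failed] in which it appeared."""
    seen = defaultdict(lambda: [0, 0])
    for booted_ok, errors in runs:
        for err in set(errors):
            seen[err][0 if booted_ok else 1] += 1
    return dict(seen)

# Each entry: (did the subhurd boot?, rare errors seen in that run).
# These four runs are invented for illustration only.
runs = [
    (True, {"vm_allocate:KERN_INVALID_ARGUMENT"}),
    (False, {"vm_allocate:KERN_INVALID_ARGUMENT",
             "mach_port_mod_refs:KERN_INVALID_TASK"}),
    (True, set()),
    (False, set()),
]

table = errors_by_outcome(runs)
for err, (ok, failed) in sorted(table.items()):
    print(err, "ok runs:", ok, "failed runs:", failed)
```

An error that shows up about equally often in good and bad boots is
unlikely to be the cause; one that only ever appears in failed boots
would deserve a closer look.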

I don't understand why some programs try to deallocate MACH_PORT_NULL or
even -1.

With regard to the failure of mach_port_request_notification, I guess
it's because the port of that name has already died. But I fail to
understand why there are more failures than successes when the program
tries to cancel the previous notification request. As far as I can see,
only 'console' tries to cancel the notification request.
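
To separate these cases in the log, the port names can be sorted into
the two special values and everything else. In the Mach headers,
MACH_PORT_NULL is 0 and MACH_PORT_DEAD is all ones, which prints as -1
when treated as a signed integer, so the -1 names may simply be dead
names. The sketch below assumes 32-bit port names; the function name is
hypothetical.

```python
MACH_PORT_NULL = 0
MACH_PORT_DEAD = 0xFFFFFFFF  # ~0 as a 32-bit name (assumption: 32-bit names)

def classify_port_name(name):
    """Classify a logged port name as "null", "dead" or "other"."""
    name &= 0xFFFFFFFF  # normalise a signed -1 from the log to its unsigned form
    if name == MACH_PORT_NULL:
        return "null"
    if name == MACH_PORT_DEAD:
        return "dead"
    return "other"  # an ordinary-looking name; it may still be invalid in the task

for logged in (0, -1, 47):
    print(logged, classify_port_name(logged))
```

A name in the "other" class that still fails to deallocate is the case
worth tracing further, since the first two classes are never valid port
rights.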

That's all I have found for now.

Zheng Da
