I always thought the best solution for a university system is to implement the fair share scheduler. Thus people can use any resource they want on an idle machine, but a saturated machine splits its load based on rules.
I had this experience when I managed a Cray (Y-MP EL only) and it worked perfectly. It did not limit processes per user, but it could; similarly RAM use and network bandwidth, though this only works per machine.

Personally I like erik's idea, though I would make it a boot option. I would also keep malloc() returning nil on failure, but add emalloc() to libc, and finally have an environment variable that makes emalloc() call sysfatal() or abort(), for debugging.

Steve

> On 28 Jan 2015, at 06:50, arisawa <aris...@ar.aichi-u.ac.jp> wrote:
>
> Hello,
>
>> nonetheless, i have experience running multi-user plan 9 systems, and users
>> were not usually the issue.
>
> Eric’s users are all gentleman, all careful people and all skillful
> programmers.
> If your system is served for university students, you will have different
> thought.
>
>> i think you've turned a problem with bounded recovery time into a
>> situation where the recovery code itself will inadvertently dos attack its
>> users.
>
> in case that a process failed in getting resource such as memory or process,
> what it should do is very limited: puts out some message and exits.
> this is right behavior.
> I have never seen programs that retry malloc() or fork() until succeed.
> if all processes retry them, the system will get down.
> this is what I have observed in current plan9 kernel.
>
> if any one has cleaner solution, i.e., a solution that never kill innocent
> process,
> I want to see it.
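
For illustration, the emalloc() suggestion above could look roughly like the sketch below. It is written in portable C rather than Plan 9 C so it can be tried anywhere: the environment variable name EMALLOC_ABORT is made up for this example, and fprintf() plus exit() stand in for Plan 9's sysfatal(). It is only one possible shape for the idea, not what any libc actually ships.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * emalloc: like malloc(), but never returns NULL to the caller.
 * On failure it prints a message and terminates the process.
 *
 * Assumptions for this sketch (not from any real libc):
 *  - EMALLOC_ABORT is a hypothetical environment variable; when it
 *    is set, failure calls abort() so a debugger or core dump can
 *    show where the allocation failed.
 *  - fprintf() + exit(1) stand in for Plan 9's sysfatal().
 */
void *
emalloc(size_t n)
{
	void *p;

	if(n == 0)
		n = 1;	/* malloc(0) may legitimately return NULL */
	p = malloc(n);
	if(p == NULL){
		fprintf(stderr, "emalloc: out of memory allocating %zu bytes\n", n);
		if(getenv("EMALLOC_ABORT") != NULL)
			abort();	/* debugging mode */
		exit(1);	/* normal mode: message, then exit */
	}
	return p;
}
```

With this split, code that can recover from an allocation failure keeps calling malloc() and checking the result, while code whose only sensible reaction is "print a message and exit" calls emalloc() and skips the check.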