On 11/28, Robin Holt wrote:
>
> We have a customer machine with 4096 cpus. When some user applications
> crash, it begins dumping core and can tie up the filesystem and
> processors for a considerable period of time. Often, they contact the
> user and the user says the core dump files will not be useful and they
> reboot the machine. They have already reduced the default core dump size
> to not dump anything and taken all reasonable steps to limiting core dumps
> while still allowing them to be useful for those users that need them.
> They would like to not need to reboot.
>
> They hoped for a couple changes, one of which is a way for a SIGTERM,
> SIGKILL, or something along that line interrupting the core dump process.
> Is this the correct direction to take? Are there any better ideas for
> handling this?

Well, I don't know what the right solution would be, but perhaps we can do
something like the patch below. It allows the coredump to be aborted with
kill -9.

Oleg.

--- fs/binfmt_elf.c~	2007-10-25 16:22:10.000000000 +0400
+++ fs/binfmt_elf.c	2007-11-29 14:47:43.000000000 +0300
@@ -1178,6 +1178,9 @@ out:
  */
 static int dump_write(struct file *file, const void *addr, int nr)
 {
+	if (sigismember(&current->signal->shared_pending.signal, SIGKILL))
+		return 0;
+
 	return file->f_op->write(file, addr, nr, &file->f_pos) == nr;
 }
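
For illustration only, here is a rough userspace analogue of the same idea:
check for a pending signal before each write and report failure if one is
found, so the caller stops dumping. This is a sketch under assumptions, not
kernel code. The names signal_is_pending() and dump_write_user() are
hypothetical. Because userspace cannot block SIGKILL or see it as pending,
the caller passes a signal it has blocked itself (SIGTERM, say) as a stand-in.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: 1 if `sig` is pending for the caller, else 0. */
static int signal_is_pending(int sig)
{
	sigset_t pending;

	if (sigpending(&pending) != 0)
		return 0;	/* treat a failed query as "not pending" */

	return sigismember(&pending, sig) == 1;
}

/*
 * Hypothetical userspace counterpart of dump_write().
 * Returns 1 if all `nr` bytes were written, 0 on a short write, an
 * error, or when `abort_sig` is already pending. The caller must have
 * blocked `abort_sig`, otherwise it is delivered and never seen pending.
 */
static int dump_write_user(int fd, const void *addr, size_t nr, int abort_sig)
{
	assert(addr != NULL);

	if (signal_is_pending(abort_sig))
		return 0;

	return write(fd, addr, nr) == (ssize_t)nr;
}
```

As in the patch, a return value of 0 is indistinguishable from a write
failure, so an existing caller that stops on the first failed write needs
no changes.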