On 11/28, Robin Holt wrote:
> 
> We have a customer machine with 4096 cpus.  When some user applications
> crash, it begins dumping core and can tie up the filesystem and
> processors for a considerable period of time.  Often, they contact the
> user and the user says the core dump files will not be useful and they
> reboot the machine.  They have already reduced the default core dump size
> to not dump anything and taken all reasonable steps to limiting core dumps
> while still allowing them to be useful for those users that need them.
> They would like to not need to reboot.
> 
> They hoped for a couple changes, one of which is a way for a SIGTERM,
> SIGKILL, or something along that line interrupting the core dump process.
> Is this the correct direction to take?  Are there any better ideas for
> handling this?

Well, I don't know what the right solution would be, but perhaps we can do
something like the patch below. It lets kill -9 abort the coredump:
dump_write() fails once SIGKILL is pending.

Oleg.

--- fs/binfmt_elf.c~    2007-10-25 16:22:10.000000000 +0400
+++ fs/binfmt_elf.c     2007-11-29 14:47:43.000000000 +0300
@@ -1178,6 +1178,9 @@ out:
  */
 static int dump_write(struct file *file, const void *addr, int nr)
 {
+       if (sigismember(&current->signal->shared_pending.signal, SIGKILL))
+               return 0;
+
        return file->f_op->write(file, addr, nr, &file->f_pos) == nr;
 }
 
