On Wed, 24 Apr 2013 17:52:34 +0200 Nicolas Schichan <nschic...@freebox.fr> wrote:
> > btw, what on earth is going on with seccomp_jit_free()? It does > > disturbing undocumented typecasting and it punts the module_free into a > > kernel thread for mysterious, undocumented and possibly buggy reasons. > > > > I realize it just copies bpf_jit_free(). The same observations apply there. > > The reason for this hack for both seccomp filters and socket filters is that > {seccomp,bpf}_jit_free are called from a softirq. module_free() cannot be > called directly from softirq, as it will in turn call vfree() which will > BUG_ON() if in_interrupt() is non zero. So to call module_free(), it is > therefore required to be in a process context, which is provided by the work > struct. Well let's explain this to the next sucker who comes along. From: Andrew Morton <a...@linux-foundation.org> Subject: bpf: add comments explaining the schedule_work() operation Cc: Will Drewry <w...@chromium.org> Cc: Mircea Gherzan <mgher...@gmail.com> Cc: Nicolas Schichan <nschic...@freebox.fr> Cc: Russell King <li...@arm.linux.org.uk> Cc: "David S. Miller" <da...@davemloft.net> Cc: Daniel Borkmann <daniel.borkm...@tik.ee.ethz.ch> Signed-off-by: Andrew Morton <a...@linux-foundation.org> --- arch/arm/net/bpf_jit_32.c | 8 ++++++++ arch/powerpc/net/bpf_jit_comp.c | 4 ++++ arch/s390/net/bpf_jit_comp.c | 4 ++++ arch/sparc/net/bpf_jit_comp.c | 4 ++++ arch/x86/net/bpf_jit_comp.c | 4 ++++ 5 files changed, 24 insertions(+) diff -puN arch/arm/net/bpf_jit_32.c~a arch/arm/net/bpf_jit_32.c --- a/arch/arm/net/bpf_jit_32.c~a +++ a/arch/arm/net/bpf_jit_32.c @@ -958,6 +958,10 @@ void bpf_jit_free(struct sk_filter *fp) struct work_struct *work; if (fp->bpf_func != sk_run_filter) { + /* + * bpf_jit_free() can be called from softirq; module_free() + * requires process context. + */ work = (struct work_struct *)fp->bpf_func; INIT_WORK(work, bpf_jit_free_worker); @@ -985,6 +989,10 @@ void seccomp_jit_free(struct seccomp_fil void *bpf_func = seccomp_filter_get_bpf_func(fp); if (bpf_func != sk_run_filter) { + /* + * seccomp_jit_free() can be called from softirq; module_free() + * requires process context. + */ work = (struct work_struct *)bpf_func; INIT_WORK(work, bpf_jit_free_worker); diff -puN arch/sparc/net/bpf_jit_comp.c~a arch/sparc/net/bpf_jit_comp.c --- a/arch/sparc/net/bpf_jit_comp.c~a +++ a/arch/sparc/net/bpf_jit_comp.c @@ -817,6 +817,10 @@ static void jit_free_defer(struct work_s void bpf_jit_free(struct sk_filter *fp) { if (fp->bpf_func != sk_run_filter) { + /* + * bpf_jit_free() can be called from softirq; module_free() + * requires process context. + */ struct work_struct *work = (struct work_struct *)fp->bpf_func; INIT_WORK(work, jit_free_defer); diff -puN arch/powerpc/net/bpf_jit_comp.c~a arch/powerpc/net/bpf_jit_comp.c --- a/arch/powerpc/net/bpf_jit_comp.c~a +++ a/arch/powerpc/net/bpf_jit_comp.c @@ -699,6 +699,10 @@ static void jit_free_defer(struct work_s void bpf_jit_free(struct sk_filter *fp) { if (fp->bpf_func != sk_run_filter) { + /* + * bpf_jit_free() can be called from softirq; module_free() + * requires process context. + */ struct work_struct *work = (struct work_struct *)fp->bpf_func; INIT_WORK(work, jit_free_defer); diff -puN arch/s390/net/bpf_jit_comp.c~a arch/s390/net/bpf_jit_comp.c --- a/arch/s390/net/bpf_jit_comp.c~a +++ a/arch/s390/net/bpf_jit_comp.c @@ -818,6 +818,10 @@ void bpf_jit_free(struct sk_filter *fp) if (fp->bpf_func == sk_run_filter) return; + /* + * bpf_jit_free() can be called from softirq; module_free() requires + * process context. + */ work = (struct work_struct *)fp->bpf_func; INIT_WORK(work, jit_free_defer); schedule_work(work); diff -puN arch/x86/net/bpf_jit_comp.c~a arch/x86/net/bpf_jit_comp.c --- a/arch/x86/net/bpf_jit_comp.c~a +++ a/arch/x86/net/bpf_jit_comp.c @@ -749,6 +749,10 @@ static void jit_free_defer(struct work_s void bpf_jit_free(struct sk_filter *fp) { if (fp->bpf_func != sk_run_filter) { + /* + * bpf_jit_free() can be called from softirq; module_free() + * requires process context. + */ struct work_struct *work = (struct work_struct *)fp->bpf_func; INIT_WORK(work, jit_free_defer); diff -puN include/linux/filter.h~a include/linux/filter.h diff -puN net/core/filter.c~a net/core/filter.c _ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/