On Mon, Jan 14, 2019 at 7:30 PM Eric W. Biederman <ebied...@xmission.com> wrote:
>
> zzoru <zzoru...@gmail.com> writes:
>
> > I think that it is exactly same to:
> > https://groups.google.com/forum/#!searchin/linux.kernel/cleanup_net$20is$20slow%7Csort:date/linux.kernel/IMJ9OzonDSI/QH86oy1PAQAJ
> > Already, patch was maded, but maybe he forgot to push it.
>
> That patch was made to address speed, and lifetime of network stack
> objects. At best it will make things go faster (a good thing), and
> reduce the memory consumption during a test (another good thing).
> The patch you point to will not correct your memory corruption.
>
> So right now the best hypothesis seems to be Dmitriy's idea that
> there is stack overflow causing corruption. You have a lot of stack
> debugging already enabled but I don't see CONFIG_VMAP_STACK enabled
> which might catch something ordinary stack overflow checking won't.
>
> Any chance you can enable CONFIG_VMAP_STACK and see if it is stack
> overflow?
>
> With a little luck you will catch the stack overflow in the act and we
> can see the problematic code path.

Most likely the stack overflow would be detectable with CONFIG_VMAP_STACK.
But CONFIG_VMAP_STACK is incompatible with KASAN:
https://bugzilla.kernel.org/show_bug.cgi?id=202009

I reproduced the other stack overflow without KASAN and without
CONFIG_VMAP_STACK, and it was detected as "corrupted stack end detected
inside scheduler". We can try the same here. But running without KASAN and
with CONFIG_VMAP_STACK should be more reliable.

But the way I read it, if we see wb_workfn in the stacks, kernel memory is
corrupted. The overflow on that async stack does not depend on how exactly
the low-memory condition was provoked.
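
For anyone repeating the experiment, here is a rough sketch of a check for the config combination discussed above (KASAN off, CONFIG_VMAP_STACK on). The function names are made up for illustration. CONFIG_SCHED_STACK_END_CHECK is not named in this thread; I am assuming it is the option behind the "corrupted stack end detected inside scheduler" report.

```python
def parse_kconfig(text):
    """Return {symbol: value} for every 'CONFIG_X=value' line of a .config."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Disabled options appear as '# CONFIG_X is not set' and are skipped.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        options[name] = value
    return options


def stack_debug_report(text):
    """List what stands between this .config and the setup suggested above."""
    opts = parse_kconfig(text)
    notes = []
    if opts.get("CONFIG_KASAN") == "y":
        # Per the bugzilla entry above, KASAN and VMAP_STACK do not combine.
        notes.append("CONFIG_KASAN=y: disable it to use CONFIG_VMAP_STACK")
    if opts.get("CONFIG_VMAP_STACK") != "y":
        notes.append("CONFIG_VMAP_STACK is off")
    # Assumption: this option produces the scheduler stack-end report.
    if opts.get("CONFIG_SCHED_STACK_END_CHECK") != "y":
        notes.append("CONFIG_SCHED_STACK_END_CHECK is off")
    return notes
```

Calling stack_debug_report(open(".config").read()) returns an empty list when the config already matches the suggested setup.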