On Fri, Dec 04, 2015 at 09:45:20AM -0800, Eric Dumazet wrote:
> On Fri, 2015-12-04 at 18:01 +0100, Phil Sutter wrote:
> > On Fri, Dec 04, 2015 at 10:39:56PM +0800, Herbert Xu wrote:
> > > On Thu, Dec 03, 2015 at 08:08:39AM -0800, Eric Dumazet wrote:
> > > >
> > > > Anyway, __vmalloc() can be used with GFP_ATOMIC, have you tried this?
> > >
> > > OK, I've tried it and I no longer get any ENOMEM errors!
> >
> > I can't confirm this, sadly. Using 50 threads, results seem to be
> > stable and good. But increasing the number of threads, I can provoke
> > the ENOMEM condition again. See the attached log, which shows a
> > failing test run with 100 threads.
> >
> > I tried to extract logs of a test run with as few failing threads as
> > possible, but wasn't successful. It seems like the error amplifies
> > itself: while I had stable success with fewer than 70 threads, once I
> > went beyond some margin I could not identify exactly, many more
> > threads failed than expected. For instance, the attached log shows 70
> > out of 100 threads failing, while for me every single test with 50
> > threads was successful.
>
> But this patch is about GFP_ATOMIC allocations, I doubt your test is
> using GFP_ATOMIC.
>
> Threads (process context) should use GFP_KERNEL allocations.
Well, I assumed Herbert did his tests using test_rhashtable, and
therefore fixed whatever code path triggers the failure. Maybe I'm
wrong, though.

Looking at the vmalloc allocation failure trace, it seems like it is
indeed trying to use GFP_ATOMIC from inside those threads: if I'm not
missing anything, bucket_table_alloc is called from
rhashtable_insert_rehash, which passes GFP_ATOMIC unconditionally. But
then again, bucket_table_alloc should use kzalloc if 'gfp !=
GFP_KERNEL', so I'm probably just cross-eyed right now. (I've pasted a
rough sketch of how I read that gating below my signature.)

> BTW, if 100 threads are simultaneously trying to vmalloc(32 MB), this
> might not be very wise :(
>
> Only one should really do this, while others are waiting.

Sure, that was my previous understanding of how this thing works.

> If we really want parallelism (multiple cpus coordinating their
> effort), it should be done very differently.

Maybe my approach of stress-testing rhashtable was too naive in the
first place.

Thanks, Phil
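
P.S.: For reference, this is roughly how I read the allocation gating in
bucket_table_alloc() -- paraphrased from memory and simplified, so don't
take it as the exact upstream code:

	static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
						       size_t nbuckets, gfp_t gfp)
	{
		struct bucket_table *tbl = NULL;
		size_t size = sizeof(*tbl) + nbuckets * sizeof(tbl->buckets[0]);

		/* Small tables, and any caller that is not GFP_KERNEL
		 * (e.g. the GFP_ATOMIC path from rhashtable_insert_rehash),
		 * should end up in kzalloc() ...
		 */
		if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER) ||
		    gfp != GFP_KERNEL)
			tbl = kzalloc(size, gfp | __GFP_NOWARN | __GFP_NORETRY);

		/* ... and only GFP_KERNEL callers fall back to vmalloc space
		 * when the linear allocation fails.
		 */
		if (tbl == NULL && gfp == GFP_KERNEL)
			tbl = vzalloc(size);

		return tbl;	/* remaining initialisation omitted */
	}

So if the failure trace really shows vmalloc being entered with
GFP_ATOMIC, either my reading above is wrong or the patch under
discussion changes this fallback.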
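
P.P.S.: And just to make sure I understand the "only one should do this"
point, I assume you mean something along these lines? Purely a
hypothetical sketch, not a proposal for rhashtable itself -- my_table,
alloc_table, rehash_into, grow_watermark and the doubling policy are all
made up for illustration. The inserters never allocate the big table
themselves; they only kick a single worker that performs the GFP_KERNEL
allocation once:

	/* hypothetical container, for illustration only */
	struct my_table {
		struct work_struct	grow_work;
		struct bucket_table	*tbl;
		size_t			nbuckets;
		unsigned int		nelems;
		unsigned int		grow_watermark;
	};

	static void grow_worker(struct work_struct *work)
	{
		struct my_table *ht = container_of(work, struct my_table, grow_work);
		struct bucket_table *new_tbl;

		/* The single worker does the (possibly 32 MB) GFP_KERNEL
		 * allocation; concurrent inserters never vmalloc anything.
		 */
		new_tbl = alloc_table(ht->nbuckets * 2, GFP_KERNEL);	/* hypothetical */
		if (!new_tbl)
			return;				/* retry on the next trigger */

		rehash_into(ht, new_tbl);		/* hypothetical helper */
	}

	static void maybe_grow(struct my_table *ht)
	{
		/* Cheap trigger usable from the atomic insert path; a given
		 * work item is not re-queued while pending and does not run
		 * concurrently with itself, so at most one grow_worker() is
		 * active while everyone else keeps using the old table.
		 */
		if (ht->nelems > ht->grow_watermark)
			schedule_work(&ht->grow_work);
	}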