On 24/07/2019 22:49, Edward Cree wrote: > One thing that's causing me some uncertainty: busy_poll_stop() does a > napi->poll(), which can potentially gro_normal_one() something. But > when I tried to put a gro_normal_list() just after that, I ran into > list corruption because it could race against the one in > napi_complete_done(). Turns out that the bh_disables are a red herring, we're racing against a napi poll on a different CPU. I *think* that the sequence of events is - enter busy_poll_stop - enter napi->poll - napi->poll calls napi_complete_done(), which schedules a new napi - that new napi runs, on another CPU - meanwhile, busy_poll_stop returns from napi->poll; if it then does a gro_normal_list it can race with the one on the other CPU from the new napi-poll. Which means that... > Questions that arise from that: > 1) Is it safe to potentially be adding to the rx_list (gro_normal_one(), > which in theory can end up calling gro_normal_list() as well) within > busy_poll_stop()? I haven't ever seen a splat from that, but it seems > every bit as possible as what I have been seeing. ... this isn't a problem, because napi->poll() will do all of its gro_normal_one()s before it calls napi_complete_done().
Just gathering some final performance numbers, then I'll post the updated patches (hopefully) later tonight. -Ed