From: Kolmakov Dmitriy <kolmakov.dmit...@huawei.com>
Date: Mon, 7 Sep 2015 09:05:48 +0000

> If an attempt is made to wake up users of the broadcast link when there
> is not enough space in the send queue, then tipc_sk_rcv() may hang,
> since its loop breaks only after the wakeup queue becomes empty. This
> can lead to a complete CPU stall with the following message generated
> by RCU:
> ...
> The issue occurs only when tipc_sk_rcv() is used to wake up postponed
> senders:
> ...
> After the sender thread is woken up, it can gain control and attempt
> to send a message. But if there is not enough space in the send queue,
> it will call link_schedule_user(), which puts a message of type
> SOCK_WAKEUP on the wakeup queue and puts the sender to sleep. Thus the
> size of the queue does not actually change and the while() loop never
> exits.
>
> The approach I proposed is to wake up only senders for which there is
> enough space in the send queue, so the described issue can't occur.
> Moreover, the same approach is already used to wake up senders on
> unicast links.
>
> I hit the issue in our product code, but to reproduce it I changed a
> benchmark test application (from tipcutils/demos/benchmark) to perform
> the following scenario:
>   1. Run 64 instances of the test application (nodes). This can be
>      done on one physical machine.
>   2. Each application connects to all the others using TIPC sockets
>      in RDM mode.
>   3. When setup is done, all nodes simultaneously start sending
>      broadcast messages.
>   4. Everything hangs.
>
> The issue is reproducible only when congestion occurs on the broadcast
> link. For example, with only 8 nodes it works fine since congestion
> doesn't occur. The send queue limit is 40 in my case (I use the
> critical importance level), and when 64 nodes send a message at the
> same moment, congestion occurs every time.
>
> Signed-off-by: Dmitry S Kolmakov <kolmakov.dmit...@huawei.com>
> Reviewed-by: Jon Maloy <jon.ma...@ericsson.com>
> Acked-by: Ying Xue <ying....@windriver.com>
> ---
> v2: Updated after comments from Jon and Ying.

Applied, thanks.
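
For anyone following the quoted description, the hang and the proposed
approach can be sketched with a small user-space C model. This is only
an illustration under stated assumptions, not the TIPC code or the patch
itself: struct model_link, sender_retry(), wake_all_naive(),
wake_fitting() and the max_iter cap are all hypothetical names invented
here, and the model assumes a woken sender retries its send immediately.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 64

/* Hypothetical model of a link; none of these names are kernel symbols. */
struct model_link {
    int sendq_len;             /* messages currently in the send queue */
    int sendq_limit;           /* congestion limit of the send queue */
    int waiters[MAX_WAITERS];  /* FIFO of chain sizes, one per sleeping sender */
    size_t nwaiters;           /* number of entries in the wakeup queue */
};

/* Remove and return the chain size at the head of the wakeup queue. */
static int pop_front(struct model_link *l)
{
    int sz = l->waiters[0];

    for (size_t i = 1; i < l->nwaiters; i++)
        l->waiters[i - 1] = l->waiters[i];
    l->nwaiters--;
    return sz;
}

/*
 * A woken sender retries its send. If the send queue is still congested
 * it re-queues itself (the role link_schedule_user() plays in the
 * description above) and goes back to sleep. Returns 1 if the send
 * succeeded, 0 if the sender was put back on the wakeup queue.
 */
static int sender_retry(struct model_link *l, int chain_sz)
{
    if (l->sendq_len + chain_sz > l->sendq_limit) {
        assert(l->nwaiters < MAX_WAITERS);
        l->waiters[l->nwaiters++] = chain_sz;
        return 0;
    }
    l->sendq_len += chain_sz;
    return 1;
}

/*
 * Naive wakeup: keep waking until the wakeup queue is empty. Under
 * congestion every woken sender re-queues itself, so the queue never
 * shrinks. max_iter exists only so the model can report the stall;
 * returns the number of wakeups, or -1 if the cap was reached.
 */
int wake_all_naive(struct model_link *l, int max_iter)
{
    int iter = 0;

    while (l->nwaiters > 0) {
        if (iter++ >= max_iter)
            return -1;
        sender_retry(l, pop_front(l));
    }
    return iter;
}

/*
 * Selective wakeup: wake a sender only if its chain fits in the free
 * space, and stop at the first one that does not. Every iteration either
 * breaks or permanently removes a waiter, so the loop always terminates.
 * Returns the number of senders woken.
 */
int wake_fitting(struct model_link *l)
{
    int woken = 0;

    while (l->nwaiters > 0) {
        if (l->sendq_len + l->waiters[0] > l->sendq_limit)
            break;
        sender_retry(l, pop_front(l));
        woken++;
    }
    return woken;
}
```

With a limit of 40, a send queue already holding 40 messages and three
waiters, wake_all_naive() in this model runs into its iteration cap
while the wakeup queue still holds three entries. wake_fitting() on the
same state wakes nobody and returns at once; with two free slots it
wakes two senders and leaves the third queued.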