On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
> On Tue, 2016-01-05 at 12:07 +0100, Jacob Siverskog wrote:
>> On Mon, Jan 4, 2016 at 4:25 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
>> > On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
>> >> On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang <xiyou.wangc...@gmail.com>
>> >> wrote:
>> >> > On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
>> >> > <jacob@teenage.engineering> wrote:
>> >> >> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet <eduma...@google.com>
>> >> >> wrote:
>> >> >>> How often can you trigger this bug ?
>> >> >>
>> >> >> Ok. I don't have a good repro to trigger it unfortunately, I've seen
>> >> >> it just a few times when bringing up/down network interfaces. Does
>> >> >> the trace give any clue?
>> >> >>
>> >> >
>> >> > A little bit. You need to help people to narrow down the problem
>> >> > because there are too many places using skb->next and skb->prev.
>> >> >
>> >> > Since you mentioned it seems related to network interface flip,
>> >> > what network interfaces are you using? What's is your TC setup?
>> >> >
>> >> > Thanks.
>> >>
>> >> The system contains only one physical network interface (TI WL1837,
>> >> wl18xx module).
>> >> The state prior to the crash was as follows:
>> >> - One virtual network interface active (as STA, associated with access
>> >>   point)
>> >> - Bluetooth (BLE only) active (same physical chip, co-existence,
>> >>   btwilink/st_drv modules)
>> >>
>> >> Actions made around the time of the crash:
>> >> - Bluetooth disabled
>> >> - One additional virtual network interface brought up (also as STA)
>> >>
>> >> I believe the crash occurred between these two actions. I just saw
>> >> that there are some interesting events in the log prior to the crash:
>> >> kernel: Bluetooth: Unable to push skb to HCI core(-6)
>> >> kernel: (stc): proto stack 4's ->recv failed
>> >> kernel: (stc): remove_channel_from_table: id 3
>> >> kernel: (stc): remove_channel_from_table: id 2
>> >> kernel: (stc): remove_channel_from_table: id 4
>> >> kernel: (stc): all chnl_ids unregistered
>> >> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
>> >>
>> >> The first print is from btwilink.c. However, I can't see the
>> >> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
>> >> 6LoWPAN or anything similar).
>> >>
>> >> Thanks, Jacob
>> >
>> > Definitely these details are useful ;)
>> >
>> > Could you try :
>> >
>> > diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
>> > index 6e3af8b42cdd..0c99a74fb895 100644
>> > --- a/drivers/misc/ti-st/st_core.c
>> > +++ b/drivers/misc/ti-st/st_core.c
>> > @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
>> >  	skb_queue_purge(&st_gdata->txq);
>> >  	skb_queue_purge(&st_gdata->tx_waitq);
>> >  	kfree_skb(st_gdata->rx_skb);
>> > +	st_gdata->rx_skb = NULL;
>> >  	kfree_skb(st_gdata->tx_skb);
>> > +	st_gdata->tx_skb = NULL;
>> >  	/* TTY ldisc cleanup */
>> >  	err = tty_unregister_ldisc(N_TI_WL);
>> >  	if (err)
>> >
>> >
>>
>> Sure. Since I don't have a good way to trigger the initial issue, I
>> can't really know if there is a difference with your patch. However,
>> normal usage seems to work as expected with your patch. I've tried to
>> reproduce the initial issue with and without your patch repeatedly for
>> hours and have not seen any crash in any of the runs so far.
>> --
>
> You might build a kernel with KASAN support to get maybe more chances to
> trigger the bug.
>
> ( https://www.kernel.org/doc/Documentation/kasan.txt )
>
Ah. Doesn't seem to be supported on arm(32) unfortunately.
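
As a side note on the patch quoted above, here is a minimal userspace sketch of what the two added lines change. It assumes the concern is that the cached skb pointers could be freed or dereferenced a second time after st_core_exit() has run; this thread does not confirm that as the root cause. struct demo_state and demo_exit() are hypothetical stand-ins (not kernel code), and free() plays the role of kfree_skb(), which likewise does nothing when handed NULL.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct st_data_s: only the two cached
 * buffers matter for this illustration. */
struct demo_state {
	char *rx_buf;
	char *tx_buf;
};

/* Teardown that mirrors the patched st_core_exit(): free each buffer,
 * then clear the pointer so no stale address is left behind. */
static void demo_exit(struct demo_state *st)
{
	free(st->rx_buf);
	st->rx_buf = NULL;	/* a later free(NULL) is a no-op */
	free(st->tx_buf);
	st->tx_buf = NULL;
}
```

With the pointers cleared, calling demo_exit() twice on the same state is harmless, and any later code that checks the pointer before use sees NULL instead of a dangling address. Without the two assignments, the second call would free the same memory again.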