Hi I find it caused by below steps: 1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1 2. set tp_block_nr to 0 Then pg_vec was freed, and we did not delete the timer?
Best Regards, liujian > -----Original Message----- > From: liujian (CE) > Sent: Sunday, July 23, 2017 5:47 PM > To: liujian (CE); 'Cong Wang'; Dingtianhong > Cc: 'Willem de Bruijn'; 'Dave Jones'; 'alexander.le...@verizon.com'; > 'da...@davemloft.net'; 'eduma...@google.com'; 'will...@google.com'; > 'dan...@iogearbox.net'; 'netdev@vger.kernel.org'; > 'linux-ker...@vger.kernel.org' > Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired > > Hi, > > Do we need delete the v3 ring, when tp_version changed from TPACKET_V3 to > TPACKET_V1 ? > > > Best Regards, > liujian > > > > -----Original Message----- > > From: liujian (CE) > > Sent: Sunday, July 23, 2017 4:21 PM > > To: 'Cong Wang'; Dingtianhong > > Cc: Willem de Bruijn; Dave Jones; alexander.le...@verizon.com; > > da...@davemloft.net; eduma...@google.com; will...@google.com; > > dan...@iogearbox.net; netdev@vger.kernel.org; > > linux-ker...@vger.kernel.org > > Subject: RE: af_packet: use after free in > > prb_retire_rx_blk_timer_expired > > > > Hi Wang Cong, > > > > With this patch , the system was crashed when setsockopt. > > > > The call trace as below: > > > > crash> bt > > PID: 3069 TASK: ffff8800afcc0000 CPU: 0 COMMAND: "trinity-main" > > #0 [ffff8801bec03ce0] machine_kexec at ffffffff8105354b > > #1 [ffff8801bec03d40] crash_kexec at ffffffff810f7e82 > > #2 [ffff8801bec03e10] panic at ffffffff81650058 > > #3 [ffff8801bec03e90] watchdog_timer_fn at ffffffff81122533 > > #4 [ffff8801bec03ec8] __hrtimer_run_queues at ffffffff810abeb2 > > #5 [ffff8801bec03f20] hrtimer_interrupt at ffffffff810ac450 > > #6 [ffff8801bec03f70] local_apic_timer_interrupt at ffffffff8104a457 > > #7 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aed0 > > #8 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff8166931d > > --- <IRQ stack> --- > > #9 [ffff8801b301fcb8] apic_timer_interrupt at ffffffff8166931d > > [exception RIP: lock_timer_base+77] > > RIP: ffffffff8108dced RSP: ffff8801b301fd60 RFLAGS: 00000246 > > RAX: 0000000000000000 RBX: ffff8800afcc0000 RCX: > > 0000000000000001 > > RDX: ffff8800afcc0000 RSI: ffff8801b301fd90 RDI: ffff8800b0d853c8 > > RBP: ffff8801b301fd80 R8: ffff8800afcc0000 R9: > ffffea0002680000 > > R10: 000000000000003c R11: ffff8801b301fb2e R12: > ffff8800afcc0000 > > R13: ffff8800afcc0000 R14: 0000000000000000 R15: > ffffffff83d1a340 > > ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018 > > #10 [ffff8801b301fd88] try_to_del_timer_sync at ffffffff8108f19f > > #11 [ffff8801b301fdb8] del_timer_sync at ffffffff8108f252 > > #12 [ffff8801b301fdd0] packet_set_ring at ffffffff81635e60 > > #13 [ffff8801b301fe98] packet_setsockopt at ffffffff81636760 > > #14 [ffff8801b301ff38] sys_setsockopt at ffffffff81531860 > > #15 [ffff8801b301ff80] tracesys at ffffffff816687ed (via system_call) > > RIP: 00007fcc78b03e3a RSP: 00007fff16f246b8 RFLAGS: 00000202 > > RAX: ffffffffffffffda RBX: ffffffff816687ed RCX: ffffffffffffffff > > RDX: 0000000000000005 RSI: 0000000000000107 RDI: > > 0000000000000180 > > RBP: 0000000000000180 R8: 000000000000001c R9: > > 00007fcc78dc7160 > > R10: 0000000001fd6ba0 R11: 0000000000000202 R12: > > 0000000000000000 > > R13: 0000000000000011 R14: 0000000001fd6b60 R15: > > 0000000001fd6b70 > > ORIG_RAX: 0000000000000036 CS: 0033 SS: 002b > > > > > > Best Regards, > > liujian > > > > > > > -----Original Message----- > > > From: Cong Wang [mailto:xiyou.wangc...@gmail.com] > > > Sent: Sunday, July 23, 2017 1:59 PM > > > To: Dingtianhong > > > Cc: liujian (CE); Willem de Bruijn; Dave Jones; > > > alexander.le...@verizon.com; da...@davemloft.net; > > eduma...@google.com; > > > will...@google.com; dan...@iogearbox.net; netdev@vger.kernel.org; > > > linux-ker...@vger.kernel.org > > > Subject: Re: af_packet: use after free in > > > prb_retire_rx_blk_timer_expired > > > > > > On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong > > > <dingtianh...@huawei.com> > > > wrote: > > > > Hi, Cong: > > > > > > > > Thanks for your quirk solution, but I still has some doubts about > > > > it, it looks like fix the problem in the > > > > packet_setsockopt->packet_set_ring processing, but when in > > > > packet_release processing, it may could not release the real > > > > pg_vec for the TPACKET_V3 ring, and then cause the mem leak, maybe > > > > I miss something here, nice to hear from your feedback. :) > > > > > > Yes you miss that packet_release() has memset()'s so we won't hit > > > that path. :) > > > > > > However, I missed the swap() in this messy function, actually I > > > believe the bug is that we modify tpacket_kbdq_core inside rx_ring > > > in non-closing case without actually stopping its timer. I feel more > > > confident > > with the following patch: > > > > > > > > > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index > > > 008bb34ee324..267b181fef15 100644 > > > --- a/net/packet/af_packet.c > > > +++ b/net/packet/af_packet.c > > > @@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk, > > > union tpacket_req_u *req_u, > > > case TPACKET_V3: > > > /* Block transmit is not supported yet */ > > > if (!tx_ring) { > > > + prb_shutdown_retire_blk_timer(po, > > > + rb_queue); > > > init_prb_bdqc(po, rb, pg_vec, > req_u); > > > } else { > > > struct tpacket_req3 *req3 = > > > &req_u->req3;