hello Cong, thanks for looking at this!

On Mon, 2021-04-19 at 11:46 -0700, Cong Wang wrote:
> On Mon, Apr 19, 2021 at 8:24 AM Davide Caratti <dcara...@redhat.com> wrote:
> > diff --git a/net/sched/sch_frag.c b/net/sched/sch_frag.c
> > index e1e77d3fb6c0..8c06381391d6 100644
> > --- a/net/sched/sch_frag.c
> > +++ b/net/sched/sch_frag.c
> > @@ -90,16 +90,16 @@ static int sch_fragment(struct net *net, struct sk_buff *skb,
> >         }
> >
> >         if (skb_protocol(skb, true) == htons(ETH_P_IP)) {
> > -               struct dst_entry sch_frag_dst;
> > +               struct rtable sch_frag_rt = { 0 };
>
> Is setting these fields 0 sufficient here? Because normally a struct table
> is initialized by rt_dst_alloc() which sets several of them to non-zero,
> notably, rt->rt_type and rt->rt_uncached.
>
> Similar for the IPv6 part, which is initialized by rt6_info_init().

for what we do now in openvswitch and sch_frag, that should be sufficient:
a similar thing is done by br_netfilter [1], apparently for the same
"refragmentation" purposes. On a fedora host (running 5.10, but it
shouldn't be much different from current Linux), I just dumped
'fake_rtable' from a bridge device:

 # ip link add name test-br0 type bridge
 # ip link set dev test-br0 up
 # ip link add name test-port0 type dummy
 # ip link set dev test-port0 master test-br0 up
 # crash
 [...]
 crash> net
    NET_DEVICE     NAME   IP ADDRESS(ES)
 [...]
 ffff89fb44ed8000  test-br0
 ffff89fbfc45c000  test-port0
 crash> p sizeof(struct net_device)
 $12 = 3200
 crash> p ((struct net_bridge*)(0xffff89fb44ed8000 + 3200))->fake_rtable
 $13 = {
   dst = {
     dev = 0xffff89fb44ed8000,
     ops = 0xffffffffc0afef40,
     _metrics = 18446744072647256257,
     expires = 0,
     xfrm = 0x0,
     input = 0x0,
     output = 0x0,
     flags = 18,    <-- that should be DST_NOXFRM | DST_FAKE_RTABLE
     obsolete = 0,
     header_len = 0,
     trailer_len = 0,
     __refcnt = {
       counter = 1
     },
     __use = 0,
     lastuse = 0,
     lwtstate = 0x0,
     callback_head = {
       next = 0x0,
       func = 0x0
     },
     error = 0,
     __pad = 0,
     tclassid = 0
   },
   rt_genid = 0,
   rt_flags = 0,
   rt_type = 0,
   rt_is_input = 0 '\000',
   rt_uses_gateway = 0 '\000',
   rt_iif = 0,
   rt_gw_family = 0 '\000',
   {
     rt_gw4 = 0,
     rt_gw6 = {
       in6_u = {
         u6_addr8 = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
         u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0},
         u6_addr32 = {0, 0, 0, 0}
       }
     }
   },
   rt_mtu_locked = 0,
   rt_pmtu = 0,
   rt_uncached = {
     next = 0x0,
     prev = 0x0
   },
   rt_uncached_list = 0x0
 }

only the fake_rtable.dst members are set to something, the remaining ones
are all zeroed.

-- 
davide

[1] https://elixir.bootlin.com/linux/latest/source/net/bridge/br_nf_core.c#L62