On Fri, Oct 13, 2017 at 3:09 PM, Alexei Starovoitov <alexei.starovoi...@gmail.com> wrote: > On Fri, Oct 13, 2017 at 01:50:44PM -0700, Brendan Gregg wrote: >> On Fri, Oct 13, 2017 at 1:03 PM, Cong Wang <xiyou.wangc...@gmail.com> wrote: >> > We need a real-time notification for tcp retransmission >> > for monitoring. >> > >> > Of course we could use ftrace to dynamically instrument this >> > kernel function too, however we can't retrieve the connection >> > information at the same time, for example perf-tools [1] reads >> > /proc/net/tcp for socket details, which is slow when we have >> > a lots of connections. >> > >> > Therefore, this patch adds a tracepoint for __tcp_retransmit_skb() >> > and exposes src/dst IP addresses and ports of the connection. >> > This also makes it easier to integrate into perf. >> > >> > Note, I expose both IPv4 and IPv6 addresses at the same time: >> > for a IPv4 socket, v4 mapped address is used as IPv6 addresses, >> > for a IPv6 socket, LOOPBACK4_IPV6 is already filled by kernel. >> > Also, add sk and skb pointers as they are useful for BPF. >> >> Thanks, a TCP retransmit tracepoint would be great. (tcp_set_state >> would be highly useful too, which Alexei already has in his list). >> >> Should skp->__sk_common.skc_state be included in the format string, so >> we don't have to always dig it out of the skaddr? For retransmits I >> always want to know the TCP state, to determine if it is ESTABLISHED >> (packet drop) or SYN_SENT (backlog full) or something else. > > let's not expose internal socket fields into tp fields. > Few people still believe that tp fields are abi, so to be safe > no such fields should be exposed. > It's trivial enough to read sk_state from bpf program > with bpf_probe_read().
Ah, right, the number mapping for TCP_ESTABLISHED and friends is a Linux implementation detail, and not from the RFCs. Ok, I can dig it from the skp instead. > >> We probably need a tracepoint for tcp_send_loss_probe() (TLP) as well, >> for tracing at the same time as retransmits (like my tools do), but >> that can be added later. > > hmm. why? > This single tracepoint will cover both cases of retransmits. I don't think tcp_send_loss_probe() TLP goes through __tcp_retransmit_skb(): look at the path to bumping LINUX_MIB_TCPLOSSPROBES. I was thinking that later on we might want to add a tcp:tcp_send_tlp tracepoint, in addition to this tcp:tcp_retransmit_skb tracepoint, for investigating the same kind of issues: packet loss. This existing tcp:tcp_retransmit_skb tracepoint patch is ok. Acked-by: Brendan Gregg <bgr...@netflix.com> (with or without %pI6c) Brendan