We are frequently seeing a crash in the TCP ACK codepath in our regression
racks on an ARM64 device with a 4.19-based kernel.
It appears that tp->highest_sack is invalid when it is accessed on receipt
of a FIN-ACK. In all instances of the crash, the TCP socket is in the
TCP_FIN_WAIT1 state.
[include/net/tcp.h]
static inline u32 tcp_highest_sack_seq(struct tcp_sock *tp)
{
	if (!tp->sacked_out)
		return tp->snd_una;

	if (tp->highest_sack == NULL)
		return tp->snd_nxt;

	return TCP_SKB_CB(tp->highest_sack)->seq;
}
[net/ipv4/tcp_input.c]
static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
{
	...
	prior_fack = tcp_is_sack(tp) ? tcp_highest_sack_seq(tp) : tp->snd_una;
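
For reference, our reading of the selection logic in tcp_highest_sack_seq()
can be modelled in userspace as below. This is only a sketch, not kernel
code: the names model_skb, model_tcp_sock and model_highest_sack_seq are
made up for illustration, and the structs hold only the fields that the
helper reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb: only its starting sequence number. */
struct model_skb {
	uint32_t seq;
};

/* Hypothetical stand-in for struct tcp_sock: only the fields used here. */
struct model_tcp_sock {
	uint32_t snd_una;               /* first unacknowledged sequence */
	uint32_t snd_nxt;               /* next sequence to be sent */
	uint32_t sacked_out;            /* count of SACKed segments */
	struct model_skb *highest_sack; /* skb with the highest SACKed seq */
};

/* Mirrors the three branches of the helper quoted above. */
static uint32_t model_highest_sack_seq(const struct model_tcp_sock *tp)
{
	assert(tp != NULL);

	if (!tp->sacked_out)
		return tp->snd_una;

	if (tp->highest_sack == NULL)
		return tp->snd_nxt;

	/* Reached only when sacked_out != 0 and the pointer is non-NULL;
	 * the pointer is dereferenced without any further validation. */
	return tp->highest_sack->seq;
}
```

The only pointer dereference is in the last branch. If sacked_out were
non-zero while highest_sack held a stale, non-NULL pointer, that branch
would read through it, which would be consistent with the paging fault we
see. We have not confirmed that this is what actually happens.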
Crash call stack below:
16496.596106: <6> Unable to handle kernel paging request at virtual address fffffff2cd81a368
16496.730771: <2> pc : tcp_ack+0x174/0x11e8
16496.734536: <2> lr : tcp_rcv_state_process+0x318/0x1300
16497.183109: <2> Call trace:
16497.183114: <2> tcp_ack+0x174/0x11e8
16497.183115: <2> tcp_rcv_state_process+0x318/0x1300
16497.183117: <2> tcp_v4_do_rcv+0x1a8/0x1f0
16497.183118: <2> tcp_v4_rcv+0xe90/0xec8
16497.183120: <2> ip_protocol_deliver_rcu+0x150/0x298
16497.183121: <2> ip_local_deliver+0x21c/0x2a8
16497.183122: <2> ip_rcv+0x1c4/0x210
16497.183124: <2> __netif_receive_skb_core+0xab0/0xd90
16497.183125: <2> netif_receive_skb_internal+0x12c/0x368
16497.183126: <2> napi_gro_receive+0x1e0/0x290
Is it expected for tp->highest_sack to be accessed in this state?
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project