On Sat, Jul 21, 2018 at 12:22:44PM +0200, Thomas Leuxner wrote:
> * Salvatore Bonaccorso <car...@debian.org> 2018.07.19 14:38:
>
> > Can you check if the attached patch fixes the issue for you?
> >
> >
> > https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s4.2.2
> >
> > Regards,
> > Salvatore
>
> Thanks Salvatore. That regression fix did indeed fix it:
Hello,
I'm using the sit module to establish a 6in4 tunnel to connect to my
IPv6 tunnel broker.
Two days ago, I upgraded from 4.9.0-6-amd64 to 4.9.0-7-amd64 and started
experiencing random connection failures on IPv6 traffic. It looked like
the new kernel was dropping some packets received from the remote tunnel
endpoint (tcpdump saw them coming on the IPv4 interface but not on the
IPv6 tunnel interface, and they were not processed by the host nor
forwarded).
The link with the upgrade was confirmed since downgrading back to
4.9.0-6-amd64 made the issue disappear.
After further testing I made the assumption that the triggering factor
was a value of 1 for the third and fourth bits of the IPv6 header Flow
Label field, which I then successfully confirmed by generating traffic
with arbitrary Flow Label values. 100% of the ping requests matching the
condition were dropped, and 100% of those not matching the condition
were answered.
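To illustrate which Flow Label values match this condition, here is a
minimal Python sketch. It is not taken from the kernel, and the function
names are hypothetical. It computes the second byte of an IPv6 header and
the two low bits that would be read as the ECN field if that byte were
taken for an IPv4 TOS byte:

```python
def ipv6_second_byte(flow_label, traffic_class=0):
    """Byte 1 of an IPv6 header: low 4 bits of the Traffic Class
    followed by the high 4 bits of the 20-bit Flow Label."""
    return ((traffic_class & 0x0F) << 4) | ((flow_label >> 16) & 0x0F)


def bits_read_as_ecn(flow_label, traffic_class=0):
    """The two low bits of byte 1, i.e. the third and fourth bits of the
    Flow Label, which sit where IPv4 keeps its ECN field."""
    return ipv6_second_byte(flow_label, traffic_class) & 0x03


def matches_drop_condition(flow_label):
    """True when both bits are 1 (the condition I observed drops for)."""
    return bits_read_as_ecn(flow_label) == 0x03
```

For example, matches_drop_condition(0x30000) is true, while
matches_drop_condition(0xC0000) is false because only the first and
second bits of the Flow Label are set.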
I also noticed that every time a packet was dropped, a line similar to
this one appeared in dmesg:
sit: non-ECT from 74.232.98.121 with TOS=0x3
It happens that the offset of those two bits relative to the beginning
of the IPv6 header is the same as the offset of the ECN field relative
to the start of an IPv4 header. In addition, the reported IP and TOS
values always matched the data at byte offsets 12 and 1 of the
encapsulated IPv6 header, which happen to be the same offsets as the
Source IP address and TOS fields in an IPv4 header.
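The overlap can be sketched in Python as follows. This is only an
illustration under my own assumptions, not kernel code: the helper names
are hypothetical and the addresses come from the documentation ranges.
It builds a 40-byte IPv6 header and then reads byte 1 and the four bytes
starting at byte 12 (counting from zero) the way an IPv4 parser would:

```python
import ipaddress
import struct


def build_ipv6_header(src, dst, flow_label=0, traffic_class=0,
                      payload_len=0, next_header=59, hop_limit=64):
    """Pack a fixed 40-byte IPv6 header (next_header 59 = no next header)."""
    first_word = (6 << 28) | ((traffic_class & 0xFF) << 20) | (flow_label & 0xFFFFF)
    return (struct.pack("!IHBB", first_word, payload_len, next_header, hop_limit)
            + ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed)


def misread_as_ipv4(header):
    """Return (tos, source address) as seen when the bytes are wrongly
    treated as an IPv4 header: TOS is byte 1, source is bytes 12..15."""
    tos = header[1]
    saddr = str(ipaddress.IPv4Address(header[12:16]))
    return tos, saddr


hdr = build_ipv6_header("2001:db8:c000:221::1", "2001:db8::2",
                        flow_label=0x30000)
tos, saddr = misread_as_ipv4(hdr)
print("TOS=%#x from %s" % (tos, saddr))
```

Bytes 12 to 15 of the IPv6 header are the fifth to eighth bytes of the
IPv6 source address, so the bogus IPv4 address depends only on the
sender's IPv6 address.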
In light of this, I started suspecting that something inside the
kernel was mistakenly interpreting the inner IPv6 header as an IPv4
header and was probably the cause of the problem.
I then found this bug report after searching for the aforementioned
dmesg message. And it turned out that there is clearly a link with my
IPv6 issue: I could no longer reproduce it with a kernel image built from
4.9.110-1 with David S. Miller's patch applied (and without changing
anything else).
It looks like this bug probably has more severe consequences than just
dmesg spam.
Best regards,
Stephane Poignant