Re: system panics now & then

2024-01-19 Thread Jo Geraerts
Hello On 2023-12-06 10:32, Alexander Bluhm wrote: Jo Geraerts: Please test anyway so we know whether it fixes your bug. The issue did not occur anymore. Thanks. Kr, Jo

Re: system panics now & then

2023-12-06 Thread Jo Geraerts
On 2023-12-06 10:32, Alexander Bluhm wrote: Jo Geraerts: Please test anyway so we know whether it fixes your bug. bluhm Your original patch is already applied and compiled, will apply the ipv6 this evening as well and then report back after a month or so. Kr,

Re: system panics now & then

2023-12-06 Thread Alexander Bluhm
On Wed, Dec 06, 2023 at 10:17:26AM +0100, Claudio Jeker wrote: > On Wed, Dec 06, 2023 at 12:57:57AM +0100, Alexander Bluhm wrote: > > On Wed, Dec 06, 2023 at 01:39:40AM +0300, Vitaliy Makkoveev wrote: > > > > Diff makes sense in any case. > > > > > > > > > > Just checked, socket6_send() is identi

Re: system panics now & then

2023-12-06 Thread Claudio Jeker
On Wed, Dec 06, 2023 at 12:57:57AM +0100, Alexander Bluhm wrote: > On Wed, Dec 06, 2023 at 01:39:40AM +0300, Vitaliy Makkoveev wrote: > > > Diff makes sense in any case. > > > > > > > Just checked, socket6_send() is identical to socket_send() and needs > > to be reworked in the same way. > > New

Re: system panics now & then

2023-12-05 Thread Vitaliy Makkoveev
> On 6 Dec 2023, at 02:57, Alexander Bluhm wrote: > > On Wed, Dec 06, 2023 at 01:39:40AM +0300, Vitaliy Makkoveev wrote: >>> Diff makes sense in any case. >>> >> >> Just checked, socket6_send() is identical to socket_send() and needs >> to be reworked in the same way. > > New diff for v4 and v

Re: system panics now & then

2023-12-05 Thread Alexander Bluhm
On Wed, Dec 06, 2023 at 01:39:40AM +0300, Vitaliy Makkoveev wrote: > > Diff makes sense in any case. > > > > Just checked, socket6_send() is identical to socket_send() and needs > to be reworked in the same way. New diff for v4 and v6. The other callers seem to be correct. I will run this thro

Re: system panics now & then

2023-12-05 Thread Vitaliy Makkoveev
> On 6 Dec 2023, at 00:18, Vitaliy Makkoveev wrote: > >> On 5 Dec 2023, at 22:40, Alexander Bluhm wrote: >> >> On Tue, Dec 05, 2023 at 08:22:52PM +0100, Jo Geraerts wrote: >>> maybe its a good idea to just change 1 thing >> >> Yes, only change 1 thing. I just wrote down all my ideas. >>

Re: system panics now & then

2023-12-05 Thread Vitaliy Makkoveev
> On 5 Dec 2023, at 22:40, Alexander Bluhm wrote: > > On Tue, Dec 05, 2023 at 08:22:52PM +0100, Jo Geraerts wrote: >> maybe its a good idea to just change 1 thing > > Yes, only change 1 thing. I just wrote down all my ideas. > >>> It could be race or a single packet that crashes the machine. >

Re: system panics now & then

2023-12-05 Thread Jo Geraerts
On 2023-12-05 20:40, Alexander Bluhm wrote: Found a race when we insert the IGMP packet into the socket buffer. Unicast takes a mutex, but multicast code does not. Other than that, I suspect the issue was introduced in 7.3 because (iirc) I never ran into that issue before 7.3. The parallel rece

Re: system panics now & then

2023-12-05 Thread Jo Geraerts
On 2023-12-05 20:00, Alexander Bluhm wrote: netstat -na and netstat -g output would be useful. fstat -p helps to see for which sockets recvfrom(2) may be called. Sorry, forgot about these Active Internet connections (including servers) Proto Recv-Q Send-Q Local Address Foreign Ad

Re: system panics now & then

2023-12-05 Thread Alexander Bluhm
On Tue, Dec 05, 2023 at 08:22:52PM +0100, Jo Geraerts wrote: > maybe its a good idea to just change 1 thing Yes, only change 1 thing. I just wrote down all my ideas. > > It could be race or a single packet that crashes the machine. Found a race when we insert the IGMP packet into the socket buf
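The race described in this message (the unicast path takes a mutex before inserting into the socket buffer, the multicast path does not) can be illustrated with a small userland sketch. This is not the OpenBSD kernel code or the diff discussed in the thread. Every name here (toy_sockbuf, toy_sb_append, toy_run and so on) is hypothetical, and POSIX threads stand in for the parallel network input paths. The point it tries to show is only that the packet list and the byte counter have to change under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_THREADS 16

/* Toy packet and toy socket buffer; NOT the OpenBSD structures. */
struct toy_pkt {
	struct toy_pkt *next;
	size_t len;
};

struct toy_sockbuf {
	pthread_mutex_t mtx;	/* protects head, tail and cc together */
	struct toy_pkt *head;
	struct toy_pkt *tail;
	size_t cc;		/* byte counter, in the spirit of sb_cc */
};

struct toy_job {
	struct toy_sockbuf *sb;
	int count;
	size_t len;
};

void
toy_sb_init(struct toy_sockbuf *sb)
{
	pthread_mutex_init(&sb->mtx, NULL);
	sb->head = sb->tail = NULL;
	sb->cc = 0;
}

/* Append one packet; the list and the counter change under one lock. */
int
toy_sb_append(struct toy_sockbuf *sb, size_t len)
{
	struct toy_pkt *p;

	assert(sb != NULL);
	if ((p = malloc(sizeof(*p))) == NULL)
		return -1;
	p->next = NULL;
	p->len = len;

	pthread_mutex_lock(&sb->mtx);
	if (sb->tail == NULL)
		sb->head = p;
	else
		sb->tail->next = p;
	sb->tail = p;
	sb->cc += len;
	pthread_mutex_unlock(&sb->mtx);
	return 0;
}

/* Walk the list and compare the real byte total with the counter. */
int
toy_sb_consistent(struct toy_sockbuf *sb)
{
	struct toy_pkt *p;
	size_t total = 0;
	int ok;

	pthread_mutex_lock(&sb->mtx);
	for (p = sb->head; p != NULL; p = p->next)
		total += p->len;
	ok = (total == sb->cc);
	pthread_mutex_unlock(&sb->mtx);
	return ok;
}

void *
toy_worker(void *arg)
{
	struct toy_job *job = arg;
	int i;

	for (i = 0; i < job->count; i++)
		toy_sb_append(job->sb, job->len);
	return NULL;
}

/* Run concurrent appenders; return 1 if list and counter still agree. */
int
toy_run(int nthreads, int count, size_t len)
{
	struct toy_sockbuf sb;
	struct toy_job job;
	struct toy_pkt *p, *next;
	pthread_t tids[TOY_MAX_THREADS];
	int i, started = 0, ok;

	if (nthreads < 1 || nthreads > TOY_MAX_THREADS || count < 0)
		return 0;
	toy_sb_init(&sb);
	job.sb = &sb;
	job.count = count;
	job.len = len;

	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[i], NULL, toy_worker, &job) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	ok = started == nthreads && toy_sb_consistent(&sb) &&
	    sb.cc == (size_t)nthreads * (size_t)count * len;

	for (p = sb.head; p != NULL; p = next) {
		next = p->next;
		free(p);
	}
	pthread_mutex_destroy(&sb.mtx);
	return ok;
}
```

Calling toy_run(4, 1000, 40) starts four appenders that each queue 1000 packets of 40 bytes. If the lock and unlock calls in toy_sb_append() were removed, the list and the counter could drift apart, which is the kind of inconsistency the panic message points at.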

Re: system panics now & then

2023-12-05 Thread Jo Geraerts
On 2023-12-05 20:00, Alexander Bluhm wrote: *cpu0: receive 1: so 0xfd80259ea760, so_type 3, sb_cc 40 When the multicast routing daemon reads from raw socket, the kernel crashes. Kernel has no data in the queue, but counter says there should be 40 bytes. So it panics. What kind of multicas
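The state quoted above (no data in the queue while the counter claims 40 bytes) can be written down as a consistency check. The following is a minimal sketch of that idea on a toy structure, not the kernel's actual diagnostic code. Only sb_cc is taken from the panic message; the struct layout, the sb_mb field as used here, and the demo_ function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a packet chain and a socket buffer. */
struct demo_mbuf {
	struct demo_mbuf *next;
	size_t len;
};

struct demo_sockbuf {
	struct demo_mbuf *sb_mb;	/* head of the queued data chain */
	size_t sb_cc;			/* number of bytes the buffer claims to hold */
};

/* Sum the bytes that are really present in the chain. */
size_t
demo_sb_chain_bytes(const struct demo_sockbuf *sb)
{
	const struct demo_mbuf *m;
	size_t total = 0;

	assert(sb != NULL);
	for (m = sb->sb_mb; m != NULL; m = m->next)
		total += m->len;
	return total;
}

/*
 * Return 0 for the state described in the thread: the queue is empty
 * but the counter is non-zero.  Return 1 otherwise.
 */
int
demo_sb_sane(const struct demo_sockbuf *sb)
{
	assert(sb != NULL);
	if (sb->sb_mb == NULL && sb->sb_cc != 0)
		return 0;
	return 1;
}
```

With an empty chain and sb_cc set to 40, demo_sb_sane() returns 0. A reader that trusts the counter in that state would go looking for data that is not there, which is why a kernel would rather panic than continue.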

Re: system panics now & then

2023-12-05 Thread Alexander Bluhm
On Tue, Dec 05, 2023 at 04:22:47PM +0100, Jo Geraerts wrote: > *92547 388831 1 0 7 0 mrouted Cool, you are running a multicast router. Unfortunately this code path is not well tested. > *cpu0: receive 1: so 0xfd80259ea760, so_type 3, sb_cc 40 When the mu

Re: system panics now & then

2023-12-05 Thread xuser
I think it is a multi-core problem. I have had the same problem with Linux and NetBSD. xu...@sdf.org SDF Public Access UNIX System - http://sdf.org On Tue, 5 Dec 2023, Jo Geraerts wrote: Synopsis: system used as router panics maybe once a month Category: system Environment: System :