On 7/10/2021 04:52, Stefan Esser wrote:
Am 10.07.21 um 10:23 schrieb Stefan Esser:
Am 10.07.21 um 04:41 schrieb Karl Denninger:
Ok, so I have good news and bad news.

I have the trap and it is definitely in libalias which appears to come about as
a result of a NAT translation attempt.

Fatal trap 18: integer divide fault while in kernel mode
[...]
HouseKeeping() at HouseKeeping+0x1c/frame 0xfffffe0017b6b320
The divide by zero at one of the first instructions of HouseKeeping()
seems to be caused by this line:

/sys/netinet/libalias/alias_db.c:1753:

         if (packets % packet_limit == 0) {

Seems that packet_limit can become zero, there ...

At line 1780 within that function:

                if (now != LibAliasTime) {
                         /* retry three times a second */
                         packet_limit = packets / 3;
                         packets = 0;
                         LibAliasTime = now;
                 }

The static variable packet limit is divided by 3 without any
protection against going down to 0.

A packet_limit of zero makes no sense (besides causing a divide
by zero abort), therefore this value should probably have a lower
limit of 1.

Maybe that
                         packet_limit = packets / 3 + 1;

would give an acceptably close result in all cases.

Else enforce a minimum value of 1 after the division:

                         packet_limit = packets / 3;
                         if (packet_limit == 0)
                                 packet_limit = 1;
Or just:
                         packet_limit = packets >= 3 ? packets / 3 : 1;

Regards, STefan
I have just noticed that enforcing a lower limit of 1 is totally
equivalent to testing for zero before performing the modulo operation.

The attached patch should fix the panic and does not change the way
packet_limit is calculated. Since the variable is immediately used
in the modulo when not zero, the additional cost of the test for zero
is extremely low, less than that of the other suggested changes.

Maybe that increasing packet_limit by 1 is sensible, anyway, since at
low packet rates it will result in 0 to 5 packets giving the same
effect (the condition in line 1753 evaluates to true).

Anyway, please try the attached patch, which will fix the divide by
zero panic.

Regards, STefan

PS: Patch inline in case it is stripped by the mail-list:

diff --git a/sys/netinet/libalias/alias_db.c b/sys/netinet/libalias/alias_db.c
index c09ad4352ce4..d5dec0709cbe 100644
--- a/sys/netinet/libalias/alias_db.c
+++ b/sys/netinet/libalias/alias_db.c
@@ -1769,7 +1769,7 @@ HouseKeeping(struct libalias *la)
          * Reduce the amount of house keeping work substantially by
          * sampling over the packets.
          */
-       if (packets % packet_limit == 0) {
+       if (packet_limit == 0 || packets % packet_limit == 0) {
                 time_t now;

  #ifdef _KERNEL


(Line numbers from -CURRENT, may be slightly off for stable/12.)
Compiling now; I have a roughly hour-long window before a blackout period where I can't have that system crashing until late afternoon.  If I can get it loaded before then will advise but yeah, what you identified would certainly do it if packet_limit became zero......
--
Karl Denninger
k...@denninger.net <mailto:k...@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature

Reply via email to