On Sat, Jun 6, 2020 at 5:08 PM Zé Loff <[email protected]> wrote:

> On Sat, Jun 06, 2020 at 03:51:58PM -0700, Jordan Geoghegan wrote:
> > I'm working on a simple awk snippet to convert the IP range data listed in
> > the Extended Delegation Statistics data from ARIN [1] into CIDR blocks. I
> > have a snippet that works perfectly fine on mawk and gawk, but not on the
> > base system awk. I'm 99% sure I'm not using any GNUisms, as when I break
> > the command up into two parts, it works perfectly.
> >
> > The snippet below does not work with base awk, but does work with gawk and
> > mawk (running on a 6.6-stable system):
> >
> >   awk -F '|' '{ if ( $3 == "ipv4" && $2 == "US") printf("%s/%d\n", $4,
> > 32-log($5)/log(2))}' delegated-arin-extended-latest.txt
> >
> >
> > The command does output data, but it also throws errors for certain lines:
> >
> >   awk: log result out of range
> >   input record number 94027, file delegated-arin-extended-latest.txt
> >   source line number 1
> >
> > Most CIDR blocks are calculated correctly, but about 10% of them have errors
> > (i.e. something that should be calculated as a /24 is instead calculated as
> > a /30).
>
...

> I have no idea about what is going on, but FWIW I can reproduce this on
> i386 6.7-stable and amd64 6.7-current (well, current-ish, #232).
> Truncating the file to a single offending line produces the same result:
> log($5) is out of range.
>
> It appears to have something to do with the last field.  Removing it or
> changing some of its characters seems to work, e.g.:
>
>
> arin|US|ipv4|216.250.144.0|4096|20050503|allocated|5e58386636aa775c2106140445cf2c30
>
> arin|US|ipv4|216.250.144.0|4096|20050503|allocated|5a58386636aa775c2106140445cf2c30
>                                                     ^
> Fails on the first line but works on the second.
>

Hah!  Nice observation!

The last field of the first line looks kinda like a number in scientific
notation, but when awk internally tries to set up the fields it generates
an ERANGE error...and the global errno variable is left with that value.
Several builtins in awk, including log(), perform operations and then check
whether errno is set to EDOM or ERANGE but fail to clear errno beforehand.
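
To see the failure mode in isolation, here is a minimal C sketch.  It
assumes the field text passes through something like strtod() on its way
to becoming a number (I have not traced the exact path in awk), and
log_flagged() is a made-up helper for this mail, not code from run.c.  It
only tests errno the way errcheck() is described above.

```c
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of awk.  Returns nonzero if a perfectly
 * valid log(4096) would be reported as a range/domain error because
 * errno was left set by an earlier, unrelated string conversion.
 */
int
log_flagged(int clear_first)
{
	/* Last field of the failing record; the leading "5e58386636"
	 * parses as a float whose exponent overflows a double. */
	const char *field = "5e58386636aa775c2106140445cf2c30";
	double u;

	errno = 0;
	(void)strtod(field, NULL);	/* overflow: errno becomes ERANGE */

	if (clear_first)
		errno = 0;		/* what the proposed fix does */

	u = log(4096.0);		/* valid argument, errno untouched */
	(void)u;

	/* Same kind of test errcheck() is described as doing. */
	return errno == EDOM || errno == ERANGE;
}
```

Calling log_flagged(0) should report a bogus error, while log_flagged(1)
should not, which matches the symptom and the fix.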

The fix is to zero errno before each code sequence that uses the
errcheck() function, à la:

--- run.c       13 Aug 2019 10:45:56 -0000      1.44
+++ run.c       7 Jun 2020 03:14:38 -0000
@@ -26,6 +26,7 @@ THIS SOFTWARE.
 #define DEBUG
 #include <stdio.h>
 #include <ctype.h>
+#include <errno.h>
 #include <setjmp.h>
 #include <limits.h>
 #include <math.h>
@@ -1041,8 +1042,10 @@ Cell *arith(Node **a, int n)     /* a[0] + a
        case POWER:
                if (j >= 0 && modf(j, &v) == 0.0)       /* pos integer exponent */
                        i = ipow(i, (int) j);
-               else
+               else {
+                       errno = 0;
                        i = errcheck(pow(i, j), "pow");
+               }
                break;
        default:        /* can't happen */
                FATAL("illegal arithmetic operator %d", n);
@@ -1135,8 +1138,10 @@ Cell *assign(Node **a, int n)    /* a[0] =
        case POWEQ:
                if (yf >= 0 && modf(yf, &v) == 0.0)     /* pos integer exponent */
                        xf = ipow(xf, (int) yf);
-               else
+               else {
+                       errno = 0;
                        xf = errcheck(pow(xf, yf), "pow");
+               }
                break;
        default:
                FATAL("illegal assignment operator %d", n);
@@ -1499,12 +1504,15 @@ Cell *bltin(Node **a, int n)    /* builtin
                        u = strlen(getsval(x));
                break;
        case FLOG:
+               errno = 0;
                u = errcheck(log(getfval(x)), "log"); break;
        case FINT:
                modf(getfval(x), &u); break;
        case FEXP:
+               errno = 0;
                u = errcheck(exp(getfval(x)), "exp"); break;
        case FSQRT:
+               errno = 0;
                u = errcheck(sqrt(getfval(x)), "sqrt"); break;
        case FSIN:
                u = sin(getfval(x)); break;


Todd, are we up to date with upstream, or is this latent there too?


Philip Guenther
