Hello. At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <da...@fetter.org> wrote in <20190918034201.gx31...@fetter.org> > On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote: > > On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote: > > > On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote: > > > > Folks, > > > > > > > > Please find attached a couple of patches intended to $subject. > > > > > > > > This patch set cut the time to copy ten million rows of randomly sized > > > > int8s (10 of them) by about a third, so at least for that case, it's > > > > pretty decent. > > > > > > Added int4 output, removed the sprintf stuff, as it didn't seem to > > > help in any cases I was testing. > > > > Found a couple of "whiles" that should have been "ifs." > > Factored out some inefficient functions and made the guts use the more > efficient function.
I'm not sure this is on the KISS principle, but looked it and have several random comments. +numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT) I don't think that we are allowing that as project coding policy. It seems to have been introduced only to accept external code as-is. - char str[23]; /* sign, 21 digits and '\0' */ + char str[MAXINT8LEN]; It's uneasy that MAXINT8LEN contains tailling NUL. MAXINT8BUFLEN can be so. I think MAXINT8LEN should be 20 and the definition should be str[MAXINT8LEN + 1]. +static const char DIGIT_TABLE[200] = { + '0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9', Wouldn't it be simpler if it were defined as a constant string? static const char DIGIT_TABLE[201] = "000102030405....19" "202122232425....39" .. +pg_ltoa_n(int32 value, char *a) ... + /* Compute the result string. */ + while (value >= 100000000) We have only two degits above the value. Isn't the stuff inside the while a waste of cycles? + /* Expensive 64-bit division. Optimize? */ I believe compiler treats such trivial optimizations. (concretely converts into shifts and subtractons if faster.) + memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2); Maybe it'd be easy to read if 'a + olength - i' is a single variable. + i += adjust; + return i; If 'a + olength - i' is replaced with a variable, the return statement is replacable with "return olength + adjust;". + return t + (v >= PowersOfTen[t]); I think it's better that if it were 't - (v < POT[t]) + 1; /* log10(v) + 1 */'. At least we need an explanation of the difference. (I'didn't checked the algorithm is truely right, though.) > void > pg_lltoa(int64 value, char *a) > { .. > memcpy(a, "-9223372036854775808", 21); .. > memcpy(a, "0", 2); The lines need a comment like "/* length contains trailing '\0' */" + if (value >= 0) ... + else + { + if (value == PG_INT32_MIN) + memcpy(str, "-2147483648", 11); + return str + 11; > } + *str++ = '-'; + return pg_ltostr_zeropad(str, -value, minwidth - 1); If then block of the if statement were (values < 0), we won't need to reenter the functaion. + len = pg_ltoa_n(value, str); + if (minwidth <= len) + return str + len; + + memmove(str + minwidth - len, str, len); If the function had the parameters str with the room only for two digits and a NUL, 2 as minwidth but 1000 as value, the function would overrun the buffer. The original function just ignores overflowing digits. regards. -- Kyotaro Horiguchi NTT Open Source Software Center