Bonjour Michaƫl,

FWIW, I was looking forward to putting my hands on this patch and try
to get it merged so as we can get rid of those duplications.  Here are
some comments.

+#ifdef FRONTEND
+       fprintf(stderr,
+               "invalid input syntax for type %s: \"%s\"\n",
"bigint", str);
+#else
+       ereport(ERROR,
+               (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+                errmsg("invalid input syntax for type %s: \"%s\"",
+                       "bigint", str)));
+#endif
Have you looked at using the wrapper pg_log_error() here?

I have not.

I have simply merged the two implementations (pgbench & backend) as they were.

+extern bool pg_scanint8(const char *str, bool errorOK, int64 *result);
+extern uint64 pg_strtouint64(const char *str, char **endptr, int base);

Hmm.  With this patch we have strtoint and pg_strtouint64, which makes
the whole set inconsistent.

I understand that you mean bits vs bytes? Indeed it can bite!

+
#endif                         /* COMMON_STRING_H */
Noise diff..

Indeed.

numutils.c also has pg_strtoint16 and pg_strtoint32, so the locations
become rather inconsistent with inconsistent APIs for the manipulation
of int2 and int4 fields, and scanint8 is just a derivative of the same
logic.  We have two categories of routines here:

Yep, but the int2/4 functions are not used elsewhere.

- The wrappers on top of strtol and strtoul & co, which are named
respectively strtoint and pg_strtouint64 with the patch.  The naming
part is inconsistent, and we only handle uint64 and int32.  We don't
need to worry about int64 and uint32 because they are not used?

Indeed, it seems that they are not needed/used by client code, AFAICT.

That's fine by me, but at least let's have a consistent naming.

Ok.

Prefixing the functions with pg_* is a better practice in my opinion
as we will unlikely run into conflicts this way.

Ok.

- The str->integer conversion routines, which actually have very
similar characteristics to the strtol families as they remove trailing
whitespaces first, check for a sign, etc, except that they work only
on base 10.  And here we get into a state where pg_scanint8 should be
actually called pg_strtoint64,

I just removed that:-)

ISTM that the issue is that the error handling of these functions is pretty different.

with an interface inconsistent with its int32/int16 relatives now only in the backend.

We can, but I'm not at ease with changing the error handling approach.

Could we consider more consolidation here please? Like moving the whole set to src/common/?

My initial plan was simply to remove direct code duplications between front-end and back-end, not to re-engineer the full set of string to int conversion functions:-)

On the re-engineering front: Given the various points on the thread, ISTM that there should probably be two behaviors for str to signed/unsigned int{16,32,64}, and having only one kind of signature for all types would be definitely better.

One low-level one that does the conversion or return an error.

Another higher-level one which possibly adds an error message (stderr for front-end, log for back-end).

One choice is whether there are two functions (the higher one calling the lower one and adding the messages) or just one with a boolean to trigger the messages. I do not have a strong opinion. Maybe one function would be simpler. Andres boolean-compatible enum return looks like a good option.

Overall, this leads to something like:

enum { STRTOINT_OK, STRTOINT_OVERFLOW_ERROR, STRTOINT_SYNTAX_ERROR }
  pg_strto{,u}int{8?,16,32,64}
    (const char * string, const TYPE * result, const bool verbose);

--
Fabien.

Reply via email to