Re: refactoring - share str2*int64 functions

Andres Freund Mon, 09 Sep 2019 03:18:15 -0700

Hi,

On 2019-09-09 14:28:14 +0900, Michael Paquier wrote:
> On Thu, Sep 05, 2019 at 03:52:48PM +0900, Michael Paquier wrote:
> > Right, there was this part.  This brings also the point of having one
> > interface for the backend as all the error messages for the backend
> > are actually the same, with the most simple name being that:
> > pg_strtoint(value, size, error_ok).


I *VEHEMENTLY* oppose the introduction of any future pseudo-generic
routines taking the type width as a parameter. They're weakening the
type system and they're unnecessarily inefficient.


I don't really buy that saving a few copies of a strings is worth that
much. But if you really want to do that, the right approach imo would be
to move the error reporting into a separate function. I.e. something

void pg_attribute_noreturn()
pg_strtoint_error(pg_strtoint_status status, const char *input, const char 
*type)

that you'd call in small wrappers. Something like

static inline int32
pg_strtoint32_check(const char* s)
{
    int32 result;
    pg_strtoint_status status = pg_strtoint32(s, &result);

    if (unlikely(status == PG_STRTOINT_OK))
       pg_strtoint_error(status, s, "int32");
    return result;
}


> So, it seems to me that if we want to have a consolidation of
> everything, we basically need to have the generic error messages for
> the backend directly into common/string.c where the routines are
> refactored, and I think that we should do the following based on the
> patch attached:

> - Just remove pg_strtoint(), and move the error generation logic in
> common/string.c.

I'm not quite sure what you mean by moving the "error generation logic"?


> - Add an error_ok flag for all the pg_strto[u]int{16|32|64} routines,
> which generates errors only for FRONTEND is not defined.

I think this is a bad idea.


> - Use pg_log_error() for the frontend errors.
>
> If those errors are added everywhere, we would basically have no code
> paths in the frontend or the backend (for the unsigned part only)
> which use them yet.  Another possibility would be to ignore the
> error_ok flag in those cases, but that's inconsistent.

Yea, ignoring it would be terrible idea.

> diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c
> index a9bd47d937..593a5ef54e 100644
> --- a/src/backend/libpq/pqmq.c
> +++ b/src/backend/libpq/pqmq.c
> @@ -286,10 +286,10 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
>                               edata->hint = pstrdup(value);
>                               break;
>                       case PG_DIAG_STATEMENT_POSITION:
> -                             edata->cursorpos = pg_strtoint32(value);
> +                             edata->cursorpos = pg_strtoint(value, 4);
>                               break;

I'd be really upset if this type of change went in.



>  #include "fmgr.h"
>  #include "miscadmin.h"
> +#include "common/string.h"
>  #include "nodes/extensible.h"
>  #include "nodes/parsenodes.h"
>  #include "nodes/plannodes.h"
> @@ -80,7 +81,7 @@
>  #define READ_UINT64_FIELD(fldname) \
>       token = pg_strtok(&length);             /* skip :fldname */ \
>       token = pg_strtok(&length);             /* get field value */ \
> -     local_node->fldname = pg_strtouint64(token, NULL, 10)
> +     (void) pg_strtouint64(token, &local_node->fldname)

Seems like these actually could just ought to use the error-checked
variants. And I think it ought to change all of
READ_{INT,UINT,LONG,UINT64,OID}_FIELD, rather than just redirecting one
of them to the new routines.


>  static void pcb_error_callback(void *arg);
> @@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
>
>               case T_Float:
>                       /* could be an oversize integer as well as a float ... 
> */
> -                     if (scanint8(strVal(value), true, &val64))
> +                     if (pg_strtoint64(strVal(value), &val64) == 
> PG_STRTOINT_OK)
>                       {
>                               /*
>                                * It might actually fit in int32. Probably 
> only INT_MIN can

Not for this change, but we really ought to move away from this crappy
logic. It's really bonkers to have T_Float represent large integers and
floats.



> +/*
> + * pg_strtoint16
> + *
> + * Convert input string to a signed 16-bit integer.  Allows any number of
> + * leading or trailing whitespace characters.
> + *
> + * NB: Accumulate input as a negative number, to deal with two's complement
> + * representation of the most negative number, which can't be represented as 
> a
> + * positive number.
> + *
> + * The function returns immediately if the conversion failed with a status
> + * value to let the caller handle the error.  On success, the result is
> + * stored in "*result".
> + */
> +pg_strtoint_status
> +pg_strtoint16(const char *s, int16 *result)
> +{
> +     const char *ptr = s;
> +     int16           tmp = 0;
> +     bool            neg = false;
> +
> +     /* skip leading spaces */
> +     while (likely(*ptr) && isspace((unsigned char) *ptr))
> +             ptr++;
> +
> +     /* handle sign */
> +     if (*ptr == '-')
> +     {
> +             ptr++;
> +             neg = true;
> +     }
> +     else if (*ptr == '+')
> +             ptr++;

> +     /* require at least one digit */
> +     if (unlikely(!isdigit((unsigned char) *ptr)))
> +             return PG_STRTOINT_SYNTAX_ERROR;

Wonder if there's an argument for moving this behaviour to someplace
else - in most cases we really don't expect whitespace, and checking for
it is unnecessary overhead.

Greetings,

Andres Freund

Re: refactoring - share str2*int64 functions

Reply via email to