Hi,

Tom, see below - I wonder if should provide one more piece of infrastructure
around the saved error stuff...


Have you measured whether this has negative performance effects when *NOT*
using the new option?


As-is this does not work with FORMAT BINARY - and converting the binary input
functions to support soft errors won't happen for 16. So I think you need to
raise an error if BINARY and IGNORE_DATATYPE_ERRORS are specified.


On 2023-03-22 22:34:20 +0900, torikoshia wrote:
> @@ -985,9 +986,28 @@ CopyFrom(CopyFromState cstate)
>  
>               ExecClearTuple(myslot);
>  
> +             if (cstate->opts.ignore_datatype_errors)
> +             {
> +                     escontext.details_wanted = true;
> +                     cstate->escontext = escontext;
> +             }

I think it might be worth pulling this out of the loop. That does mean you'd
have to reset escontext.error_occurred after an error, but that doesn't seem
too bad, you need to do other cleanup anyway.


> @@ -956,10 +957,20 @@ NextCopyFrom(CopyFromState cstate, ExprContext 
> *econtext,
>                               values[m] = ExecEvalExpr(defexprs[m], econtext, 
> &nulls[m]);
>                       }
>                       else
> -                             values[m] = InputFunctionCall(&in_functions[m],
> -                                                                             
>           string,
> -                                                                             
>           typioparams[m],
> -                                                                             
>           att->atttypmod);
> +                             /* If IGNORE_DATATYPE_ERRORS is enabled skip 
> rows with datatype errors */
> +                             if (!InputFunctionCallSafe(&in_functions[m],
> +                                                                             
>    string,
> +                                                                             
>    typioparams[m],
> +                                                                             
>    att->atttypmod,
> +                                                                             
>    (Node *) &cstate->escontext,
> +                                                                             
>    &values[m]))
> +                             {
> +                                     cstate->ignored_errors++;
> +
> +                                     ereport(WARNING,
> +                                                     errmsg("%s", 
> cstate->escontext.error_data->message));

That isn't right - you loose all the details of the message. As is you'd also
leak the error context.

I think the best bet for now is to do something like
    /* adjust elevel so we don't jump out */
    cstate->escontext.error_data->elevel = WARNING;
    /* despite the name, this won't raise an error if elevel < ERROR */
    ThrowErrorData(cstate->escontext.error_data);

I wonder if we ought to provide a wrapper for this? It could e.g. know to
mention the original elevel and such?


I don't think NextCopyFrom() is the right place to emit this warning - it
e.g. is also called from file_fdw.c, which might want to do something else
with the error. From a layering POV it seems cleaner to do this in
CopyFrom(). You already have a check for escontext.error_occurred there
anyway.



> @@ -3378,6 +3378,10 @@ copy_opt_item:
>                               {
>                                       $$ = makeDefElem("freeze", (Node *) 
> makeBoolean(true), @1);
>                               }
> +                     | IGNORE_DATATYPE_ERRORS
> +                             {
> +                                     $$ = 
> makeDefElem("ignore_datatype_errors", (Node *)makeBoolean(true), @1);
> +                             }
>                       | DELIMITER opt_as Sconst
>                               {
>                                       $$ = makeDefElem("delimiter", (Node *) 
> makeString($3), @1);


I think we shouldn't add a new keyword for this, but only support this via
/* new COPY option syntax */
copy_generic_opt_list:
                        copy_generic_opt_elem

Further increasing the size of the grammar with random keywords when we have
more generic ways to represent them seems unnecessary.


> +-- tests for IGNORE_DATATYPE_ERRORS option
> +CREATE TABLE check_ign_err (n int, m int[], k int);
> +COPY check_ign_err FROM STDIN WITH IGNORE_DATATYPE_ERRORS;
> +1    {1}     1
> +a    {2}     2
> +3    {3}     3333333333
> +4    {a, 4}  4
> +
> +5    {5}     5
> +\.
> +SELECT * FROM check_ign_err;
> +

I suggest adding a few more tests:

- COPY with a datatype error that can't be handled as a soft error
- test documenting that COPY FORMAT BINARY is incompatible with 
IGNORE_DATATYPE_ERRORS
- a soft error showing the error context - although that will require some
  care to avoid the function name + line in the output

Greetings,

Andres Freund


Reply via email to