On 2024/10/26 6:03, Kirill Reshke wrote:
When the REJECT LIMIT is set to some non-zero number and the number of row NULL replacements exceeds the limit, it is OK to fail, because there WERE errors, and we should not tolerate more than $limit errors. I do find this behavior to be consistent.
+1
But if we don't set a REJECT LIMIT, it is sane to do all replacements, as if REJECT LIMIT were infinite.
+1
But our REJECT LIMIT is zero (not set). So, we ignore zero REJECT LIMIT if set_to_null is set.
REJECT_LIMIT currently has to be greater than zero, so it won’t ever be zero.
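The limit semantics discussed above can be sketched roughly as follows. This is only an illustrative Python model, not PostgreSQL code: `check_reject_limit` and `RejectLimitExceeded` are hypothetical names, and an unset limit is modeled as `None` (treated as unlimited), since REJECT_LIMIT itself must be greater than zero.

```python
from typing import Optional


class RejectLimitExceeded(Exception):
    """Raised when more errors occurred than the limit tolerates (hypothetical)."""


def check_reject_limit(num_errors: int, reject_limit: Optional[int]) -> None:
    """Fail once the error count exceeds the limit; None means no limit was set."""
    if reject_limit is None:
        # No REJECT_LIMIT given: behave as if the limit were infinite.
        return
    if reject_limit <= 0:
        # Mirrors the rule that REJECT_LIMIT has to be greater than zero.
        raise ValueError("reject_limit must be greater than zero")
    if num_errors > reject_limit:
        raise RejectLimitExceeded(
            f"{num_errors} errors exceed the limit of {reject_limit}"
        )
```

With this model, `check_reject_limit(1000, None)` passes silently, while `check_reject_limit(4, 3)` raises `RejectLimitExceeded`.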
But while I was trying to implement that, I realized that I don't understand v4 of this patch. My misunderstanding is about the `t_on_error_null` tests. We are allowed to insert a NULL value for the first column of t_on_error_null using COPY ON_ERROR SET_TO_NULL. Why do we do that? My thought is that we should try to execute InputFunctionCallSafe with a NULL value (I mean, here [1]) for the column after we failed to insert the input value. And, if this second call is successful, we do the replacement; otherwise we count the row as erroneous.
Your concern is valid. Allowing NULL to be stored in a column with a NOT NULL constraint via COPY ON_ERROR=SET_TO_NULL does seem unexpected. As you suggested, NULL values set by SET_TO_NULL should probably be re-evaluated.
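The re-evaluation idea could look something like the sketch below. It is a simplified Python model under stated assumptions, not the patch itself: `convert_column` is a hypothetical helper, `parse` stands in for the type's input function, and the `accepts_null` flag stands in for the second InputFunctionCallSafe call with a NULL value.

```python
from typing import Any, Callable, Optional, Tuple


def convert_column(
    raw: str,
    parse: Callable[[str], Any],
    accepts_null: bool,
) -> Tuple[bool, Optional[Any]]:
    """Return (ok, value); ok is False when the row must be counted as erroneous."""
    try:
        # First attempt: convert the raw input value as usual.
        return True, parse(raw)
    except ValueError:
        pass
    # Second attempt: re-evaluate whether NULL is acceptable for this column
    # (for example, it is not when a NOT NULL constraint applies).
    if accepts_null:
        return True, None  # replace the erroneous value with NULL
    return False, None  # NULL is rejected too: count the row as erroneous
```

For example, `convert_column("abc", int, accepts_null=False)` reports the row as erroneous instead of silently storing NULL.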
Hm, good catch. Applied almost as you suggested. I did tweak this "replace columns with invalid input values with " into "replace columns containing erroneous input values with". Is that OK?
Yes, sounds good.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION