Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

torikoshia Mon, 15 Aug 2022 05:29:31 -0700

On 2022-07-19 21:40, Damir Belyalov wrote:

Hi!


Improved my patch by adding block subtransactions.
The block size is determined by the REPLAY_BUFFER_SIZE parameter.
I used the idea of a buffer for accumulating tuples in it.
If we read REPLAY_BUFFER_SIZE rows without errors, the subtransaction
will be committed.
If we find an error, the subtransaction will be rolled back and the
tuples held in the buffer will be replayed.


Thanks for working on this!

I tested 0002-COPY-IGNORE_ERRORS.patch and ran into unexpected behavior.

I loaded 10000 rows, one of which was invalid.
I expected to see 9999 rows after COPY, but saw only 999.

When I changed MAX_BUFFERED_TUPLES from 1000 to other values, the number of loaded rows also changed, so I suspect MAX_BUFFERED_TUPLES is affecting this behavior.


```sh
$ cat /tmp/test10000.dat

1   aaa
2   aaa
3   aaa
4   aaa
5   aaa
6   aaa
7   aaa
8   aaa
9   aaa
10  aaa
11  aaa
...
9994    aaa
9995    aaa
9996    aaa
9997    aaa
9998    aaa
9999    aaa
xxx aaa
```
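
For anyone who wants to reproduce this, a file like the one above can probably be generated with something like the following. This is only a sketch: it assumes the two columns are separated by a tab (the default delimiter for COPY's text format), and the `out` variable is just a name I picked for illustration.

```sh
# Hypothetical reproduction helper; the path mirrors the one used above.
out=/tmp/test10000.dat

{
  # 9999 valid rows: an integer id, a tab, then the text "aaa"
  seq 1 9999 | awk '{ printf "%d\taaa\n", $1 }'
  # one malformed last row: "xxx" cannot be parsed as an int
  printf 'xxx\taaa\n'
} > "$out"

# Sanity check: print the number of lines written
wc -l < "$out"
```

The single bad row is placed last so that every preceding row is valid, which matches the listing above.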

```SQL
=# CREATE TABLE test (id int, data text);

=# COPY test FROM '/tmp/test10000.dat' WITH (IGNORE_ERRORS);
WARNING:  COPY test, line 10000, column i: "xxx"
COPY 9999

=# SELECT COUNT(*) FROM test;
 count
-------
   999
(1 row)
```

BTW, I may be overlooking it, but have you submitted this proposal to the next CommitFest?


https://commitfest.postgresql.org/39/


--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION
