On 2022-07-19 21:40, Damir Belyalov wrote:
Hi!
Improved my patch by adding block subtransactions.
The block size is determined by the REPLAY_BUFFER_SIZE parameter.
I used a buffer to accumulate tuples.
If we read REPLAY_BUFFER_SIZE rows without errors, the subtransaction
is committed.
If we hit an error, the subtransaction is rolled back and the tuples
in the buffer are replayed.
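To make the scheme above concrete, here is a toy model of it in Python. It is only a sketch of the described behavior, not the patch's C code: `copy_ignore_errors`, `parse_row`, the in-memory `table`, and the default block size of 1000 are all hypothetical names and values chosen for illustration.

```python
REPLAY_BUFFER_SIZE = 1000  # block size; this value is an assumption


def parse_row(line):
    """Parse 'id data'; raises ValueError when the row is malformed."""
    id_text, data = line.split(None, 1)
    return int(id_text), data


def copy_ignore_errors(lines, buffer_size=REPLAY_BUFFER_SIZE):
    """Toy model: load rows in blocks, skipping rows that fail to parse."""
    table = []          # rows visible after a commit
    subxact_rows = []   # rows inserted by the open subtransaction
    replay_buffer = []  # copies of the good tuples read in this block
    skipped = []        # line numbers of rejected rows

    def commit():
        # Make the subtransaction's rows permanent and start a new block.
        table.extend(subxact_rows)
        subxact_rows.clear()
        replay_buffer.clear()

    for lineno, line in enumerate(lines, start=1):
        try:
            row = parse_row(line)
        except ValueError:
            skipped.append(lineno)
            subxact_rows.clear()                # roll back the subtransaction
            subxact_rows.extend(replay_buffer)  # replay the buffered tuples
            commit()
            continue
        subxact_rows.append(row)
        replay_buffer.append(row)
        if len(replay_buffer) == buffer_size:
            commit()  # a full block was read without errors

    commit()  # commit the final, possibly partial, block
    return table, skipped
```

In this model, an input of 9999 good rows followed by one bad row ends with 9999 rows in `table` and line 10000 in `skipped`, whatever the block size is.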
Thanks for working on this!
I tested 0002-COPY-IGNORE_ERRORS.patch and ran into unexpected behavior.
I loaded 10000 rows which contained 1 wrong row.
I expected to see 9999 rows after COPY, but saw only 999.
The number of loaded rows also changed when I changed
MAX_BUFFERED_TUPLES from 1000 to other values, so I suspect
MAX_BUFFERED_TUPLES is affecting this behavior.
```sh
$ cat /tmp/test10000.dat
1 aaa
2 aaa
3 aaa
4 aaa
5 aaa
6 aaa
7 aaa
8 aaa
9 aaa
10 aaa
11 aaa
...
9994 aaa
9995 aaa
9996 aaa
9997 aaa
9998 aaa
9999 aaa
xxx aaa
```
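For anyone who wants to reproduce this, the file above can be regenerated with a short script along these lines. It is a sketch: `build_test_rows` and `write_test_file` are hypothetical helper names, and the tab delimiter is an assumption (the listing shows a space, but COPY's default text format expects tab-separated columns).

```python
def build_test_rows(good_rows=9999, delimiter="\t"):
    """Return good_rows valid lines followed by one line with a bad id."""
    rows = [f"{i}{delimiter}aaa" for i in range(1, good_rows + 1)]
    rows.append(f"xxx{delimiter}aaa")  # "xxx" cannot be parsed as int
    return rows


def write_test_file(path, good_rows=9999, delimiter="\t"):
    """Write the test rows to path, one row per line."""
    with open(path, "w") as f:
        f.write("\n".join(build_test_rows(good_rows, delimiter)) + "\n")


if __name__ == "__main__":
    write_test_file("/tmp/test10000.dat")
```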
```SQL
=# CREATE TABLE test (id int, data text);
=# COPY test FROM '/tmp/test10000.dat' WITH (IGNORE_ERRORS);
WARNING: COPY test, line 10000, column i: "xxx"
COPY 9999
=# SELECT COUNT(*) FROM test;
count
-------
999
(1 row)
```
BTW I may be overlooking it, but have you submitted this proposal to the
next CommitFest?
https://commitfest.postgresql.org/39/
--
Regards,
--
Atsushi Torikoshi
NTT DATA CORPORATION