Hi, On Thu, Jul 18, 2024 at 6:38 PM torikoshia <torikos...@oss.nttdata.com> wrote: > > On 2024-07-05 00:27, torikoshia wrote: > > Hi, > > > > With the current file_fdw, if even one line of data conversion fails, > > the contents of the file cannot be referenced at all: > > > > =# \! cat data/test.data > > 1,a > > 2,b > > a,c > > =# create foreign table f_fdw_test_1 (i int, t text) server f_fdw > > options (filename 'test.data', format 'csv'); > > CREATE FOREIGN TABLE > > > > =# table f_fdw_test_1; > > ERROR: invalid input syntax for type integer: "a" > > CONTEXT: COPY f_fdw_test, line 3, column i: "a" > > > > Since we'll support ON_ERROR option which tolerates data conversion > > errors in COPY FROM and LOG_VERBOSITY option at v17[1], how about > > supporting them on file_fdw?
+1 > > > > This idea comes from Fujii-san[2], and I think it'd be useful when > > reading a bit dirty data. > > > > Attached PoC patch works like below: > > > > =# create foreign table f_fdw_test_2 (i int, t text) server f_fdw > > options (filename 'test.data', format 'csv', on_error 'ignore'); > > CREATE FOREIGN TABLE > > > > =# table f_fdw_test_2; > > NOTICE: 1 row was skipped due to data type incompatibility > > i | t > > ---+--- > > 1 | a > > 2 | b > > (2 rows) I'm slightly concerned that users might not want to see the NOTICE message for every scan. Unlike COPY FROM, scanning a file via file_fdw could be frequent. An alternative idea of place to write the information of the number of malformed rows would be the EXPLAIN command as follow: QUERY PLAN ---------------------------------------------------------------- Foreign Scan on public.test (cost=0.00..1.10 rows=1 width=12) Output: a, b, c Foreign File: test.csv Foreign File Size: 12 b Skipped Rows: 10 > > > > > > =# create foreign table f_fdw_test_3 (i int, t text) server f_fdw > > options (filename 'test.data', format 'csv', on_error 'ignore', > > log_verbosity 'verbose'); > > CREATE FOREIGN TABLE > > > > =# table f_fdw_test_3 ; > > NOTICE: skipping row due to data type incompatibility at line 3 for > > column i: "a" > > NOTICE: 1 row was skipped due to data type incompatibility > > i | t > > ---+--- > > 1 | a > > 2 | b > > (2 rows) IIUC we have to execute ALTER FOREIGN TABLE to change the log_verbosity value and which requires to be the owner. Which seems not to be user-friendly. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com