On Mon, Dec 18, 2023 at 4:41 PM jian he <jian.universal...@gmail.com> wrote: > > On Mon, Dec 18, 2023 at 1:09 PM torikoshia <torikos...@oss.nttdata.com> wrote: > > > > Hi, > > > > > save the error metadata to system catalogs would be more expensive, > > > please see below explanation. > > > I have no knowledge of publications. > > > but i feel there is a feature request: publication FOR ALL TABLES > > > exclude regex_pattern. > > > Anyway, that would be another topic. > > > > I think saving error metadata to system catalog is not a good idea, too. > > And I believe Sawada-san just pointed out missing features and did not > > suggested that we use system catalog. > > > > > I don't think "specify the maximum number of errors to tolerate > > > before raising an ERROR." is very useful.... > > > > That may be so. > > I imagine it's useful in some use case since some loading tools have > > such options. > > Anyway I agree it's not necessary for initial patch as mentioned in [1]. > > > > > I suppose we can specify an ERRORFILE directory. similar > > > implementation [2], demo in [3] > > > it will generate 2 files, one file shows the malform line content as > > > is, another file shows the error info. > > > > That may be a good option when considering "(2) logging errors to > > somewhere". > > > > What do you think about the proposal to develop these features in > > incrementally? > > > > I am more with tom's idea [1], that is when errors happen (data type > conversion only), do not fail, AND we save the error to a table. I > guess we can implement this logic together, only with a new COPY > option.
If we want only such a feature we need to implement it together (the patch could be split, though). But if some parts of the feature are useful for users as well, I'd recommend implementing it incrementally. That way, the patches can get small and it would be easy for reviewers and committers to review/commit them. > > imagine a case (it's not that contrived, imho), while conversion from > text to table's int, postgres isspace is different from the source > text file's isspace logic. > then all the lines are malformed. If we just say on error continue and > not save error meta info, the user is still confused which field has > the wrong data, then the user will probably try to incrementally test > which field contains malformed data. > > Since we need to save the error somewhere. > Everyone has the privilege to INSERT can do COPY. > I think we also need to handle the access privilege also. > So like I mentioned above, one copy_error error table hub, then > everyone can view/select their own copy failure record. The error table hub idea is still unclear to me. I assume that there are error tables at least on each database. And an error table can have error data that happened during COPY FROM, including malformed lines. Do the error tables grow without bounds and the users have to delete rows at some point? If so, who can do that? How can we achieve that the users can see only errored rows they generated? And the issue with logical replication also needs to be resolved. Anyway, if we go this direction, we need to discuss the overall design. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com