On Tue, Jun 24, 2025 at 2:11 PM Sutou Kouhei <k...@clear-code.com> wrote:
>
> Hi,
>
> In <cad21aoa57owo6qytptxotcjdmcuj1tgl1ags95cvenolqvw...@mail.gmail.com>
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Tue, 24 Jun 2025 11:59:17 +0900,
>   Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> >> 1. This provides 2 registration APIs
> >>    (RegisterCopy{From,To}Routine(name, routine)) instead of
> >>    1 registration API (RegisterCopyFormat(name,
> >>    from_routine, to_routine)).
> >>
> >>    It's for simple implementation and easy to extend without
> >>    breaking APIs in the future. (And some formats may
> >>    provide only FROM routine or TO routine.)
> >>
> >>    Is this design acceptable?
> >
> > With the single registration API idea, we can register the custom
> > format routine that supports only FROM routine using the API like:
> >
> > RegisterCopyRoutine("new-format", NewFormatFromRoutine, NULL);
> >
> > Compared to this approach, what points do you think having separate
> > two registration APIs is preferable in terms of extendability and API
> > compatibility?
>
> It's natural to add more related APIs with this
> approach. The single registration API provides one feature
> by one operation. If we use the RegisterCopyRoutine() for
> FROM and TO formats API, it's not natural that we add more
> related APIs. In this case, some APIs may provide multiple
> features by one operation and other APIs may provide single
> feature by one operation. Developers may be confused with
> the API. For example, developers may think "what does mean
> NULL here?" or "can we use NULL here?" for
> "RegisterCopyRoutine("new-format", NewFormatFromRoutine,
> NULL)".

We can document it in the comment for the registration function.

>
>
> >                I think it would be rather confusing for example if
> > each COPY TO routine and COPY FROM routine is registered by different
> > extensions with the same format name.
>
> Hmm, I don't think so. Who is confused by the case? DBA?
> Users who use COPY? Why is it confused?
>
> Do you assume the case that the same name is used for
> different format? For example, "json" is used for JSON Lines
> for COPY FROM and and normal JSON for COPY TO by different
> extensions.

Suppose that extension-A implements only CopyToRoutine for the
custom-format-X with the format name 'myformat' and extension-B
implements only CopyFromRoutine for the custom-format-Y with the same
name, if users load both extension-A and extension-B, it seems to me
that extension-A registers the custom-format-X format as 'myformat'
only with CopyToRoutine, and extension-B overwrites the 'myformat'
registration by adding custom-format-Y's CopyFromRoutine. However, if
users register extension-C that implements both routines with the
format name 'myformat', they can register neither extension-A nor
extension-B, which seems to me that we don't allow overwriting the
registration in this case.

I think the core issue appears to be the internal management of custom
format entries but the current patch does enable registration
overwriting in the former case (extension-A and extension-B case).

>
> >>    FYI: RegisterCopy{From,To}Routine() uses the same logic
> >>    as RegisterExtensionExplainOption().
> >
> > I'm concerned that the patch has duplicated logics for the
> > registration of COPY FROM and COPY TO.
>
> We can implement a convenient routine that can be used for
> RegisterExtensionExplainOption() and
> RegisterCopy{From,To}Routine() if it's needed.

I meant there are duplicated codes in COPY FROM and COPY TO. For
instance, RegisterCopyFromRoutine() and RegisterCopyToRoutine() have
the same logic.

>
> >> 3. I want to register the built-in COPY {FROM,TO} routines
> >>    in the PostgreSQL initialization phase. Where should we
> >>    do it? In 0002, it's done in InitPostgres() but I'm not
> >>    sure whether it's a suitable location or not.
> >
> > InitPostgres() is not a correct function as it's a process
> > initialization function. Probably we don't necessarily need to
> > register the built-in formats in the same way as custom formats. A
> > simple solution would be to have separate arrays for built-in formats
> > and custom formats and have the GetCopy[To|From]Routine() search both
> > arrays (built-in array first).
>
> We had a discussion that we should dog-food APIs:
>
> https://www.postgresql.org/message-id/flat/CAKFQuwaCHhrS%2BRE4p_OO6d7WEskd9b86-2cYcvChNkrP%2B7PJ7A%40mail.gmail.com#e6d1cdd04dac53eafe34b784ac47b68b
>
> > We should (and usually do) dog-food APIs when reasonable
> > and this situation seems quite reasonable.
>
> In this case, we don't need to dog-food APIs, right?

Yes, I think so.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


Reply via email to