On Wed, Jun 18, 2025 at 12:59 PM Sutou Kouhei <k...@clear-code.com> wrote: > > Hi, > > In <afc5hmzhu5ncp...@paquier.xyz> > "Re: Make COPY format extendable: Extract COPY TO format implementations" > on Tue, 17 Jun 2025 09:38:54 +0900, > Michael Paquier <mich...@paquier.xyz> wrote: > > > On Tue, Jun 17, 2025 at 08:50:37AM +0900, Sutou Kouhei wrote: > >> OK. I'll implement the initial version with this > >> design. (Allocating IDs local not shared.) > > > > Sounds good to me. Thanks Sutou-san! > > I've attached the v41 patch set that uses the C API approach > with local (not shared) COPY routine management. > > 0001: This is same as 0001 in the v40 patch set. It just > cleans up CopySource and CopyDest enums. > 0002: This is the initial version of this approach.
Thank you for updating the patches! > Here are some discussion points: > > 1. This provides 2 registration APIs > (RegisterCopy{From,To}Routine(name, routine)) instead of > 1 registration API (RegisterCopyFormat(name, > from_routine, to_routine)). > > It's for simple implementation and easy to extend without > breaking APIs in the future. (And some formats may > provide only FROM routine or TO routine.) > > Is this design acceptable? With the single registration API idea, we can register the custom format routine that supports only FROM routine using the API like: RegisterCopyRoutine("new-format", NewFormatFromRoutine, NULL); Compared to this approach, what points do you think having separate two registration APIs is preferable in terms of extendability and API compatibility? I think it would be rather confusing for example if each COPY TO routine and COPY FROM routine is registered by different extensions with the same format name. > FYI: RegisterCopy{From,To}Routine() uses the same logic > as RegisterExtensionExplainOption(). I'm concerned that the patch has duplicated logics for the registration of COPY FROM and COPY TO. > > 2. This allocates IDs internally but doesn't provide APIs > that get them. Because it's not needed for now. > > We can provide GetExplainExtensionId() like API when we > need it. > > Is this design acceptable? +1 > > 3. I want to register the built-in COPY {FROM,TO} routines > in the PostgreSQL initialization phase. Where should we > do it? In 0002, it's done in InitPostgres() but I'm not > sure whether it's a suitable location or not. InitPostgres() is not a correct function as it's a process initialization function. Probably we don't necessarily need to register the built-in formats in the same way as custom formats. A simple solution would be to have separate arrays for built-in formats and custom formats and have the GetCopy[To|From]Routine() search both arrays (built-in array first). > 4. 0002 adds CopyFormatOptions::routine as union: > > @@ -87,9 +91,14 @@ typedef struct CopyFormatOptions > CopyLogVerbosityChoice log_verbosity; /* verbosity of logged > messages */ > int64 reject_limit; /* maximum tolerable number of > errors */ > List *convert_select; /* list of column names (can be NIL) */ > + union > + { > + const struct CopyFromRoutine *from; /* for COPY FROM */ > + const struct CopyToRoutine *to; /* for COPY TO */ > + } routine; /* routine to > process the specified format */ > } CopyFormatOptions; > > Because one of Copy{From,To}Routine is only needed at > once. Is this union usage strange in PostgreSQL? I think we can live with having two fields as there are other options that are used only in either COPY FROM or COPY TO. > > 5. 0002 adds InitializeCopy{From,To}Routines() and > GetCopy{From,To}Routine() that aren't used by COPY > {FROM,TO} routine implementations to copyapi.h. Should we > move them to other .h? If so, which .h should be used for > them? As I commented at 3, I think it's better to avoid dynamically registering the built-in formats. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com