Hi,

In <CAD21AoBGRFStdVbHUcxL0QB8wn92J3Sn-6x=rhssmuheprh...@mail.gmail.com>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on 
Fri, 2 May 2025 21:38:32 -0700,
  Masahiko Sawada <sawada.m...@gmail.com> wrote:

>> How about requiring schema for all custom formats?
>>
>> Valid:
>>
>>   COPY ... TO ... (FORMAT 'text');
>>   COPY ... TO ... (FORMAT 'my_schema.jsonlines');
>>
>> Invalid:
>>
>>   COPY ... TO ... (FORMAT 'jsonlines'); -- no schema
>>   COPY ... TO ... (FORMAT 'pg_catalog.text'); -- needless schema
>>
>> If we require "schema" for all custom formats, we don't need
>> to depend on search_path.
> 
> I'm concerned that users cannot use the same format name in the FORMAT
> option depending on which schema the handler function is created.

I'm not sure whether that's a problem. If users want to
use the same format name, they can install the handler
function in the same schema.
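
To make the proposed rule concrete, here is a rough sketch of the
name check. It is not PostgreSQL code: copy_format_name_is_valid()
and is_builtin_format() are hypothetical names, and the sketch
ignores quoted identifiers and case folding. It assumes that only
the built-in names qualified with pg_catalog are the "needless
schema" case.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The built-in COPY formats. */
static const char *const builtin_formats[] = {"text", "csv", "binary"};

/* Hypothetical helper: is "name" one of the built-in formats? */
static bool
is_builtin_format(const char *name)
{
    size_t n = sizeof(builtin_formats) / sizeof(builtin_formats[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(name, builtin_formats[i]) == 0)
            return true;
    }
    return false;
}

/*
 * Hypothetical check for the proposed rule:
 *   - unqualified names must be built-in formats
 *   - custom formats must be written as "schema.name"
 *   - "pg_catalog.<built-in>" is rejected as a needless schema
 */
static bool
copy_format_name_is_valid(const char *name)
{
    static const char catalog_prefix[] = "pg_catalog.";
    const char *dot;

    assert(name != NULL);

    dot = strchr(name, '.');
    if (dot == NULL)
        return is_builtin_format(name);

    /* Both the schema part and the format part must be non-empty. */
    if (dot == name || dot[1] == '\0')
        return false;

    if (strncmp(name, catalog_prefix, strlen(catalog_prefix)) == 0 &&
        is_builtin_format(dot + 1))
        return false;

    return true;
}
```

With this check, no search_path lookup is involved: the format
string alone says whether a handler lookup is needed and in which
schema.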

>> Why do we need to assign a unique ID? For performance? For
>> RegisterCustomCopyFormatOption()?
> 
> I think it's required for monitoring purposes for example. For
> instance, we can set the format ID in the progress information and the
> progress view can fetch the format name by the ID so that users can
> see what format is being used in the COPY command.

How about setting the format name instead of the format ID
in the progress information?

> I think we can skip the custom option patch for the first
> implementation but still need to discuss how we will be able to
> implement it to understand the big picture of this feature. Otherwise
> we could end up going the wrong direction.

I don't think we need to discuss it in depth now because we
have many options with this approach. We can call C
functions in _PG_init(). I don't think this feature will
block this approach.

>> (BTW, I think that it's not a good API because we want COPY
>> FROM only options and COPY TO only options something like
>> "compression level".)
> 
> Why does this matter in terms of API? I think that even with this API
> we can pass is_from to the option handler function so that it
> validates the option based on it.

If we choose that API, each custom format developer needs to
handle the case in their handler function. For example, if we
pass PostgreSQL the information that an option is only for
TO, ProcessCopyOptions(), rather than each handler function,
can handle it.
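
A minimal sketch of that alternative, under my own assumptions: the
COPY_OPTION_* flags, register_custom_copy_option() and
custom_copy_option_is_allowed() are hypothetical names and do not
exist in PostgreSQL or in the patch. The point is only that the
direction is declared once at registration, and one central check
(standing in for ProcessCopyOptions()) rejects an option used in
the wrong direction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags: which COPY directions accept an option. */
#define COPY_OPTION_FROM 0x01
#define COPY_OPTION_TO   0x02

typedef struct CustomCopyOption
{
    const char *name;       /* option name as written in COPY (...) */
    int         directions; /* bitmask of COPY_OPTION_* */
} CustomCopyOption;

#define MAX_CUSTOM_COPY_OPTIONS 16

static CustomCopyOption custom_options[MAX_CUSTOM_COPY_OPTIONS];
static size_t n_custom_options = 0;

/*
 * Hypothetical registration function, meant to be called from an
 * extension's initialization. Returns false if the table is full.
 */
static bool
register_custom_copy_option(const char *name, int directions)
{
    assert(name != NULL);

    if (n_custom_options >= MAX_CUSTOM_COPY_OPTIONS)
        return false;

    custom_options[n_custom_options].name = name;
    custom_options[n_custom_options].directions = directions;
    n_custom_options++;
    return true;
}

/*
 * Central check: is the option registered for this direction?
 * Unknown options are rejected too.
 */
static bool
custom_copy_option_is_allowed(const char *name, bool is_from)
{
    int wanted = is_from ? COPY_OPTION_FROM : COPY_OPTION_TO;

    for (size_t i = 0; i < n_custom_options; i++)
    {
        if (strcmp(custom_options[i].name, name) == 0)
            return (custom_options[i].directions & wanted) != 0;
    }
    return false;
}
```

A format that registers "compression_level" with COPY_OPTION_TO
then needs no is_from branch in its own handler for that option.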

Anyway, I think that we don't need to discuss this deeply
for now.

>> What contributes to the "flexibility"? Developers can call
>> multiple Register* functions in _PG_Init(), right?
> 
> I think that with a tablesample-like approach we need to do everything
> based on one handler function and callbacks returned from it whereas
> there is no such limitation with C API style.

Thanks for clarifying it. It seems that my understanding is
correct.

I hope that this flexibility is actually needed and that it
doesn't introduce too much complexity.


Thanks,
-- 
kou

