Hi,

In <CAD21AoAY_h-9nuhs14e3cyO_A2rH7==zuq+nphkn9ggwyax...@mail.gmail.com>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 9 May 2025 21:29:23 -0700,
  Masahiko Sawada <sawada.m...@gmail.com> wrote:

>> > So the idea is that the backend process sets the format ID somewhere
>> > in st_progress_param, and then the progress view calls a SQL function,
>> > say pg_stat_get_copy_format_name(), with the format ID that returns
>> > the corresponding format name.
>>
>> Does it work when we use session_preload_libraries or the
>> LOAD command? If we have 2 sessions and both of them load
>> "jsonlines" COPY FORMAT extensions, what will be happened?
>>
>> For example:
>>
>> 1. Session 1: Register "jsonlines"
>> 2. Session 2: Register "jsonlines"
>>    (Should global format ID <-> format name mapping
>>    be updated?)
>> 3. Session 2: Close this session.
>>    Unregister "jsonlines".
>>    (Can we unregister COPY FORMAT extension?)
>>    (Should global format ID <-> format name mapping
>>    be updated?)
>> 4. Session 1: Close this session.
>>    Unregister "jsonlines".
>>    (Can we unregister COPY FORMAT extension?)
>>    (Should global format ID <-> format name mapping
>>    be updated?)
>
> I imagine that only for progress reporting purposes, I think session 1
> and 2 can have different format IDs for the same 'jsonlines' if they
> load it by LOAD command. They can advertise the format IDs on the
> shmem and we can also provide a SQL function for the progress view
> that can get the format name by the format ID.
>
> Considering the possibility that we might want to use the format ID
> also in the cumulative statistics, we might want to strictly provide
> the unique format ID for each custom format as the format IDs are
> serialized to the pgstat file. One possible way to implement it is
> that we manage the custom format IDs in a wiki page like we do for
> custom cumulative statistics and custom RMGR[1][2]. That is, a custom
> format extension registers the format name along with the format ID
> that is pre-registered in the wiki page or the format ID (e.g. 128)
> indicating under development. If either the format name or format ID
> conflict with an already registered custom format extension, the
> registration function raises an error. And we preallocate enough
> format IDs for built-in formats.
>
> As for unregistration, I think that even if we provide an
> unregisteration API, it ultimately depends on whether or not custom
> format extensions call it in _PG_fini().

Thanks for sharing your idea.

With the former ID issuing approach, it seems that we need
both a global format ID <-> name mapping and a per-session
list of registered format names. The custom COPY FORMAT
registration function rejects a format name that is already
registered, right? If we support both
shared_preload_libraries and session_preload_libraries/LOAD,
we will have custom formats with different lifetimes. That
may add complexity to the ID issuing approach.

With the latter static ID approach, how do we implement a
function that converts a format ID to a format name?
PostgreSQL itself doesn't know the ID <-> name mapping in
the Wiki page. It seems that each custom COPY FORMAT
implementation needs to register its name with PostgreSQL by
itself.


Thanks,
-- 
kou
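
As a rough illustration of the last point (a custom format registering its own name together with a static ID, so that the ID can later be resolved back to a name), here is a minimal standalone sketch. Every identifier in it is hypothetical and none of it is existing PostgreSQL API. It also assumes a small fixed ID range and a plain per-process array rather than shared memory, so it says nothing about how the mapping would be shared across sessions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not PostgreSQL code. */
#define COPY_FORMAT_ID_MAX 255	/* assumed upper bound for format IDs */

typedef enum
{
	COPY_FORMAT_REGISTER_OK,
	COPY_FORMAT_REGISTER_BAD_ID,
	COPY_FORMAT_REGISTER_ID_CONFLICT,
	COPY_FORMAT_REGISTER_NAME_CONFLICT
} CopyFormatRegisterResult;

/* Slot i holds the name registered for format ID i, or NULL if unused. */
static const char *copy_format_names[COPY_FORMAT_ID_MAX + 1];

/*
 * Register a format name under a statically chosen ID.
 * Rejects an out-of-range ID, an ID that is already taken,
 * and a name that is already registered under another ID.
 */
static CopyFormatRegisterResult
register_copy_format(int id, const char *name)
{
	assert(name != NULL);

	if (id < 0 || id > COPY_FORMAT_ID_MAX)
		return COPY_FORMAT_REGISTER_BAD_ID;
	if (copy_format_names[id] != NULL)
		return COPY_FORMAT_REGISTER_ID_CONFLICT;

	for (int i = 0; i <= COPY_FORMAT_ID_MAX; i++)
	{
		if (copy_format_names[i] != NULL &&
			strcmp(copy_format_names[i], name) == 0)
			return COPY_FORMAT_REGISTER_NAME_CONFLICT;
	}

	/* The caller must keep the name string alive (e.g. a literal). */
	copy_format_names[id] = name;
	return COPY_FORMAT_REGISTER_OK;
}

/* Resolve a format ID to its registered name; NULL if unknown. */
static const char *
copy_format_name_from_id(int id)
{
	if (id < 0 || id > COPY_FORMAT_ID_MAX)
		return NULL;
	return copy_format_names[id];
}
```

In this sketch the ID -> name lookup only works for formats that have registered themselves in the current process, which is exactly the open question for session_preload_libraries/LOAD: another backend evaluating the progress view would not see the name unless the mapping lived somewhere shared.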