On Wed, May 07, 2025 at 11:40:14AM +0300, Nikita Malakhov wrote: > Michael, what do you think of this approach (extending varatt_external) > vs extending varatt itself by new tag and structure?
I'm reserved on that. What I'm afraid here is more complications in the backend code because we have quite a few places where we do varatt lookups to decide what should be happen, like in PLs, so this brings complications of its own for something that could be isolated behind a varattrib_4b, where detoasting is under control. The patch posted at [1] means that the custom area could be anything, how do you make sure that the backend is able to understand what could be anything? I guess that this also depends on the pluggable toast part, of course, but I've not studied enough what's been proposed to have a hard opinion. If you have very specific pointers, please feel free. > The second approach > allows more flexibility, independence of existing structure without > modifying > varatt_4b and is extensible further. I mentioned it above (extending > the TOAST pointer), and it could be implemented more easily and in a less > confusing way. If you mean [0], putting an "extended" flag into ToastCompressionId which is something used now by the internals of TOAST for a compression method, with ToastCompressionId being limited to have up to 4 elements in its enum, does not feel right. In concept, once extended, this may point to something more than a compression method, as there's also metadata around the compression method added. At least that's what I'm understanding as a possible scenario from all the proposals in this area. There's some overlap with common/compression.h, for example, even if we are never going to care about gzip in this case, just saying that this has been buzzing me in the core code for some time. One first thing I'd try to do here is to untangle this situation, by allowing ToastCompressionId to have more extensibility so as we could use it to track more compression methods, or just perhaps remove it entirely in a smart way by keeping the information related to the extra byte and the two bits of va_tcinfo for the compression method isolated in varatt.h, shaping the code so as adding more compression methods in the extra byte put after va_tcinfo would be easier once the surroundings of varattrib_4b are extended. Without an agreement about how to use the last bit we have, there's perhaps little point in aiming for any of that now. FWIW, extending the area around varattrib_4b feels a natural thing to do here, and it does not have to overlap with the possibilities around the varatts. > I'm +1 on storing dictionary somewhere around actual data (not necessary > in the data storage area itself) but strongly against new catalog table > with dictionaries - it involves a lot of side effects, including locks > while working > with this table resulting in performance degradation, and so on. Just wondering. Have you looked at the potential overhead of doing computation and decomputation of a dictionnary? zstd mentions in its docs that these can easily cause a lot of overhead, hence handling this stuff without some kind of caching is going to be costly if performing a lot of chunk decompressions. It's something that could be decided later on, of course. If this area of the code is made pluggable, then it's up to an extension to just do it. [0]: https://www.postgresql.org/message-id/flat/CAN-LCVMq2X%3Dfhx7KLxfeDyb3P%2BBXuCkHC0g%3D9GF%2BJD4izfVa0Q%40mail.gmail.com [1]: https://www.postgresql.org/message-id/CAN-LCVNxbnpHh4PVUUc9g6dPibE8wZALiLtxcs3TjfivxDkCkA%40mail.gmail.com -- Michael
signature.asc
Description: PGP signature