Hi Aleksander,

> Don't you think that this is an arguable design decision? Basically
> all we know about the underlying TableAM is that it stores tuples
> _somehow_ and that tuples have TIDs [1]. That's it. We don't know if
> it even has any sort of pages, whether they are fixed in size or not,
> whether it uses shared buffers, etc. It may not even require TOAST.
> (Not to mention the fact that when you have N TOAST implementations
> and M TableAM implementations now you have to run N x M compatibility
> tests. And this doesn't account for different versions of Ns and Ms,
> different platforms and different versions of PostgreSQL.)
> I believe the proposed approach is architecturally broken from the
> beginning.

The existing TOAST mechanism works, but for certain types of data it performs very poorly, and, let's face it, it has strict limitations that restrict the overall capabilities of the DBMS. TOAST was designed when today's data volumes were not common: tables with hundreds of billions of rows, with sizes measured in hundreds of gigabytes and even terabytes. Still, TOAST itself is a good solution to the problem of storing oversized attributes. Despite its limitations it would be unwise to just throw it away; the better way is to bring it up to date by revising it, removing the most painful limitations, and allowing different (custom) TOAST strategies for special cases.

The main idea of Pluggable TOAST is to extend TOAST capabilities by providing a common API that allows different strategies to be used uniformly to TOAST different data. By "TOAST" I mean that the data is stored externally to the source table, somewhere only its Toaster knows where and how: it may be regular Heap tables, Heap tables with a different table structure, tables of some other AM, files outside of the database, or even files on different storage systems. Pluggable TOAST allows advanced compression methods and complex operations on externally stored data, such as searching without fully de-TOASTing the data.

Also, the existing TOAST is a part of the Heap AM and is restricted to Heap only. To make it extensible we have to separate TOAST from the Heap AM. The default TOAST in Pluggable TOAST still uses Heap, but Heap knows nothing about TOAST. This fits well with OOP paradigms.

> It looks like the idea should be actually turned inside out. I.e. what
> would be nice to have is some sort of _framework_ that helps TableAM
> authors to implement TOAST (alternatively, the rest of the TableAM
> except for TOAST) if the TableAM is similar to the default one. In
> other words the idea is not to implement alternative TOASTers that
> will work with all possible TableAMs but rather to simplify the task
> of implementing an alternative TableAM which is similar to the default
> one except for TOAST. These TableAMs should reuse as much common code
> as possible except for the parts where they differ.

To implement different TOAST strategies you must have an API to plug them in; otherwise you would have to change the core for each strategy. Once the TOAST API is merged into the core, custom TOAST strategies can be plugged in just by adding contrib modules.

I have to point out that different TOAST strategies do not have to store data with other TAMs. They could store the data in Heap, but use knowledge of the internal data structure of the workflow to store it in a more optimal way: for example, JSON that is compressed and decompressed quickly and partially, or lots of large chunks of binary data stored in the database (as you know, large objects are not of much help with this), and so on.

Implementing another Table AM just to implement another TOAST strategy seems too much: the TAM API is very heavy and complex, and you would have to add it as a contrib. Lots of different TAMs would cause many more problems than lots of Toasters, because such a solution results in data incompatibility between installations with different TAMs, and a minor change in a custom TAM contrib could lead to losing all data stored with that TAM. With a custom TOAST you could (in the worst case) lose just the TOASTed data and nothing else.

We have lots of requests from clients and tickets related to TOAST limitations and to extending Postgres this way; this growing need is what made us develop Pluggable TOAST.
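To make the "common API" point a bit more concrete, here is a rough standalone sketch. It is not the API from the patch and it does not use any PostgreSQL headers; every name in it (Toaster, ToastRef, plain_toaster, rle_toaster, roundtrip_ok) is hypothetical and only illustrates the shape of the idea: the caller works through a table of callbacks and never learns where or how the external data is kept.

```c
/* Hypothetical sketch only: none of these names come from the patch. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Toaster Toaster;

/* Opaque reference kept in the main table in place of the large value. */
typedef struct ToastRef {
    const Toaster *toaster;  /* strategy that owns the external data */
    void          *external; /* storage known only to that toaster */
    size_t         raw_len;  /* length of the original value */
} ToastRef;

/* The common API: the caller uses only these callbacks. */
struct Toaster {
    const char *name;
    ToastRef (*toast)(const Toaster *self, const char *data, size_t len);
    char    *(*detoast)(const ToastRef *ref); /* caller frees the result */
    void     (*drop)(ToastRef *ref);
};

/* Strategy 1: keep a verbatim copy of the value. */
static ToastRef plain_toast(const Toaster *self, const char *data, size_t len)
{
    ToastRef ref = { self, malloc(len ? len : 1), len };
    assert(ref.external != NULL);
    memcpy(ref.external, data, len);
    return ref;
}

static char *plain_detoast(const ToastRef *ref)
{
    char *out = malloc(ref->raw_len + 1);
    assert(out != NULL);
    memcpy(out, ref->external, ref->raw_len);
    out[ref->raw_len] = '\0';
    return out;
}

/* Strategy 2: run-length encoding as (count, byte) pairs. */
typedef struct RleBuf { size_t used; unsigned char bytes[]; } RleBuf;

static ToastRef rle_toast(const Toaster *self, const char *data, size_t len)
{
    /* Worst case: every input byte becomes one two-byte pair. */
    RleBuf *buf = malloc(sizeof(RleBuf) + 2 * len);
    assert(buf != NULL);
    buf->used = 0;
    for (size_t i = 0; i < len; ) {
        size_t run = 1;
        while (i + run < len && data[i + run] == data[i] && run < 255)
            run++;
        buf->bytes[buf->used++] = (unsigned char) run;
        buf->bytes[buf->used++] = (unsigned char) data[i];
        i += run;
    }
    ToastRef ref = { self, buf, len };
    return ref;
}

static char *rle_detoast(const ToastRef *ref)
{
    const RleBuf *buf = ref->external;
    char *out = malloc(ref->raw_len + 1);
    size_t pos = 0;
    assert(out != NULL);
    for (size_t i = 0; i < buf->used; i += 2) {
        memset(out + pos, buf->bytes[i + 1], buf->bytes[i]);
        pos += buf->bytes[i];
    }
    assert(pos == ref->raw_len);
    out[pos] = '\0';
    return out;
}

/* Both strategies keep their data in a single malloc'd block. */
static void generic_drop(ToastRef *ref)
{
    free(ref->external);
    ref->external = NULL;
}

static const Toaster plain_toaster = { "plain", plain_toast, plain_detoast, generic_drop };
static const Toaster rle_toaster   = { "rle",   rle_toast,   rle_detoast,   generic_drop };

/* Caller-side helper: round-trips a value without knowing the strategy. */
static int roundtrip_ok(const Toaster *t, const char *value)
{
    size_t len = strlen(value);
    ToastRef ref = t->toast(t, value, len);
    char *back = ref.toaster->detoast(&ref);
    int ok = (strlen(back) == len && memcmp(back, value, len) == 0);
    free(back);
    ref.toaster->drop(&ref);
    return ok;
}
```

In this sketch roundtrip_ok() plays the role of the core: it can be handed either &plain_toaster or &rle_toaster and the calling code stays the same. A new strategy is just another filled-in struct, which is the property a pluggable API is meant to give.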

On Sun, Oct 23, 2022 at 12:38 PM Aleksander Alekseev <aleksan...@timescale.com> wrote:

> Hi Nikita,
>
> > Pluggable TOAST API was designed with storage flexibility in mind, and Custom TOAST mechanics is
> > free to use any storage methods
>
> Don't you think that this is an arguable design decision? Basically
> all we know about the underlying TableAM is that it stores tuples
> _somehow_ and that tuples have TIDs [1]. That's it. We don't know if
> it even has any sort of pages, whether they are fixed in size or not,
> whether it uses shared buffers, etc. It may not even require TOAST.
> (Not to mention the fact that when you have N TOAST implementations
> and M TableAM implementations now you have to run N x M compatibility
> tests. And this doesn't account for different versions of Ns and Ms,
> different platforms and different versions of PostgreSQL.)
>
> I believe the proposed approach is architecturally broken from the
> beginning.
>
> It looks like the idea should be actually turned inside out. I.e. what
> would be nice to have is some sort of _framework_ that helps TableAM
> authors to implement TOAST (alternatively, the rest of the TableAM
> except for TOAST) if the TableAM is similar to the default one. In
> other words the idea is not to implement alternative TOASTers that
> will work with all possible TableAMs but rather to simplify the task
> of implementing an alternative TableAM which is similar to the default
> one except for TOAST. These TableAMs should reuse as much common code
> as possible except for the parts where they differ.
>
> Does it make sense?
>
> Sorry, I realize this will probably imply a complete rewrite of the
> patch. This is the reason why one should start proposing changes from
> gathering the requirements, writing an RFC and run it through several
> rounds of discussion.
>
> [1]: https://www.postgresql.org/docs/current/tableam.html
>
> --
> Best regards,
> Aleksander Alekseev

--
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/