On Fri, Apr 30, 2021 at 5:22 PM Peter Geoghegan <p...@bowt.ie> wrote:
> I strongly suspect that index-organized tables (or indirect indexes,
> or anything else that assumes that TID-like identifiers map directly
> to logical rows as opposed to physical versions) are going to break
> too many assumptions to ever be tractable. Assuming I have that right,
> it would advance the discussion if we could all agree on that being a
> non-goal for the tableam interface in general.

I *emphatically* disagree with the idea of ruling such things out
categorically. This is just as naive as the TODO's statement that we
do not want "All backends running as threads in a single process".
Does anyone really believe that we don't want that any more? I
believed it 10 years ago, but not any more. It's costing us very
substantially, not only in that it makes parallel query more
complicated and fragile, but more importantly in that we can't scale
up to connection counts that other databases can handle because we use
up too many operating system resources. Supporting threading in
PostgreSQL isn't a project that someone will pull off over a long
weekend and it's not something that has to be done tomorrow, but it's
pretty clearly the future.

So here. The complexity of getting a table AM that does anything
non-trivial working is formidable, and I don't expect it to happen
right away. Picking one that is essentially block-based and can use
48-bit TIDs is very likely the right initial target because that's the
closest to what we have now, and there's no sense attacking the
hardest variant of the problem first. However, as with the
threads-vs-processes example, I strongly suspect that having only one
table AM is leaving vast amounts of performance on the table. To say
that we're never going to pursue the parts of that space that require
a different kind of tuple identifier is to permanently write off tons
of ideas that have produced promising results in other systems. Let's
not do that.

--
Robert Haas
EDB: http://www.enterprisedb.com