Hi all,

This is a continuation of the above thread...

>> > 4. In order to use WAL-logging each page must start with a standard 24
>> > byte PageHeaderData even if it is needless for storage itself. Not a
>> > big deal though. Another (actually documented) WAL-related limitation
>> > is that only generic WAL can be used within an extension. So unless
>> > inserts are made in bulk it's going to require a lot of disk space to
>> > accommodate logs and wide bandwidth for replication.
>>
>> Not sure what to suggest. Either you should ignore this problem, or
>> you should fix it.

I am working on an environment similar to the above extension
(pg_cryogen, which experiments with the pluggable storage APIs), but I
don't have much knowledge of PostgreSQL's logical replication. Please
suggest some approaches to support logical replication for a table with
a custom access method that writes generic WAL records.

On Wed, 17 Aug 2022 at 19:04, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:

> On Fri, Oct 18, 2019 at 03:25:05AM -0700, Andres Freund wrote:
> >Hi,
> >
> >On 2019-10-17 12:47:47 -0300, Alvaro Herrera wrote:
> >> On 2019-Oct-10, Ildar Musin wrote:
> >>
> >> > 1. Unlike FDW API, in pluggable storage API there are no routines like
> >> > "begin modify table" and "end modify table" and there is no shared
> >> > state between insert/update/delete calls.
> >>
> >> Hmm. I think adding a begin/end to modifytable is a reasonable thing to
> >> do (it'd be a no-op for heap and zheap I guess).
> >
> >I'm fairly strongly against that. Adding two additional "virtual"
> >function calls for something that's rarely going to be used, seems like
> >adding too much overhead to me.
> >
>
> That seems a bit strange to me. Sure - if there's an alternative way to
> achieve the desired behavior (clear way to finalize writes etc.), then
> cool, let's do that. But forcing people to use inconvenient workarounds
> seems like a bad thing to me - having a convenient and clear API is
> quite valuable, IMHO.
>
> Let's see if this actually has a measurable overhead first.
>
> >
> >> > 2. It looks like I cannot implement custom storage options. E.g. for
> >> > compressed storage it makes sense to implement different compression
> >> > methods (lz4, zstd etc.) and corresponding options (like compression
> >> > level). But as I can see storage options (like fillfactor etc) are
> >> > hardcoded and are not extensible. Possible solution is to use GUCs
> >> > which would work but is not extremely convenient.
> >>
> >> Yeah, the reloptions module is undergoing some changes.
> >> I expect that there will be a way to extend reloptions from an
> >> extension, at the end of that set of patches.
> >
> >Cool.
> >
>
> Yep.
>
> >
> >> > 3. A bit surprising limitation that in order to use bitmap scan the
> >> > maximum number of tuples per page must not exceed 291 due to
> >> > MAX_TUPLES_PER_PAGE macro in tidbitmap.c which is calculated based on
> >> > 8kb page size. In case of 1mb page this restriction feels really
> >> > limiting.
> >>
> >> I suppose this is a hardcoded limit that needs to be fixed by patching
> >> core as we make table AM more pervasive.
> >
> >That's not unproblematic - a dynamic limit would make a number of
> >computations more expensive, and we already spend plenty of CPU cycles
> >building the tid bitmap. And we'd waste plenty of memory just having all
> >that space for the worst case. ISTM that we "just" need to replace the
> >TID bitmap with some tree like structure.
> >
>
> I think the zedstore has roughly the same problem, and Heikki mentioned
> some possible solutions to dealing with it in his pgconfeu talk (and it
> was discussed in the zedstore thread, I think).
>
> >
> >> > 4. In order to use WAL-logging each page must start with a standard 24
> >> > byte PageHeaderData even if it is needless for storage itself. Not a
> >> > big deal though. Another (actually documented) WAL-related limitation
> >> > is that only generic WAL can be used within an extension. So unless
> >> > inserts are made in bulk it's going to require a lot of disk space to
> >> > accommodate logs and wide bandwidth for replication.
> >>
> >> Not sure what to suggest. Either you should ignore this problem, or
> >> you should fix it.
> >
> >I think if it becomes a problem you should ask for an rmgr ID to use for
> >your extension, which we encode and then allow to set the relevant
> >rmgr callbacks for that rmgr id at startup.
> >But you should obviously first develop the WAL logging etc, and make
> >sure it's beneficial over generic WAL logging for your case.
> >
>
> AFAIK compressed/columnar engines generally implement two types of
> storage - write-optimized store (WOS) and read-optimized store (ROS),
> where the WOS is mostly just an uncompressed append-only buffer, and ROS
> is compressed etc. ISTM the WOS would benefit from a more elaborate WAL
> logging, but ROS should be mostly fine with the generic WAL logging.
>
> But yeah, we should test and measure how beneficial that actually is.
>
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services