On Wed, Apr 17, 2019 at 2:04 AM Andres Freund <and...@anarazel.de> wrote: > > Hi, > > I'm somewhat unhappy in how much the no-fsm-for-small-rels exposed > complexity that looks like it should be purely in freespacemap.c to > callers.
I took a stab at untying the free space code from any knowledge about heaps, and made it the responsibility of each access method that calls these free space routines to specify their own threshold (possibly zero). The attached applies on top of HEAD, because it's independent of the relcache logic being discussed. If I can get that working, the I'll rebase it on top of this API, if you like. > extern Size GetRecordedFreeSpace(Relation rel, BlockNumber heapBlk); > -extern BlockNumber GetPageWithFreeSpace(Relation rel, Size spaceNeeded); > +extern BlockNumber GetPageWithFreeSpace(Relation rel, Size spaceNeeded, > + bool check_fsm_only); > > So now freespace.c has an argument that says we should only check the > fsm. That's confusing. And it's not explained to callers what that > argument means, and when it should be set. I split this up into 2 routines: GetPageWithFreeSpace() is now exactly like it is in v11, and GetAlternatePage() is available for access methods that can use it. > +/* Only create the FSM if the heap has greater than this many blocks */ > +#define HEAP_FSM_CREATION_THRESHOLD 4 > > Hm, this seems to be tying freespace.c closer to heap than I think is > great - think of new AMs like zheap, that also want to use it. This was a bit harder than expected. Because of the pg_upgrade optimization, it was impossible to put this symbol in hio.h or heapam.h, because they include things unsafe for frontend code. I decided to create heap_fe.h, which is a hack. Also, because they have freespace.c callers, I put other thresholds in src/backend/storage/freespace/indexfsm.c src/include/access/brin_pageops.h Putting the thresholds in 3 files with completely different purposes is a mess, and serves no example for future access methods, but I don't have a better idea. On the upside, untying free space from heap allowed me to remove most of the checks for (rel->rd_rel->relkind == RELKIND_RELATION || rel->rd_rel->relkind == RELKIND_TOASTVALUE) except for the one in pg_upgrade.c, which is again a bit of a hack, but not bad. > Hm, given realistic HEAP_FSM_CREATION_THRESHOLD, and the fact that we > really only need one bit per relation, it seems like map should really > be just a uint32 with one bit per page. Done. Regarding the idea upthread about using bytes to store ordinary freespace values, I think that's better for correctness, but also makes it more difficult to use different thresholds per access method. > +static bool > +fsm_allow_writes(Relation rel, BlockNumber heapblk, > + BlockNumber nblocks, BlockNumber *get_nblocks) > > + RelationOpenSmgr(rel); > + if (smgrexists(rel->rd_smgr, FSM_FORKNUM)) > + return true; > > Isn't this like really expensive? mdexists() closes the relations and > reopens it from scratch. Shouldn't we at the very least check > smgr_fsm_nblocks beforehand, so this is only done once? I did this in an earlier patch above -- do you have an opinion about that? I also removed the call to smgrnblocks(smgr, MAIN_FORKNUM) from XLogRecordPageWithFreeSpace() because I don't think it's actually needed. There's a window where a table could have 5 blocks, but trying to record space on, say, block 2 won't actually create the FSM on the standby. When block 5 fills up enough, then the xlog call will cause the FSM to be created. Not sure if this is best, but it saves another syscall, and this function is only called when freespace is less than 20%, IIRC. -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
untie-local-map-from-heap_v1.patch
Description: Binary data