On Wed, Mar 20, 2024 at 3:48 PM John Naylor <johncnaylo...@gmail.com> wrote: > > On Thu, Mar 14, 2024 at 12:06 PM Masahiko Sawada <sawada.m...@gmail.com> > wrote: > > > > On Thu, Mar 14, 2024 at 1:29 PM John Naylor <johncnaylo...@gmail.com> wrote: > > > Locally (not CI), we should try big inputs to make sure we can > > > actually go up to many GB -- it's easier and faster this way than > > > having vacuum give us a large data set. > > > > I'll do these tests. > > I just remembered this -- did any of this kind of testing happen? I > can do it as well.
I forgot to report the results. Yes, I did some tests where I inserted many TIDs to make the tidstore use several GB memory. I did two cases: 1. insert 100M blocks of TIDs with an offset of 100. 2. insert 10M blocks of TIDs with an offset of 2048. The tidstore used about 4.8GB and 5.2GB, respectively, and all lookup and iteration results were expected. > > > Thank you. I've incorporated all the comments above. I've attached the > > latest patches, and am going to push them (one by one) after > > self-review again. > > One more cosmetic thing in 0001 that caught my eye: > > diff --git a/src/backend/access/common/Makefile > b/src/backend/access/common/Makefile > index b9aff0ccfd..67b8cc6108 100644 > --- a/src/backend/access/common/Makefile > +++ b/src/backend/access/common/Makefile > @@ -27,6 +27,7 @@ OBJS = \ > syncscan.o \ > toast_compression.o \ > toast_internals.o \ > + tidstore.o \ > tupconvert.o \ > tupdesc.o > > diff --git a/src/backend/access/common/meson.build > b/src/backend/access/common/meson.build > index 725041a4ce..a02397855e 100644 > --- a/src/backend/access/common/meson.build > +++ b/src/backend/access/common/meson.build > @@ -15,6 +15,7 @@ backend_sources += files( > 'syncscan.c', > 'toast_compression.c', > 'toast_internals.c', > + 'tidstore.c', > 'tupconvert.c', > 'tupdesc.c', > ) > > These aren't in alphabetical order. Good catch. I'll fix them before the push. While reviewing the codes again, the following two things caught my eyes: in check_set_block_offset() function, we don't take a lock on the tidstore while checking all possible TIDs. I'll add TidStoreLockShare() and TidStoreUnlock() as follows: + TidStoreLockShare(tidstore); if (TidStoreIsMember(tidstore, &tid)) ItemPointerSet(&items.lookup_tids[num_lookup_tids++], blkno, offset); + TidStoreUnlock(tidstore); --- Regarding TidStoreMemoryUsage(), IIUC the caller doesn't need to take a lock on the shared tidstore since dsa_get_total_size() (called by RT_MEMORY_USAGE()) does appropriate locking. I think we can mention it in the comment as follows: -/* Return the memory usage of TidStore */ +/* + * Return the memory usage of TidStore. + * + * In shared TidStore cases, since shared_ts_memory_usage() does appropriate + * locking, the caller doesn't need to take a lock. + */ What do you think? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com