On Tue, Mar 12, 2024 at 7:34 PM John Naylor <johncnaylo...@gmail.com> wrote: > > On Mon, Mar 11, 2024 at 3:13 PM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > > > On Mon, Mar 11, 2024 at 12:20 PM John Naylor <johncnaylo...@gmail.com> > > wrote: > > > > > > On Thu, Mar 7, 2024 at 10:35 PM Masahiko Sawada <sawada.m...@gmail.com> > > > wrote: > > > > + ts->context = CurrentMemoryContext; > > > > > > As far as I can tell, this member is never accessed again -- am I > > > missing something? > > > > You're right. It was used to re-create the tidstore in the same > > context again while resetting it, but we no longer support the reset > > API. Considering it again, would it be better to allocate the iterator > > struct in the same context as we store the tidstore struct? > > That makes sense. > > > > + /* DSA for tidstore will be detached at the end of session */ > > > > > > No other test module pins the mapping, but that doesn't necessarily > > > mean it's wrong. Is there some advantage over explicitly detaching? > > > > One small benefit of not explicitly detaching dsa_area in > > tidstore_destroy() would be simplicity; IIUC if we want to do that, we > > need to remember the dsa_area using (for example) a static variable, > > and free it if it's non-NULL. I've implemented this idea in the > > attached patch. > > Okay, I don't have a strong preference at this point.
I'd keep the update on that. > > > > +-- Add tids in random order. > > > > > > I don't see any randomization here. I do remember adding row_number to > > > remove whitespace in the output, but I don't remember a random order. > > > On that subject, the row_number was an easy trick to avoid extra > > > whitespace, but maybe we should just teach the setting function to > > > return blocknumber rather than null? > > > > Good idea, fixed. > > + test_set_block_offsets > +------------------------ > + 2147483647 > + 0 > + 4294967294 > + 1 > + 4294967295 > > Hmm, was the earlier comment about randomness referring to this? I'm > not sure what other regression tests do in these cases, or how > relibale this is. If this is a problem we could simply insert this > result into a temp table so it's not output. I didn't address the comment about randomness. I think that we will have both random TIDs tests and fixed TIDs tests in test_tidstore as we discussed, and probably we can do both tests with similar steps; insert TIDs into both a temp table and tidstore and check if the tidstore returned the results as expected by comparing the results to the temp table. Probably we can have a common pl/pgsql function that checks that and raises a WARNING or an ERROR. Given that this is very similar to what we did in test_radixtree, why do we really want to implement it using a pl/pgsql function? When we discussed it before, I found the current way makes sense. But given that we're adding more tests and will add more tests in the future, doing the tests in C will be more maintainable and faster. Also, I think we can do the debug-build array stuff in the test_tidstore code instead. > > > > +Datum > > > +tidstore_create(PG_FUNCTION_ARGS) > > > +{ > > > ... > > > + tidstore = TidStoreCreate(max_bytes, dsa); > > > > > > +Datum > > > +tidstore_set_block_offsets(PG_FUNCTION_ARGS) > > > +{ > > > .... > > > + TidStoreSetBlockOffsets(tidstore, blkno, offs, noffs); > > > > > > These names are too similar. Maybe the test module should do > > > s/tidstore_/test_/ or similar. > > > > Agreed. > > Mostly okay, although a couple look a bit generic now. I'll leave it > up to you if you want to tweak things. > > > > In general, the .sql file is still very hard-coded. Functions are > > > created that contain a VALUES statement. Maybe it's okay for now, but > > > wanted to mention it. Ideally, we'd have some randomized tests, > > > without having to display it. That could be in addition to (not > > > replacing) the small tests we have that display input. (see below) > > > > > > > Agreed to add randomized tests in addition to the existing tests. > > I'll try something tomorrow. > > > Sounds a good idea. In fact, if there are some bugs in tidstore, it's > > likely that even initdb would fail in practice. However, it's a very > > good idea that we can test the tidstore anyway with such a check > > without a debug-build array. > > > > Or as another idea, I wonder if we could keep the debug-build array in > > some form. For example, we use the array with the particular build > > flag and set a BF animal for that. That way, we can test the tidstore > > in more real cases. > > I think the purpose of a debug flag is to help developers catch > mistakes. I don't think it's quite useful enough for that. For one, it > has the same 1GB limitation as vacuum's current array. For another, > it'd be a terrible way to debug moving tidbitmap.c from its hash table > to use TID store -- AND/OR operations and lossy pages are pretty much > undoable with a copy of vacuum's array. Valid points. As I mentioned above, if we implement the test cases in C, we can use the debug-build array in the test code. And we won't use it in AND/OR operations tests in the future. > > > In the latest (v69) patch: > > > > - squashed v68-0005 and v68-0006 patches. > > - removed most of the changes in v68-0007 patch. > > - addressed above review comments in v69-0002 patch. > > - v69-0003, 0004, and 0005 are miscellaneous updates. > > > > As for renaming TidStore to TIDStore, I dropped the patch for now > > since it seems we're using "Tid" in some function names and variable > > names. If we want to update it, we can do that later. > > I think we're not consistent across the codebase, and it's fine to > drop that patch. > > v70-0008: > > @@ -489,7 +489,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs) > /* > * Free the current tidstore and return allocated DSA segments to the > * operating system. Then we recreate the tidstore with the same max_bytes > - * limitation. > + * limitation we just used. > > Nowadays, max_bytes is now more like a hint for tidstore, and not a > limitation, right? Vacuum has the limitation. Right. > Maybe instead of "with", > we should say "passing the same limitation". Will fix. > > I wonder how "di_info" would look as "dead_items_info". I don't feel > too strongly about it, though. Agreed. > > I'm going to try additional regression tests, as mentioned, and try a > couple benchmarks. It should be only a couple more days. Thank you! > One thing that occurred to me: The radix tree regression tests only > compile and run the local memory case. The tidstore commit would be > the first time the buildfarm has seen the shared memory case, so we > should look out for possible build failures of the same sort we saw > with the the radix tree tests. I see you've already removed the > problematic link_with command -- that's the kind of thing to > double-check for. Good point, agreed. I'll double-check it again. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com