Hi,

On 2021-07-08 20:53:32 -0700, Andres Freund wrote:
> On 2021-07-07 20:46:38 +0900, Masahiko Sawada wrote:
> > 1. Don't allocate more than 1GB. There was a discussion to eliminate
> > this limitation by using MemoryContextAllocHuge() but there were
> > concerns about point 2[1].
> >
> > 2. Allocate the whole memory space at once.
> >
> > 3. Slow lookup performance (O(logN)).
> >
> > I’ve done some experiments in this area and would like to share the
> > results and discuss ideas.
>
> Yea, this is a serious issue.
>
>
> 3) could possibly be addressed to a decent degree without changing the
> fundamental datastructure too much. There's some sizable and trivial
> wins by just changing vac_cmp_itemptr() to compare int64s and by using
> an open coded bsearch().

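To make the quoted suggestion concrete, here is a minimal standalone sketch of
what "compare int64s" plus an open coded bsearch could look like. It is not the
code that was benchmarked. SketchTid, sketch_tid_encode() and
sketch_tid_is_dead() are made-up names for illustration, and the packing
(block number shifted left by 16 bits, offset in the low bits) is assumed to
mirror what itemptr_encode() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ItemPointerData: block number plus line offset. */
typedef struct SketchTid
{
	uint32_t	block;
	uint16_t	offset;
} SketchTid;

/*
 * Pack a TID into one 64-bit key: block in the upper bits, offset in the
 * low 16 bits, so that integer order equals (block, offset) order.
 */
static inline int64_t
sketch_tid_encode(SketchTid tid)
{
	return (int64_t) (((uint64_t) tid.block << 16) | tid.offset);
}

/*
 * Open-coded binary search over an ascending array of encoded TIDs.
 * There is no comparator callback, only plain integer comparisons.
 */
static bool
sketch_tid_is_dead(const int64_t *sorted, size_t n, SketchTid tid)
{
	int64_t		key = sketch_tid_encode(tid);
	size_t		lo = 0;
	size_t		hi = n;		/* search the half-open range [lo, hi) */

	assert(n == 0 || sorted != NULL);

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (sorted[mid] < key)
			lo = mid + 1;
		else if (sorted[mid] > key)
			hi = mid;
		else
			return true;
	}
	return false;
}
```

The caller would encode the dead TIDs once while building the array and keep it
sorted, so each lookup costs one encode plus O(log N) integer comparisons
instead of O(log N) calls through a comparator function pointer.
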
Just using itemptr_encode() makes array in test #1 go from 8s to 6.5s on my
machine.

Another thing I just noticed is that you didn't include the build times for the
datastructures. They are lower than the lookups currently, but it does seem
like a relevant thing to measure as well. E.g. for #1 I see the following build
times

array    24.943 ms
tbm     206.456 ms
intset   93.575 ms
vtbm    134.315 ms
rtbm    145.964 ms

that's a significant range...


Randomizing the lookup order (using a random shuffle in
generate_index_tuples()) changes the benchmark results for #1 significantly:

        shuffled time    unshuffled time
array    6551.726 ms      6478.554 ms
intset  67590.879 ms     10815.810 ms
rtbm    17992.487 ms      2518.492 ms
tbm       364.917 ms       360.128 ms
vtbm    12227.884 ms      1288.123 ms


FWIW, I get an assertion failure when using an assertion build:

#2  0x0000561800ea02e0 in ExceptionalCondition (conditionName=0x7f9115a88e91 "found", errorType=0x7f9115a88d11 "FailedAssertion",
    fileName=0x7f9115a88e8a "rtbm.c", lineNumber=242) at /home/andres/src/postgresql/src/backend/utils/error/assert.c:69
#3  0x00007f9115a87645 in rtbm_add_tuples (rtbm=0x561806293280, blkno=0, offnums=0x7fffdccabb00, nitems=10) at rtbm.c:242
#4  0x00007f9115a8363d in load_rtbm (rtbm=0x561806293280, itemptrs=0x7f908a203050, nitems=10000000) at bdbench.c:618
#5  0x00007f9115a834b9 in rtbm_attach (lvtt=0x7f9115a8c300 <LVTestSubjects+352>, nitems=10000000, minblk=2139062143, maxblk=2139062143,
    maxoff=32639) at bdbench.c:587
#6  0x00007f9115a83837 in attach (lvtt=0x7f9115a8c300 <LVTestSubjects+352>, nitems=10000000, minblk=2139062143, maxblk=2139062143, maxoff=32639)
    at bdbench.c:658
#7  0x00007f9115a84190 in attach_dead_tuples (fcinfo=0x56180322d690) at bdbench.c:873

I assume you just inverted the Assert(found) assertion?

Greetings,

Andres Freund