Hi! On Mon, Dec 11, 2023 at 3:25 PM Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote:
> On Fri, Dec 8, 2023 at 11:24 PM Alexander Korotkov <aekorot...@gmail.com> > wrote: > > On Fri, Dec 8, 2023 at 3:28 PM Ashutosh Bapat > > <ashutosh.bapat....@gmail.com> wrote: > > > I did some analysis of memory consumption by bitmapsets in such cases. > > > [1] contains slides with the result of this analysis. The slides are > > > crude and quite WIP. But they will give some idea. > > > > > > [1] > https://docs.google.com/presentation/d/1S9BiAADhX-Fv9tDbx5R5Izq4blAofhZMhHcO1c-wzfI/edit?usp=sharing > > > > Thank you for sharing your analysis. I understand that usage of a > > plain bitmap becomes a problem with a large number of partitions. But > > I wonder what does "post proposed fixes" mean? Is it the fixes posted > > in [1]. If so it's very surprising for me they are reducing the > > memory footprint size. > > No. These are fixes in various threads all listed together in [1]. I > had started investigating memory consumption by Bitmapsets around the > same time. The slides are result of that investigation. I have updated > slides with this reference. > > [1] > https://www.postgresql.org/message-id/caexhw5s_kwb0rb9l3turjxsvo5uctepdskkaemb5x1etssm...@mail.gmail.com > > They reduce the memory footprint by Bitmapset because they reduce the > objects that contain the bitmapsets, thus reducing the total number of > bitmapsets produced. > Thank you Ashutosh for your work on this matter. With a large number of partitions, it definitely makes sense to reduce both Bitmapset's size as well as the number of Bitmapsets. I've checked the patchset [1] with your test suite to check the memory consumption. The results are in the table below. query | no patch | patch | no self-join removal ---------------------------------------------------------------------------------- 2-way join, non partitioned | 14792 | 15208 | 29152 2-way join, no partitionwise join | 19519576 | 19519576 | 19519576 2-way join, partitionwise join | 40851968 | 40851968 | 40851968 3-way join, non partitioned | 20632 | 21784 | 79376 3-way join, no partitionwise join | 45227224 | 45227224 | 45227224 3-way join, partitionwise join | 151655144 | 151655144 | 151655144 4-way join, non partitioned | 25816 | 27736 | 209128 4-way join, no partitionwise join | 83540712 | 83540712 | 83540712 4-way join, partitionwise join | 463960088 | 463960088 | 463960088 5-way join, non partitioned | 31000 | 33720 | 562552 5-way join, no partitionwise join | 149284376 | 149284376 | 149284376 5-way join, partitionwise join | 1663896608 | 1663896608 | 1663896608 The most noticeable thing for me is that self-join removal doesn't work with partitioned tables. I think this is the direction for future work on this subject. In non-partitioned cases, patchset gives a small memory overhead. However, the memory consumption is still much less than it is without the self-join removal. So, removing the join still lowers memory consumption even if it copies some Bitmapsets. Given that patchset [1] is required for the correctness of memory manipulations in Bitmapsets during join removals, I'm going to push it if there are no objections. Links. 1. https://www.postgresql.org/message-id/CAPpHfdtLgCryACcrmLv%3DKoq9rAB3%3Dtr5y9D84dGgvUhSCvjzjg%40mail.gmail.com ------ Regards, Alexander Korotkov