On Mon, Dec 2, 2024 at 2:18 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> I've asked about that in linux-mm [1]. To my surprise, the
> recommendations were to stick to creating a large mapping in advance,
> and slice smaller mappings out of that, which could be resized later.
> The OOM score should not be affected, and hugetlb could be avoided using
> MAP_NORESERVE flag for the initial mapping (I've experimented with that,
> seems to be working just fine, even if the slices are not using
> MAP_NORESERVE).
>
> I guess that would mean I'll try to experiment with this approach as
> well. But what others think? How much research do we need to do, to gain
> some confidence about large shared mappings and make it realistically
> acceptable?

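To make the quoted scheme concrete, here is a rough, Linux-only sketch of reserving a large region with MAP_NORESERVE and then placing a resizable slice inside it at a fixed address. It is not taken from any patch. The helper names (reserve_region, create_backing, map_slice, resize_slice) are made up for illustration, and backing the slice with memfd_create() is my own assumption so that a resize keeps the existing contents. All offsets and lengths are assumed to be multiples of the page size, and the caller is assumed to keep the slice inside the reservation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Reserve max_size bytes of address space without committing any memory. */
static void *reserve_region(size_t max_size)
{
    void *base = mmap(NULL, max_size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Hypothetical helper: create an in-memory file of the given size. */
static int create_backing(size_t size)
{
    int fd = memfd_create("slice", 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) size) != 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map the first len bytes of fd at addr, replacing the reservation there. */
static void *map_slice(void *addr, int fd, size_t len)
{
    assert(len > 0);
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Resize the slice in place; its start address never changes.
 * Returns 0 on success, -1 on failure.
 */
static int resize_slice(void *addr, int fd, size_t old_len, size_t new_len)
{
    if (new_len > old_len)
    {
        /* Grow the backing object first, then extend the mapping over it. */
        if (ftruncate(fd, (off_t) new_len) != 0)
            return -1;
        return map_slice(addr, fd, new_len) ? 0 : -1;
    }
    if (new_len < old_len)
    {
        /* Hand the tail back to the reservation, then shrink the object. */
        void *tail = mmap((char *) addr + new_len, old_len - new_len,
                          PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED,
                          -1, 0);
        if (tail == MAP_FAILED)
            return -1;
        return ftruncate(fd, (off_t) new_len);
    }
    return 0;
}
```

This only shows the address-space side in a single process. It says nothing about how other backends would learn of a resize, which is where the hard part is.
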
Personally, I like this approach. It seems to me that this opens up
the possibility of a system where the virtual addresses of data
structures in shared memory never change, which I think will avoid an
absolutely massive amount of implementation complexity. It's obviously
not ideal that we have to specify in advance an upper limit on the
potential size of shared_buffers, but we can live with it. It's better
than what we have today; and certainly cloud providers will have no
issue with pre-setting that to a reasonable value. I don't know if we
can port it to other operating systems, but it seems at least possible
that they offer similar primitives, or will in the future; if not, we
can disable the feature on those platforms.

I still think the synchronization is going to be tricky. For example
when you go to shrink a mapping, you need to make sure that it's free
of buffers that anyone might touch; and when you grow a mapping, you
need to make sure that nobody tries to touch that address space before
they grow the mapping, which goes back to my earlier point about
someone doing a lookup into the buffer mapping table and finding a
buffer number that is beyond the end of what they've already mapped.
But I think it may be doable with sufficient cleverness.

--
Robert Haas
EDB: http://www.enterprisedb.com