On Fri, Jun 17, 2016 at 12:34 PM, Aleksey Demakov <adema...@gmail.com> wrote:
>> I expect that to be useful for parallel query and anything else where
>> processes need to share variable-size data.  However, that's different
>> from this because ours can grow to arbitrary size and shrink again by
>> allocating and freeing with DSM segments.  We also do everything with
>> relative pointers since DSM segments can be mapped at different
>> addresses in different processes, whereas this would only work with
>> memory carved out of the main shared memory segment (or some new DSM
>> facility that guaranteed identical placement in every address space).
>>
>
> I believe it would be perfectly okay to allocate a huge amount of address
> space with mmap on startup. If the pages are not touched, the OS VM
> subsystem will not commit them.

In my opinion, that's not going to fly.  If I thought otherwise, I
would not have developed the DSM facility in the first place.

First, the behavior in this area is highly dependent on the choice of
operating system and configuration parameters.  We've had plenty of
experience with requiring non-default configuration parameters to run
PostgreSQL, and it's all bad.  I don't really want to have to tell
users that they must run with a particular value of
vm.overcommit_memory in order to run the server.  Nor do I want to
tell users of other operating systems that their ability to run
PostgreSQL is dependent on the behavior their OS has in this area.  I
had a MacBook Pro up until a year or two ago where a sufficiently
large shared memory request would cause a kernel panic.  That bug will
probably be fixed at some point if it hasn't been already, but
probably by returning an error rather than making it work.

Second, there's no way to give memory back once you've touched it.  If
you decide to do a hash join on a 250GB inner table using a shared
hash table, you're going to have 250GB in swap-backed pages floating
around when you're done.  If the user has swap configured (and more
and more people don't), the operating system will eventually page
those out, but until that happens those pages are reducing the amount
of page cache that's available, and after it happens they're using up
swap.  In either case, the space is consumed to no purpose.  You don't
care about that hash table any more once the query completes; there's
just no way to tell the operating system that.  If your workload
follows an entirely predictable pattern and you always have about the
same amount of usage of this facility, then you can just reuse the
same pages and everything is fine.  But if your usage fluctuates, I
believe it will be a big problem.  With DSM, we can and do explicitly
free the memory back to the OS as soon as we don't need it any more -
and that's a big benefit.
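
To make the second point concrete, here is a rough sketch of the
per-operation lifecycle I have in mind: map a region sized for one
operation, use it, and unmap it when the operation ends.  This is not
the dsm.c API; scratch_segment, scratch_create, scratch_destroy and
run_one_operation are hypothetical names made up for illustration, and
it assumes a platform with MAP_ANONYMOUS.  It also leaves out the
handle that a real facility needs so that other processes can attach
to the segment.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical type and helpers for illustration; not PostgreSQL's DSM API. */
typedef struct
{
    void   *base;   /* mapping address, meaningful only in this process */
    size_t  size;   /* number of bytes mapped */
} scratch_segment;

/* Map a fresh shared anonymous region sized for a single operation. */
int
scratch_create(scratch_segment *seg, size_t size)
{
    void   *p;

    assert(size > 0);
    p = mmap(NULL, size, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    seg->base = p;
    seg->size = size;
    return 0;
}

/* Unmap the region; with no mappings left, the kernel can reclaim the pages. */
int
scratch_destroy(scratch_segment *seg)
{
    int     rc = munmap(seg->base, seg->size);

    seg->base = NULL;
    seg->size = 0;
    return rc;
}

/* One create/use/destroy cycle, standing in for one query's hash table. */
int
run_one_operation(size_t size)
{
    scratch_segment seg;

    if (scratch_create(&seg, size) != 0)
        return -1;
    memset(seg.base, 0xAB, seg.size);   /* touch every page */
    if (((unsigned char *) seg.base)[seg.size - 1] != 0xAB)
    {
        scratch_destroy(&seg);
        return -1;
    }
    return scratch_destroy(&seg);
}
```

The point of the sketch is only that the memory's lifetime is tied to
the operation rather than to the server: a big operation followed by
many small ones does not leave the big region's touched pages behind.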

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company