On Tue, Jan 8, 2019 at 7:14 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> I've been toying with OpenBSD lately, and soon noticed a seriously
> annoying problem for running Postgres on it: by default, its limits
> for SysV semaphores are only SEMMNS=60, SEMMNI=10.  Not only does that
> greatly constrain the number of connections for a single installation,
> it means that our TAP tests fail because you can't start two postmasters
> concurrently (cf [1]).
>
> Raising the annoyance factor considerably, AFAICT the only way to
> increase these settings is to build your own custom kernel.
>
> So I looked around for an alternative, and found out that modern
> OpenBSD releases support named POSIX semaphores (though not unnamed
> ones, at least not shared unnamed ones).  What's more, it appears that
> in this implementation, named semaphores don't eat open file descriptors
> as they do on macOS, removing our major objection to using them.
>
> I don't have any OpenBSD installation on hardware that I'd take very
> seriously for performance testing, but some light testing with
> "pgbench -S" suggests that a build with PREFERRED_SEMAPHORES=NAMED_POSIX
> has just about the same performance as a build with SysV semaphores.
>
> This all leads to the thought that maybe we should be selecting
> PREFERRED_SEMAPHORES=NAMED_POSIX on OpenBSD.  At the very least,
> our docs ought to recommend it as a credible alternative for
> people who don't want to get into building custom kernels.
>
> I've checked that this works back to OpenBSD 6.0, and scanning
> their man pages suggests that the feature appeared in 5.5.
> 5.5 isn't that old (2014) so possibly people are still running
> older versions, but we could easily put in version-specific
> default logic similar to what's in src/template/darwin.
>
> Thoughts?

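For anyone following along who hasn't used named POSIX semaphores, here is a minimal sketch of the general pattern in plain C. It is not PostgreSQL's code. The helper names (create_unlinked_named_sem, sem_try_acquire) and the semaphore name passed in are hypothetical, and the "unlink right after creating" step is just one common way to keep the name from lingering after the process exits.

```c
#include <assert.h>     /* assert(), for callers that want to check results */
#include <fcntl.h>      /* O_CREAT, O_EXCL */
#include <semaphore.h>  /* sem_open, sem_unlink, sem_trywait, sem_post */
#include <stddef.h>     /* NULL */

/*
 * Hypothetical helper: create a named POSIX semaphore with the given
 * initial value, then remove its name so no other process can open it
 * and nothing is left behind if we crash.  The returned handle stays
 * valid until sem_close().  Returns NULL on failure (errno is set by
 * sem_open).
 */
static sem_t *
create_unlinked_named_sem(const char *name, unsigned int initial_value)
{
    sem_t *sem = sem_open(name, O_CREAT | O_EXCL, 0600, initial_value);

    if (sem == SEM_FAILED)
        return NULL;

    /* The name is only needed for creation in this sketch. */
    sem_unlink(name);
    return sem;
}

/*
 * Hypothetical helper: try to decrement the semaphore without blocking.
 * Returns 1 if it was acquired, 0 if it was already at zero (or on error).
 */
static int
sem_try_acquire(sem_t *sem)
{
    return sem_trywait(sem) == 0;
}
```

A caller would create the semaphore once, use sem_wait()/sem_post() (or the non-blocking helper above) on the returned handle, and call sem_close() when done. On some platforms this needs to be linked with -pthread.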
No OpenBSD here, but I was curious enough to peek at their
implementation.  Like others, they create a tiny file under /tmp for
each one, mmap() and close the fd straight away.  Apparently don't
support shared sem_init() yet (EPERM).  So your plan seems good to me.
CC'ing Pierre-Emmanuel (OpenBSD PostgreSQL port maintainer) in case he
is interested.

Wild speculation:  I wouldn't be surprised if POSIX named semas
perform better than SysV semas on a large enough system, since they'll
live on different pages.  At a glance, their sys_semget apparently
allocates arrays of struct sem without padding and I think they
probably get about 4 to a cacheline; see our experience with an 8
socket box leading to commit 2d306759 where we added our own padding.

--
Thomas Munro
http://www.enterprisedb.com
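
To make the cacheline point above concrete, here is a rough sketch of how padding changes the number of semaphores that share a line. Everything in it is an assumption for illustration: the 64-byte line size, the 16-byte struct toy_sem (which is not OpenBSD's struct sem, just a stand-in sized so that four fit in a line), and the helper name sems_per_cache_line. It is not the code from commit 2d306759.

```c
#include <assert.h>   /* assert(), for the checks in the usage note */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* int32_t */

/* Assumed cache line size; real hardware may differ. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for an unpadded kernel semaphore (16 bytes). */
struct toy_sem
{
    int32_t value;
    int32_t last_pid;
    int32_t waiters_nonzero;
    int32_t waiters_zero;
};

/*
 * Padded variant: the union is as large as the char array, so each
 * element of an array of padded_sem starts one cache line after the
 * previous one.
 */
typedef union padded_sem
{
    struct toy_sem sem;
    char           pad[CACHE_LINE_SIZE];
} padded_sem;

/* How many array elements of the given size land in one cache line. */
static size_t
sems_per_cache_line(size_t elem_size)
{
    if (elem_size >= CACHE_LINE_SIZE)
        return 1;
    return CACHE_LINE_SIZE / elem_size;
}
```

With these assumptions, sems_per_cache_line(sizeof(struct toy_sem)) is 4 and sems_per_cache_line(sizeof(padded_sem)) is 1. With one semaphore per line, backends sleeping and waking on neighbouring semaphores no longer bounce the same cache line between sockets.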