Philip Guenther wrote:
Well, I am amazed. I guess I just have to do some more investigation into
workarounds for this, as RAM-based tmpfs file systems will get full very
quickly with shared memory segments, and large segments result in high disk
activity when munmap() is called. And SysV shared memory is too limiting for
my purposes.
Could you clarify what underlying problem you're trying to solve?
Maybe there's just a misunderstanding about the POSIX APIs, or about
how OpenBSD implements them, or what alternative APIs exist.
Philip Guenther
Basically:
1. shm_mkstemp() will result in the creation of a file with a long
random name in /tmp, which is quite small on the target system (512M
mfs). The system has 4GB of RAM.
- The buffers tend to be somewhat large (occasionally several dozen
MB). I cannot create a shared memory segment larger than the
remaining free space on the /tmp FS. This creates an issue where
multiple applications have a buffer each.
- On a system where /tmp is a disk partition, this file does not
actually occupy any space until munmap() is called; this results in a
long pause as it writes the data to the disk.
This occurs even if shm_unlink() is called before the unmap/close.
In fact, shm_unlink() is called immediately after shm_mkstemp(), because
I am only interested in the descriptor from then on. This disk write is
unnecessary for shared memory; I don't want the file to persist at all.
When both the application and server are done with the buffer, all
should be gone.
- The only way to resolve the issue with disk I/O is to call
ftruncate(fd, 0) before munmap(), and even this only works where /tmp is
a huge MFS partition or a huge disk partition.
- There is a brief point in time (between shm_mkstemp() and
shm_unlink()) where a rogue application can grab the buffer file, call
ftruncate(), and map its contents, without either my server or
applications knowing. Whilst file permissions are good at stopping other
users accessing the segment, this goes against the idea that the app
To summarise, the major issue I am having is that the "shared memory"
buffer looks like a file on disk, and acts like a file on disk, rather
than a section of memory that I am sharing between processes. As a
result, I need to have a huge /tmp file system just to accommodate the
buffer, regardless of how much RAM I have.
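
For reference, the sequence described above (create, unlink straight away, size, map, then truncate to zero before unmapping) can be sketched roughly as follows. This is a minimal sketch and not the actual server code: the helper names buffer_create(), buffer_map() and buffer_release() and the name templates are made up for illustration. Since shm_mkstemp() is an OpenBSD extension, the other branch substitutes plain shm_open() with a per-process name, which is an assumption of this sketch.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a shared buffer of len bytes and return
 * its descriptor, or -1 on error. The name is unlinked immediately. */
int buffer_create(size_t len)
{
#ifdef __OpenBSD__
	/* Template is an assumption; shm_mkstemp() fills in the X's. */
	char name[] = "/tmp/buffer.XXXXXXXXXX";
	int fd = shm_mkstemp(name);
#else
	/* Stand-in for systems without shm_mkstemp(). */
	char name[64];
	snprintf(name, sizeof(name), "/buffer.%ld", (long)getpid());
	int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
#endif
	if (fd == -1)
		return -1;

	/* Only the descriptor is needed from here on. */
	shm_unlink(name);

	if (ftruncate(fd, (off_t)len) == -1) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Map the buffer read/write and shared; returns NULL on error. */
void *buffer_map(int fd, size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	return p == MAP_FAILED ? NULL : p;
}

/* Drop the contents first so munmap() has nothing left to write back,
 * then unmap and close. Returns 0 on success, -1 on error. */
int buffer_release(int fd, void *p, size_t len)
{
	int rc = 0;

	if (ftruncate(fd, 0) == -1)
		rc = -1;
	if (munmap(p, len) == -1)
		rc = -1;
	if (close(fd) == -1)
		rc = -1;
	return rc;
}
```

The mapping must not be touched after the ftruncate(fd, 0) in buffer_release(), as the pages behind it are gone at that point. How the descriptor reaches the other process is left out of the sketch.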
The problem I have with the SysV shm* API is that it gets hard to clean
up shared memory segments. Whilst I can live with that to some degree,
the bigger issue is the 32MB limit per segment
(kern.shminfo.shmmax).