Hello pgsql-bugs list,

I have attached a patch that I believe resolves a compatibility issue between Windows 8 RTM and PostgreSQL. The impatient might want to just read the patch; this email is longer than it probably should be. I have CC'd Seiko Ishida, who expressed an interest in Windows 8 compatibility on this list about a year ago.

We test postgres pretty heavily at my place of work (probably thousands of DBs created and exercised each day) on a number of platforms. We've been doing compatibility testing with the Windows 8 previews and everything has been working well. We are using the latest postgres release.

However, last week we upgraded from a preview version to the RTM version of Windows 8 x64, and it is clear that something changed. Since upgrading, we have been getting this error message a few times a day. Still very rare, but it never happened before the upgrade.

LOG:  could not reserve shared memory region (addr=0000000001410000) for child 0000000000000F8C: 487
LOG:  could not fork new process for connection: A blocking operation was interrupted by a call to WSACancelBlockingCall.


The 487 in the first message is ERROR_INVALID_ADDRESS, returned by VirtualAllocEx inside win32_shmem.c (search for the error message).

Postgres uses a shared memory block to do much of its IPC. This shared memory block presumably stores pointers to itself, and so must be allocated at the same address inside every postgres process. In order to maximize the probability that this address will be available in child processes, the address should be reserved as early as possible in the lifetime of the child process (before the address space gets polluted). In order to achieve this goal, the postmaster starts its children in a suspended state and reserves the address before any code has executed in the child process.

However, there are a bunch of chunks of the virtual address space already reserved even when the child process is in this suspended state. At least some of them are memory mapped images of binaries (duh). I believe VirtualAllocEx is failing because something is already mapped (in the child) to the address the postmaster wants the shared memory segment to live at.

I wrote a small program that repeatedly starts postgres.exe in suspended mode and then tries to VirtualAllocEx 0x1410000. The address is never blocked on Windows 7, but is blocked 2% of the time on Windows 8. I attached windbg to the troublesome postgres process and used "!vadump -v" to see that there is a file mapped to the contentious address while postgres is in the suspended state. I don't know if the failure rate is this bad for all addresses or just this one, but the possibility of conflict exists, since the postmaster was willing to use this address in at least one run.

So why hasn't this ever happened before? I'm guessing that ASLR got better in the latest Windows 8 patch, or maybe there's just more stuff in the virtual address space of a newborn process.

The postmaster originally decides where to place the shared memory segment by letting Windows (MapViewOfFileEx) choose where to put it. So if the postmaster ends up using address 0x1410000, and then the postgres.exe image (for example) gets mapped to that same address in the child, you'll end up with the error message above.

I assume Windows changed so that the addresses in use inside a newborn process can now conflict with the addresses returned by MapViewOfFileEx(..., NULL). These sets must have been disjoint in previous versions of Windows, and postgres was relying on that behavior.

One straightforward "fix" is to specify a hardcoded address to MapViewOfFileEx instead of NULL. This address should be carefully selected such that it is in an area disjoint from the portions of the address space that are potentially reserved in a newborn process, and also unlikely to be in use inside the postmaster when it first maps the shared memory. This is pretty trivial to do for a particular version/configuration of Windows. However, I see no future-proof solution (besides making the shared segment position independent). If the hardcoded address is not available, you can always fall back on the current behavior.

On 64-bit versions of Windows, processes that do not use more than 4G or so of address space seem to always have a huge hole from about 00000000 80000000 ... 00000700 00000000. Note that you cannot reserve addresses above 8TB, so the segment would need to go somewhere in this hole; above 4G is probably preferable.

32-bit Windows 8 also exists. We haven't been testing on it, and so I can't confirm that the problem exists there. Assuming it does, 32-bit processes are likely to be trickier since address space is more scarce. In practice, it appears that there is usually a big hole from 10000000 ... 70000000.
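
To make the hole ranges above concrete, here is a minimal C sketch (not part of the attached patch) of the check a candidate address has to pass: the whole segment must fit inside the observed hole. The hole bounds are the approximate values quoted in this email and are empirical, not something Windows promises; range_fits_in_hole and the macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Approximate hole bounds seen in a newborn process, as quoted in the
 * email.  These are observations, not guarantees from Windows.
 */
#define HOLE64_START UINT64_C(0x0000000080000000)
#define HOLE64_END   UINT64_C(0x0000070000000000)
#define HOLE32_START UINT64_C(0x10000000)
#define HOLE32_END   UINT64_C(0x70000000)

/*
 * Hypothetical helper: returns true if the range [base, base + size)
 * lies entirely inside [hole_start, hole_end).
 */
static bool
range_fits_in_hole(uint64_t base, uint64_t size,
                   uint64_t hole_start, uint64_t hole_end)
{
    assert(hole_start < hole_end);

    if (size == 0 || base < hole_start || base >= hole_end)
        return false;

    /* Compare against the remaining room to avoid overflow in base + size. */
    return size <= hole_end - base;
}
```

For example, range_fits_in_hole(0x2efe0000, 0x10000000, HOLE32_START, HOLE32_END) holds for a 256MB segment at the 32-bit base used in the patch, while the contentious 0x01410000 lies below the start of the 64-bit hole and is rejected.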

There is a security problem with the fix I outline above. It bypasses ASLR to a limited degree, since the shared memory would likely end up always living at the same address. I am not certain that MapViewOfFile even tries to be unpredictable, but let's assume it does or will be someday.

This security problem can be addressed by adding a random number to the hardcoded address. Interfacing with a suitable entropy source/PRNG might prove to be a PITA, but there is a way of avoiding that. We can invoke MapViewOfFile once with NULL in order to get a "random address" and then sum the least significant bits of that with our hardcoded base address to get the preferred address for the shared segment. This way we end up with an address that is no less secure than the one currently returned by MapViewOfFile, insofar as MapViewOfFile doesn't select high addresses.
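
The address arithmetic described above can be sketched in portable C as follows. This is only an illustration of the computation, assuming the 0x0fffffff mask used in the attached patch and the usual 64KB allocation granularity (real code would ask GetSystemInfo); choose_preferred_address and is_granularity_aligned are hypothetical names, and no Windows calls are made here.

```c
#include <assert.h>
#include <stdint.h>

/* Low bits of the OS-chosen address that are carried over (mask from the patch). */
#define ASLR_LOW_BITS_MASK ((uintptr_t) 0x0fffffff)

/* Assumed allocation granularity (64KB); real code should query GetSystemInfo. */
#define ALLOC_GRANULARITY  ((uintptr_t) 0x10000)

/* Returns nonzero if addr is a multiple of the assumed allocation granularity. */
static int
is_granularity_aligned(uintptr_t addr)
{
    return (addr & (ALLOC_GRANULARITY - 1)) == 0;
}

/*
 * Combine a hardcoded base with the low bits of an address the OS picked
 * (e.g. the result of mapping with a NULL base address), so that whatever
 * unpredictability those low bits carry is preserved.
 */
static uintptr_t
choose_preferred_address(uintptr_t hardcoded_base, uintptr_t os_chosen)
{
    /* The base must itself be aligned or the sum cannot be. */
    assert(is_granularity_aligned(hardcoded_base));

    return hardcoded_base + (os_chosen & ASLR_LOW_BITS_MASK);
}
```

Because MapViewOfFileEx requires its base address argument to be a multiple of the allocation granularity, it matters that both inputs are aligned: an aligned base plus the masked bits of an aligned OS-chosen address stays aligned. The result is at most 0x0fffffff above the base, so the base has to leave that much headroom (plus the segment size) inside the hole.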

I've attached a patch that implements the stuff above. I can share the code for the program that tests whether an address is reliably available in a newborn postgres process, if anyone is interested.

- Dave Vitek
--- /tmp/port/win32_shmem.c     2012-09-04 22:36:08.000000000 -0400
+++ third-party/postgresql/src/backend/port/win32_shmem.c       2012-09-04 23:18:47.000000000 -0400
@@ -120,6 +120,7 @@
 PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 {
        void       *memAddress;
+       void       *preferredAddress;
        PGShmemHeader *hdr;
        HANDLE          hmap,
                                hmap2;
@@ -224,6 +225,65 @@
                                (errmsg("could not create shared memory segment: %lu", GetLastError()),
                                 errdetail("Failed system call was MapViewOfFileEx.")));
 
+       /* 
+        * Now try to allocate memory at an address that is less
+        * contentious in newborn child processes.  Because of ASLR,
+        * Windows can memory map images (like postgres.exe) such that
+        * they conflict with memAddress as we have selected it above.
+        * Microsoft provides no assurance that MapViewOfFileEx(...,
+        * NULL) returns addresses disjoint from those in use in a
+        * newborn process.  They have been observed to intersect from
+        * time to time on Windows 8.  See email to pg-bugs from
+        * Dave Vitek on September 4 2012.
+        *
+        * Now we set preferredAddress to a hardcoded address that
+        * newborn processes never seem to be using (in available
+        * versions of Windows).  I have selected the addresses
+        * somewhat randomly in order to minimize the probability that
+        * some other library doing something similar conflicts with
+        * us.  That is, using conspicuous addresses like 0x20000000
+        * might not be good if someone else does it.
+        * 
+        * GRAMMATECH: See internal tracker BZ:9250.
+        */
+#ifdef _WIN64
+       /* 
+        * There is typically a giant hole (almost 8TB):
+        * 00000000 7fff0000
+        * ...
+        * 000007f6 8e8b0000
+        */
+       preferredAddress = (void*)0x0000047047e00000ULL;
+#else
+       /* 
+        * This is more dicey.  However, even with ASLR there still
+        * seems to be a big hole:
+        * 10000000
+        * ...
+        * 70000000
+        */
+       preferredAddress = (void*)0x2efe0000;
+#endif
+       /* 
+        * In order to be faithful to ASLR, we maintain any entropy
+        * from the bottom bits of memAddress when selecting the
+        * preferredAddress.  I'm not sure MapViewOfFile attempts to
+        * be unpredictable, but better to be future-proof even if it
+        * doesn't.
+        */
+       preferredAddress = (void*)(((UINT_PTR)preferredAddress) + (((UINT_PTR)memAddress) & ((UINT_PTR)0x0fffffff)));
+       preferredAddress = MapViewOfFileEx(hmap2, FILE_MAP_WRITE | FILE_MAP_READ, 0, 0, 0, preferredAddress);
+       if( preferredAddress )
+       {
+               if (!UnmapViewOfFile(memAddress))
+                       elog(LOG, "could not unmap view of unwanted shared memory: %lu", GetLastError());
+               memAddress = preferredAddress;
+       }
+       else
+       {
+               elog(LOG, "could not create shared memory segment at preferred address: %lu",
+                       GetLastError());
+       }
 
 
        /*
