On 12/25/2024 1:08 PM, Ken Brown wrote:
Hi Corinna,
I know you're on vacation, but I'm writing this now while it's fresh in
my mind. We can discuss it when you return.
This is a follow-up to
https://cygwin.com/pipermail/cygwin-developers/2024-December/012715.html
but I'm starting a new thread for clarity, and I'll start from scratch.
I'm back to thinking that mmap_is_attached_or_noreserve is broken. I'm
attaching two test cases to show this. To run them, create a file
hello.txt with one line, and then compile and run the two programs.
They both take a region of size 128K and map the two halves of it
separately with MAP_NORESERVE. The only difference between the two
cases is the order in which the halves are mapped. They then call
read(), passing the whole region of size 128K as buffer. If
mmap_is_attached_or_noreserve is working, the calls to read should
succeed. But only the one in mmap_noreserve2 succeeds, and the other
fails with a bus error. I'll now try to explain what's going on.
The task is to find one or more noreserve records that completely cover
the interval [addr, addr+len].
After a successful call to match(start_addr, len, u_addr, u_len), we
have the following situation:
|---------|---------------------|-----------------|
start_addr u_addr u_addr+u_len start_addr+len
The interval [u_addr, u_addr+u_len] is the intersection of [start_addr,
start_addr+len] with the interval of the mmap_record. Note that u_addr
>= start_addr and u_addr+u_len <= start_addr+len, but I've shown in the
picture the case where both inequalities are strict.
At this point we know that the intersection is contained in a noreserve
mmap_record. But we commit the entire interval [start_addr,
u_addr+u_len], whose length is commit_len, which is written as
u_len - (start_addr - u_addr) = (u_addr + u_len) - start_addr.
There's no problem if u_addr = start_addr. But if u_addr > start_addr,
then the commit includes [start_addr, u_addr], which is not at this
point known to be contained in a noreserve mmap_record. Moreover, even
if it is contained in a noreserve record that we just haven't come to
yet, experiments show that the call to VirtualAlloc in line 794 fails in
this case. Apparently you can't commit in one call to VirtualAlloc a
region that was reserved [Windows terminology] in two separate calls. I
can't find this documented anywhere, but that's what happens in the
attached mmap_noreserve1.c, as I verified by stepping through the code
under gdb.
As I said in an earlier message, I don't see how to fix this except by
making sure that the noreserve maps in map_list occur in order of their
starting address. Then if we ever find u_addr > start_addr, we know
we'll never find a noreserve mmap_record covering [start_addr, u_addr],
and we can break out of the loop. If you agree that there's a problem,
does this seem like the right fix?
I apologize for the length of this message, but I wanted to write
everything out in detail in order to be clear. I've written a patch
(also attached) that attempts to do does what I suggested in the
previous paragraph. Unfortunately, when I install it, mintty hangs on
startup without showing the bash prompt, and a grep process is left
running. So there must be something wrong with my patch, or else it
triggers a latent bug somewhere else.
Takashi, if you're reading this, could you take a look at my patch and
see if you spot any obvious errors? Or do you think this could be
related to your recent changes?
I've find two errors, one in my patch and one in the original code:
There were two places in the main loop of mmap_is_attached_or_noreserve
where there was a "break" that should have been "continue". Version 2
of the patch (attached) tries to fix these. But I still get the same
mintty hang on startup. So there's still something wrong.
Note that the original bug that I'm trying to fix caused SIGBUS to be
raised in some places where it shouldn't have been (due to a failed call
to VirtualAlloc). The new break/continue bug I just found had the
opposite effect: it caused SIGBUS not to be raised in some places where
it should have been.
Ken
Ken
From 02ed9abe21941533b36921491952038512cbfa89 Mon Sep 17 00:00:00 2001
From: Ken Brown <kbr...@cornell.edu>
Date: Wed, 25 Dec 2024 12:35:41 -0500
Subject: [PATCH] WIP v2
---
winsup/cygwin/mm/mmap.cc | 95 +++++++++++++++++++++++++++++-----------
1 file changed, 70 insertions(+), 25 deletions(-)
diff --git a/winsup/cygwin/mm/mmap.cc b/winsup/cygwin/mm/mmap.cc
index 13418d782baf..5117d1820d38 100644
--- a/winsup/cygwin/mm/mmap.cc
+++ b/winsup/cygwin/mm/mmap.cc
@@ -580,18 +580,47 @@ mmap_record::free_fh (fhandler_base *fh)
delete fh;
}
+/* We want the list of anonymous mappings to have all of its noreserve
+ mappings at the beginning of the list and sorted by address. This
+ is used in mmap_is_attached_or_noreserve. */
mmap_record *
mmap_list::add_record (mmap_record &r)
{
- mmap_record *rec = (mmap_record *) ccalloc (HEAP_MMAP,
- sizeof (mmap_record)
- + MAPSIZE (PAGE_CNT (r.get_len ())) * sizeof (DWORD), 1);
- if (!rec)
+ mmap_record *new_rec = (mmap_record *) ccalloc (HEAP_MMAP,
+ sizeof (mmap_record)
+ + MAPSIZE (PAGE_CNT (r.get_len ()))
+ * sizeof (DWORD), 1);
+ if (!new_rec)
return NULL;
- rec->init_page_map (r);
+ new_rec->init_page_map (r);
+
+ LIST_WRITE_LOCK ();
+ if (fd != -1 || LIST_EMPTY (&recs))
+ {
+ LIST_INSERT_HEAD (&recs, new_rec, mr_next);
+ goto out;
+ }
- LIST_INSERT_HEAD (&recs, rec, mr_next);
- return rec;
+ /* Now we're working on the anonymous list, which is non-empty. */
+ caddr_t new_addr = new_rec->get_address ();
+ bool new_noreserve = new_rec->noreserve ();
+ mmap_record *rec, *prev;
+ LIST_FOREACH (rec, &recs, mr_next)
+ {
+ prev = rec;
+ if (!rec->noreserve () || (new_noreserve
+ && new_addr < rec->get_address ()))
+ {
+ LIST_INSERT_BEFORE (rec, new_rec, mr_next);
+ goto out;
+ }
+ }
+ /* We've gone through the list, and new_rec should be inserted at
+ the end; prev points to the last record. */
+ LIST_INSERT_AFTER (prev, new_rec, mr_next);
+out:
+ LIST_WRITE_UNLOCK ();
+ return new_rec;
}
void
@@ -738,18 +767,18 @@ is_mmapped_region (caddr_t start_addr, caddr_t
end_address)
/* This function is called from exception_handler when a segmentation
violation has occurred. It should also be called from all Cygwin
- functions that want to support passing noreserve mmap page addresses
- to Windows system calls. In that case, it should be called only after
- a system call indicates that the application buffer passed had an
- invalid virtual address to avoid any performance impact in non-noreserve
- cases.
+ functions that want to support passing noreserve (anonymous) mmap
+ page addresses to Windows system calls. In that case, it should be
+ called only after a system call indicates that the application
+ buffer passed had an invalid virtual address to avoid any
+ performance impact in non-noreserve cases.
Check if the address range is all within noreserve mmap regions. If so,
call VirtualAlloc to commit the pages and return MMAP_NORESERVE_COMMITED
- on success. If the page has __PROT_ATTACH (SUSv3 memory protection
- extension), or if VirtualAlloc fails, return MMAP_RAISE_SIGBUS.
- Otherwise, return MMAP_NONE if the address range is not covered by an
- attached or noreserve map.
+ on success. If some page in the address range has __PROT_ATTACH
+ (SUSv3 memory protection extension), or if VirtualAlloc fails,
+ return MMAP_RAISE_SIGBUS. Otherwise, return MMAP_NONE if the
+ address range is not covered by noreserve maps.
On MAP_NORESERVE_COMMITED, the exeception handler should return 0 to
allow the application to retry the memory access, or the calling Cygwin
@@ -784,22 +813,38 @@ mmap_is_attached_or_noreserve (void *addr, size_t len)
ret = MMAP_RAISE_SIGBUS;
break;
}
- if (!rec->noreserve ())
- break;
-
- size_t commit_len = u_len - (start_addr - u_addr);
- if (commit_len > len)
- commit_len = len;
+ /* We now have the following situation:
+
+ |---------|---------------------|-----------------|
+ start_addr u_addr u_addr+u_len start_addr+len
+
+ The interval [u_addr, u_addr+u_len] is the intersection of
+ [start_addr, start_addr+len] with the interval of the
+ mmap_record *rec. Note that u_addr >= start_addr and
+ u_addr+u_len <= start_addr+len. The picture shows the case
+ where both inequalities are strict.
+
+ If u_addr > start_addr, then [start_addr, u_addr] lies to the
+ left of the region of *rec. The sorting of map_list now
+ implies that [start_addr, u_addr] is not covered by noreserve
+ maps. For the same reason, the interval is not covered by
+ noreserve maps if !rec->noreserve (). But we still have to
+ continue through the loop in case we later encounter a map
+ with rec->attached () that intersects our region, in which
+ case we raise SIGBUS. */
+
+ if (u_addr > start_addr || !rec->noreserve ())
+ continue;
- if (!VirtualAlloc (start_addr, commit_len, MEM_COMMIT,
+ if (!VirtualAlloc (start_addr, u_len, MEM_COMMIT,
rec->gen_protect ()))
{
ret = MMAP_RAISE_SIGBUS;
break;
}
- start_addr += commit_len;
- len -= commit_len;
+ start_addr += u_len;
+ len -= u_len;
if (!len)
{
ret = MMAP_NORESERVE_COMMITED;
--
2.45.1