On 30/08/2011 14:41, Ryan Johnson wrote:
That sounds reasonable, though I suspect we'd want to keep the concluding bits in the FAQ as well. Unfortunately, summertime free time has come to an end so I don't know when I'll get to this next. Perhaps a good compromise for now would be for you to post only the first FAQ question? That would at least cut traffic to the cygwin ML a bit.
I've updated Ryan's patch to hopefully address the comments made, polished the language a bit in places, and split it into a patch for the FAQ which just says how to fix problems and a patch for the UG which contains the technical details.
Index: doc/faq-using.xml
===================================================================
RCS file: /cvs/src/src/winsup/doc/faq-using.xml,v
retrieving revision 1.35
diff -u -p -r1.35 faq-using.xml
--- doc/faq-using.xml 4 Aug 2011 18:25:41 -0000 1.35
+++ doc/faq-using.xml 3 Nov 2011 16:26:56 -0000
@@ -1099,7 +1099,7 @@ it.</para>
 IPv6 stack, see the <ulink url="http://www.microsoft.com/technet/network/ipv6/ipv6faq.mspx">Microsoft TechNet IPv6 FAQ article</ulink>
 </para></answer></qandaentry>
-<qandaentry id="faq.using.bloda">
+<qandaentry id="faq.using.bloda" xreflabel="BLODA">
 <question><para>What applications have been found to interfere with Cygwin?</para></question>
 <answer>
@@ -1199,3 +1199,38 @@ such as virtual memory paging and file c
 </listitem>
 </itemizedlist></para>
 </answer></qandaentry>
+
+<qandaentry id='faq.using.fixing-fork-failures'>
+ <question><para>How do I fix <literal>fork()</literal> failures?</para></question>
+ <answer>
+ <para>Unfortunately, Windows can be quite hostile to a
+ reliable fork() implementation, leading to error messages such as:</para>
+ <para><itemizedlist>
+ <listitem>unable to remap <emphasis>somedll</emphasis> to same address as parent</listitem>
+ <listitem>couldn't allocate heap</listitem>
+ <listitem>died waiting for dll loading</listitem>
+ <listitem>child -1 - died waiting for longjmp before initialization</listitem>
+ <listitem>STATUS_ACCESS_VIOLATION</listitem>
+ <listitem>resource temporarily unavailable</listitem>
+ </itemizedlist></para>
+ <para>Potential solutions for the above errors:</para>
+ <para><itemizedlist>
+ <listitem>Restart whatever process is trying (and failing) to use
+ <literal>fork()</literal>. Sometimes Windows sets up a process
+ environment that is even more hostile to fork() than usual.</listitem>
+ <listitem>Ensure that you have eliminated (not just disabled) all
+ software on the <xref linkend="faq.using.bloda"/>.
+ </listitem>
+ <listitem>Read the 'rebase' package README in
+ <literal>/usr/share/doc/rebase/</literal>, and follow the
+ instructions there to run 'rebaseall'.</listitem>
+ </itemizedlist></para>
+ <para>Please note that installing new packages or updating existing
+ ones undoes the effects of rebaseall and often causes fork() failures
+ to reappear. If so, just run rebaseall again.
+ </para>
+ <para>See the <ulink url="http://cygwin.com/cygwin-ug-net/highlights.html#ov-hi-process">
+ process creation</ulink> section of the User's Guide for the technical reasons it is so
+ difficult to make <literal>fork()</literal> work reliably.</para>
+</answer>
+</qandaentry>
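Not part of the patch, but for anyone wondering what the "resource temporarily unavailable" case looks like from the application side: that message is the text for EAGAIN, so a caller can at least tell it apart from other fork() errors. Below is a minimal Python sketch under that assumption. The function name fork_with_retry and the attempts/delay defaults are made up for illustration, and retrying does nothing about an underlying base address collision, so rebaseall and removing BLODA remain the actual fixes.

```python
import errno
import os
import time


def fork_with_retry(attempts=3, delay=0.5, fork=os.fork, sleep=time.sleep):
    """Call fork(), retrying only when it fails with EAGAIN.

    EAGAIN is the errno behind "resource temporarily unavailable".
    The name, the attempts/delay defaults, and the injectable fork/sleep
    parameters are illustrative assumptions, not anything Cygwin provides.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            # Returns 0 in the child and the child's pid in the parent.
            return fork()
        except OSError as exc:
            # Re-raise any other error, or EAGAIN once attempts run out.
            if exc.errno != errno.EAGAIN or attempt == attempts:
                raise
            sleep(delay)
```

A caller would use the return value exactly as with os.fork(): 0 means "this is the child", anything else is the child's pid to pass to os.waitpid().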
Index: doc/overview2.sgml
===================================================================
RCS file: /cvs/src/src/winsup/doc/overview2.sgml,v
retrieving revision 1.20
diff -u -p -r1.20 overview2.sgml
--- doc/overview2.sgml 18 Sep 2010 15:58:46 -0000 1.20
+++ doc/overview2.sgml 3 Nov 2011 16:27:36 -0000
@@ -346,6 +346,60 @@ cases, stubs of each of these Win32 proc
 their exec'd Cygwin process to exit.</para>
 </sect2>
+<sect3 id='ov-hi-process-problems'>
+<title>Problems with process creation</title>
+
+<para>The semantics of <literal>fork</literal> require that a forked
+child process have <emphasis>exactly</emphasis> the same address
+space layout as its parent. However, Windows provides no native
+support for cloning address space between processes and several
+features actively undermine a reliable <literal>fork</literal>
+implementation. Three issues are especially prevalent:</para>
+
+<para><itemizedlist>
+<listitem>DLL base address collisions. Unlike *nix shared
+libraries, which use "position-independent code", Windows shared
+libraries assume a fixed base address. Whenever the hard-wired
+address ranges of two DLLs collide (which occurs quite often), the
+Windows loader must "rebase" one of them to a different
+address. However, it may not resolve collisions consistently, and
+may rebase a different dll and/or move it to a different address
+every time. Cygwin can usually compensate for this effect when it
+involves libraries opened dynamically, but collisions among
+statically-linked dlls (dependencies known at compile time) are
+resolved before <literal>cygwin1.dll</literal> initializes and
+cannot be fixed afterward. This problem can only be solved by
+removing the base address conflicts which cause the problem,
+usually using the <literal>rebaseall</literal> tool.</listitem>
+
+<listitem>Address space layout randomization (ASLR). Starting with
+Vista, Windows implements ASLR, which means that thread stacks,
+heap, memory-mapped files, and statically-linked dlls are placed
+at different (random) locations in each process. This behaviour
+interferes with a proper <literal>fork</literal>, and if an
+unmovable object (process heap or system dll) ends up at the wrong
+location, Cygwin can do nothing to compensate (though it will
+retry a few times automatically). In a 64-bit system, marking
+executables as large address-aware and rebasing dlls to high
+addresses has been reported to help, as ASLR affects only the
+lower 2GB of address space.</listitem>
+
+<listitem>DLL injection by
+<ulink url="http://cygwin.com/faq/faq.using.html#faq.using.bloda">
+BLODA</ulink>. Badly-behaved applications which
+inject dlls into other processes often manage to clobber important
+sections of the child's address space, leading to base address
+collisions which rebasing cannot fix. The only way to resolve this
+problem is to remove (usually uninstall) the offending
+app.</listitem></itemizedlist></para>
+
+<para>In summary, current Windows implementations make it
+impossible to implement a perfectly reliable fork, and occasional
+fork failures are inevitable.
+</para>
+
+</sect3>
+
 <sect2 id="ov-hi-signals"><title>Signals</title>
 <para>When a Cygwin process starts, the library starts a secondary thread for
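Again outside the patch itself: the "base address collision" wording in the UG text may be easier to follow with a toy model. The Python sketch below treats each DLL as a half-open address range [base, base + size) and reports which pairs overlap. The function name, the DLL names, and every address are hypothetical, and this is not a description of how rebaseall or the Windows loader actually work; it only illustrates the condition that rebasing is meant to remove.

```python
def find_collisions(images):
    """Return pairs of image names whose [base, base + size) ranges overlap.

    `images` is a list of (name, base, size) tuples. This is a toy model
    of preferred load addresses, not a reading of real PE headers.
    """
    ordered = sorted(images, key=lambda img: img[1])
    collisions = []
    for i, (name_a, base_a, size_a) in enumerate(ordered):
        end_a = base_a + size_a
        for name_b, base_b, _size_b in ordered[i + 1:]:
            if base_b >= end_a:
                # Sorted by base, so no later image can overlap name_a.
                break
            collisions.append((name_a, name_b))
    return collisions


# Hypothetical DLL names, bases, and sizes, invented for illustration.
EXAMPLE = [
    ("cygfoo-1.dll", 0x61000000, 0x00200000),
    ("cygbar-2.dll", 0x61100000, 0x00080000),
    ("cygbaz-3.dll", 0x62000000, 0x00040000),
]

print(find_collisions(EXAMPLE))
```

In this made-up layout only the first two ranges overlap; moving one of them so that no pair is reported is, loosely, what the UG text means by removing the base address conflicts.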