Re: Core dump on 32-bit Cygwin if program calls dlopen
Hi JonY,

On Jul 15 16:39, Corinna Vinschen wrote:
> On Jul 15 21:55, JonY wrote:
> > On 7/15/2014 21:08, Corinna Vinschen wrote:
> > > > FWIW, the problem disappears if I revert gcc-core and libgcc1 to 4.8.2-2.
> > >
> > > JonY, do you have a chance to have a look into this issue?
> >
> > Sorry, I have been busy these few weeks, but I am well aware that there
> > is a problem with one of the libgcc changes, but have yet to investigate it.
> >
> > I believe Jon Turney has looked into it somewhat.
>
> Sounds good. Thanks in advance.

Yesterday I asked my colleagues to take a stab at the issue and one of
them, DJ Delorie, came up with a libgcc patch already. It hasn't been
sent upstream yet. Can we give it a try, perhaps by creating a new
libgcc DLL, please?

Thanks,
Corinna

Index: libgcc/config/i386/cygming-crtbegin.c
===
--- libgcc/config/i386/cygming-crtbegin.c (revision 212546)
+++ libgcc/config/i386/cygming-crtbegin.c (working copy)
@@ -99,12 +99,13 @@ static EH_FRAME_SECTION_CONST char __EH_
   = { };
 
 static struct object obj;
 
 /* Handle of libgcc's DLL reference. */
 HANDLE hmod_libgcc;
+static void * (*deregister_frame_fn) (const void *) = NULL;
 #endif
 
 #if TARGET_USE_JCR_SECTION
 static void *__JCR_LIST__[]
   __attribute__ ((used, section(JCR_SECTION_NAME), aligned(4)))
   = { };
@@ -130,15 +131,20 @@ __gcc_register_frame (void)
   if (h)
     {
       /* Increasing the load-count of LIBGCC_SONAME DLL. */
       hmod_libgcc = LoadLibrary (LIBGCC_SONAME);
       register_frame_fn = (void (*) (const void *, struct object *))
                           GetProcAddress (h, "__register_frame_info");
+      deregister_frame_fn = (void* (*) (const void *))
+                            GetProcAddress (h, "__deregister_frame_info");
+    }
+  else
+    {
+      register_frame_fn = __register_frame_info;
+      deregister_frame_fn = __deregister_frame_info;
     }
-  else
-    register_frame_fn = __register_frame_info;
   if (register_frame_fn)
     register_frame_fn (__EH_FRAME_BEGIN__, &obj);
 #endif
 
 #if TARGET_USE_JCR_SECTION
   if (__JCR_LIST__[0])
@@ -158,19 +164,12 @@ __gcc_register_frame (void)
 }
 
 void
 __gcc_deregister_frame (void)
 {
 #if DWARF2_UNWIND_INFO
-  void * (*deregister_frame_fn) (const void *);
-  HANDLE h = GetModuleHandle (LIBGCC_SONAME);
-  if (h)
-    deregister_frame_fn = (void* (*) (const void *))
-                          GetProcAddress (h, "__deregister_frame_info");
-  else
-    deregister_frame_fn = __deregister_frame_info;
   if (deregister_frame_fn)
     deregister_frame_fn (__EH_FRAME_BEGIN__);
   if (hmod_libgcc)
     FreeLibrary (hmod_libgcc);
 #endif
 }

--
Corinna Vinschen
Cygwin Maintainer
Red Hat
Please, send mails regarding Cygwin to cygwin AT cygwin DOT com
RE: pipe handling errors
From: Christopher Faylor
>On Tue, Jul 15, 2014 at 06:05:27PM -0400, Christopher Faylor wrote:
>>Yes, I saw that, but I can't duplicate the problem with that command sequence.
>
>I took a stab at another change which may ameliorate the problem. Please try
>the latest snapshot.

Indeed, with this snapshot I was not able to reproduce the problem. Thank you!

Cygwin64> uname -srvmo
CYGWIN_NT-6.1 1.7.31s(0.272/5/3) 20140716 11:15:29 x86_64 Cygwin
Cygwin64>

--Ken Nellis

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Re: timeout in LDAP access
On Jul 15 18:29, Denis Excoffier wrote:
> On 2014-07-14 15:48 Corinna Vinschen wrote:
> > On Jul 14 11:51, Corinna Vinschen wrote:
> >> On Jul 12 15:39, Denis Excoffier wrote:
> >>> On 2014-07-09 12:12 Corinna Vinschen wrote:
> >>>>> I have encountered this case in real life. The domain admins have set
> >>>>> the trustPosixOffset of the secondary domain to zero. This value is
> >>>>> therefore never recorded and the cldap->open occurs again and again.
> >>>>
> >>>> Ouch. Why on earth are admins doing this? There's no way to
> >>>> work around this reliably.
> >>>
> >>> Reliably I don’t know. I’ve modified uinfo.cc in order that the special
> >>> value for td->PosixOffset is no longer 0. Taking into account that
> >>> LDAP_SERVER_DOWN is now recognized, my ‘getent passwd’ executes
> >>> gracefully in 40 minutes (instead of 60) and ‘getent group’ in 25
> >>> minutes (instead of 90). Also quicker is ‘mkpasswd -d secondary_domain’
> >>> of course. Patch attached.
> >>
> >> That won't work. It works around your immediate problem by defining
> >> a non-0 start value, no doubt about that, but it doesn't fix the
> >> underlying problem.
> >>
> >> A POSIX offset of 0 is bad. If other trusted domains have no functional
> >> POSIX offset value, but are set to 0 instead, they won't have different
> >> UID values for accounts of different domains. Two users from different
> >> domains, both with RID 1000, will both have UID 1000 in Cygwin. Also,
> >> the lower UID numbers are reserved for special accounts.
> >>
> >> There is no guarantee that there won't be a collision at some point of
> >> the 32 bit UID spectrum, but a POSIX offset of 0 will almost guarantee
> >> the collision.
> >>
> >> There are two ways to work around that.
> >>
> >> - The better solution is to inform your IT of the problem.
> >>
> >> - The not so good one is to enhance /etc/nsswitch.conf to allow
> >>   defining POSIX offsets for domains independently of the AD setting.
> >
> > I tried the third solution for the time being, which is, generating the
> > fake POSIX offset a bit differently. Fake offsets are a bit dangerous
> > in that there's no guarantee that you get a stable mapping between SID
> > and UID/GID, but it's *hopefully* a border situation we're trying to
> > work around. Please give the latest developer snapshot from
> > http://cygwin.com/snapshots/ a try.
> Tried and it works as expected. However there is a design bug. Suppose you
> have a SID from a non-primary domain (with PosixOffset=0). When you enumerate,
> you get a PosixOffset that takes into account the previously encountered
> secondary domains with PosixOffset=0, say you get
> UNIX_POSIX_OFFSET-3*0x0080

That was, actually, not a design bug but a deliberate decision. In
some way we have to work with accounts from a badly defined domain,
but for those getent isn't the problem. You don't need to enumerate
all domains except in very rare cases.

What should work, though, is to ls -l files and see the correct owner
of a file and to chmod the files. For everything else I would opt for
kicking your IT. Keep in mind that AD chooses more or less sane POSIX
offsets for trusted domains by default. Setting it to 0 is an entirely
gratuitous act by the admin. A service desk ticket might be helpful.

> Independently, I’m still not sure we have to work around IT "madness" at all.
> First, IT people might set PosixOffset to 1 for each domain and you cannot
> catch this kind of alternate madness. Also, be sure that if some user
> someday suffers from a duplicate

Yes, you can. IT has to know there's software running which needs sane
POSIX offset settings.

Alternatively we can still implement some other workaround at one point.

It occurred to me that there's another way to do that. The problem
you're mentioning above could be alleviated if the first Cygwin process
in a process tree fetches all POSIX offsets of all trusted domains right
at the start, rather than fetching the POSIX offsets only on demand by
whatever process needs it. This would slow down the startup of the
first process slightly (one LDAP request per trusted domain, but only
asking your primary DC), but this would have two advantages:

- After fetching all POSIX offsets, we could filter out all POSIX
  offsets which don't make sense. These would be set using the fake
  offset setting mechanism. "No sense" would include offsets < 0x11
  or offsets > 0xff00. If the first process in the tree

- The UID/GID values would be stable throughout the process tree.

- The UID/GID values would be stable systemwide when utilizing cygserver.

That's a bit of work, but Cygwin 1.7.31 will still come without this
AD integration code anyway, so we still have time to turn everything
upside down.

Corinna
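The collision argument and the proposed filtering step in this message can be sketched in a few lines of C. This is only an illustration under stated assumptions, not Cygwin's code: it assumes the mapping is simply UID = POSIX offset + RID (as the "RID 1000 gives UID 1000" example implies), and the bounds `MIN_SANE_OFFSET` and `MAX_SANE_OFFSET`, the fake-offset constants, and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds for a "sane" offset; not taken from the Cygwin sources. */
#define MIN_SANE_OFFSET  0x00100000u
#define MAX_SANE_OFFSET  0xff000000u
/* Hypothetical base and step used to make up ("fake") offsets. */
#define FAKE_OFFSET_BASE 0xff000000u
#define FAKE_OFFSET_STEP 0x00800000u

/* Assumed mapping: a UID is the domain's POSIX offset plus the account's RID. */
uint32_t uid_from_rid(uint32_t posix_offset, uint32_t rid)
{
    return posix_offset + rid;
}

bool offset_is_sane(uint32_t offset)
{
    return offset >= MIN_SANE_OFFSET && offset <= MAX_SANE_OFFSET;
}

/* Replace every offset outside the sane range with a distinct fake offset,
   counting down from FAKE_OFFSET_BASE.  Returns how many were replaced.
   A fake offset could still coincide with a real one; this sketch does
   not check for that. */
size_t fix_offsets(uint32_t *offsets, size_t n)
{
    size_t replaced = 0;
    for (size_t i = 0; i < n; i++) {
        if (!offset_is_sane(offsets[i])) {
            replaced++;
            offsets[i] = FAKE_OFFSET_BASE - (uint32_t) replaced * FAKE_OFFSET_STEP;
        }
    }
    return replaced;
}
```

With two domains whose offset is 0, `uid_from_rid(0, 1000)` yields the same UID for both; after `fix_offsets` each domain gets its own offset and the two UIDs differ. Doing this once, in the first process of a tree, is what would keep the made-up offsets stable for all child processes.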
Re: pipe handling errors
On Wed, Jul 16, 2014 at 12:44:25PM +, Nellis, Kenneth wrote:
>From: Christopher Faylor
>>On Tue, Jul 15, 2014 at 06:05:27PM -0400, Christopher Faylor wrote:
>>>Yes, I saw that, but I can't duplicate the problem with that command
>>>sequence.
>>
>>I took a stab at another change which may ameliorate the problem. Please try
>>the latest snapshot.
>
>Indeed, with this snapshot I was not able to reproduce the problem. Thank you!
>
>Cygwin64> uname -srvmo
>CYGWIN_NT-6.1 1.7.31s(0.272/5/3) 20140716 11:15:29 x86_64 Cygwin

Good to hear. Thanks for confirming.

For the curious, I increased the size of the signal pipe buffer. I
should have done that when I made signal pipes "nowait" to work around
problems with gdb a couple of releases ago. I also made the signal
sender retry if WriteFile returns success but the number of bytes sent
was not what was requested (since these are message-style pipes the size
should always be either zero or correct).

Increasing the size of the buffer should have been enough to fix the
problem but, when possible, I like to use two forms of protection when I
fix a bug.
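The retry described in this message (a write that reports success but sends the wrong number of bytes is attempted again) might look roughly like the sketch below. It is not the actual Cygwin change: `send_fn` is a hypothetical stand-in for the WriteFile call on a message-style pipe, and `send_packet`, `flaky_send` and the retry limit are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for WriteFile on a message-style pipe: returns
   true on success and stores the number of bytes sent in *sent. */
typedef bool (*send_fn)(const void *buf, size_t len, size_t *sent, void *ctx);

/* Send one fixed-size packet.  If the call succeeds but the byte count is
   not what was requested, resend the whole packet, up to max_tries times. */
bool send_packet(send_fn send, void *ctx, const void *pkt, size_t len,
                 int max_tries)
{
    for (int attempt = 0; attempt < max_tries; attempt++) {
        size_t sent = 0;
        if (!send(pkt, len, &sent, ctx))
            return false;       /* hard failure: leave it to the caller */
        if (sent == len)
            return true;        /* complete message delivered */
        /* "success" with a wrong count: loop and try again */
    }
    return false;
}

/* Toy sender for experimenting: reports 0 bytes sent for the first
   *ctx calls, then behaves normally. */
bool flaky_send(const void *buf, size_t len, size_t *sent, void *ctx)
{
    int *short_writes_left = ctx;
    (void) buf;
    if (*short_writes_left > 0) {
        (*short_writes_left)--;
        *sent = 0;
    } else {
        *sent = len;
    }
    return true;
}
```

Because a message-style pipe delivers a message either whole or not at all, the sketch resends the full packet rather than the unsent tail. The larger buffer makes the short-write case rare; the retry covers it when it still happens.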
Re: pipe handling errors
Christopher Faylor wrote:
> Increasing the size of the buffer should have been enough to fix the
> problem but, when possible, I like to use two forms of protection when I
> fix a bug.

I recall someone on a project here doing a bugfix with a commit log like:

-mm-dd  The Guy's Name

        Fix #n
        foo.cc (func1): Belt.
        bar.cc (func2): Suspenders.

Anyhow, I'm looking forward to trying out the fix myself. We've been
getting those errors here but not nearly as often.
Some programs (vi, ssh) crash when screen buffer height is big
Environment

CYGWIN_NT-6.1 1.7.29(0.272/5/3) 2014-04-07 13:46
Windows 7

Steps to reproduce the issue:

- With vi.exe

Execute the following bash script:

#!/bin/bash
for i in {1..123}; do
    echo -e "\033[5A\033[50C\033[0;35mhello\033[0m"
    head -n1000 /var/log/setup.log
done
vi /var/log/setup.log

vi breaks with something like:

0 [main] vi 13200 C:\cygwin64\bin\vi.exe: *** fatal error - cmalloc would have returned NULL
/4.sh: line 6: 13200 Hangup vi /var/log/setup.log

(note 4.sh is the name of the file I put the above script in and ran
it from) and leaves a vi.exe.stackdump with the following contents:

Stack trace:
Frame Function Args
001004D2D08 0018006F26E (001801E8666, 001801E8DD9, 000, 0229480)
001004D2D08 00180046E32 (022A4E8, FF00808080, 00FF00, FF00FF00FF)
001004D2D08 00180046E72 (001801E8643, 000, 000, 000)
001004D2D08 00180043983 (00076D22F7E, 000, 000, 000)
001004D2D08 0018007B781 (00FF00, FF00FF00FF, FF, 0018088)
001004D2D08 0018007B91F (000, 000, 000, 000)
001004D2D08 0018007E024 (000, 000, 000, 000)
001004D2D00 001801266FD (000, 000, 1A1311121C011615, 001802E2788)
001004D4160 0018011197B (000, 000, 1A1311121C011615, 001802E2788)
End of stack trace

- With ssh.exe

ssh to some machine (Linux in my case) and execute the following bash script:

#!/bin/bash
for i in {1..123}; do
    echo -e "\033[5A\033[50C\033[0;35mhello\033[0m"
    head -n1000 /var/log/dmesg
done
vi /var/log/dmesg

ssh breaks with:

0 [main] ssh 12464 C:\cygwin64\bin\ssh.exe: *** fatal error - cmalloc would have returned NULL

and leaves a ssh.exe.stackdump file in the current working directory
with the following contents:

Stack trace:
Frame Function Args
006000A267F 0018006F26E (001801E8666, 001801E8DD9, 000, 0226B60)
006000A267F 00180046E32 (0227BC8, FF00808080, 00FF00, FF00FF00FF)
006000A267F 00180046E72 (001801E8643, 000, 000, 0010002)
006000A267F 00180043983 (00076D22F7E, 000, 0018007B522, 000270E)
006000A267F 0018007B781 (00FF00, FF00FF00FF, FF, 0018088)
006000A267F 0018007B91F (1DC, 000, 000, 000)
006000A267F 0018007E024 (00600077990, 007, 00600077990, 007)
006000A0490 001801266FD (00100426798, 000, 000, 000)
0060006E850 0018011197B (000, 000, 000, 000)
0060006E850 0004000 (000, 000, 000, 21EF)
0060006E850 00100426798 (000, 001004928A0, 2BE9E0C5343523AB, 0228090)
0060006E850 00600068670 (001004928A0, 2BE9E0C5343523AB, 0228090, 000)
0060006E850 003FEF96000 (001004928A0, 2BE9E0C5343523AB, 0228090, 000)
End of stack trace

More information:

- For both variants, you may need to tweak the number 123 in the for
  loop above or the location of the files if you don't have these - they
  should be there, but if not, pick any file with some log-like text in
  it (a few hundred lines should be enough)

- The variants are just quick and dirty ways to reproduce - crashes
  happen in regular work in various situations (i.e. real scenarios, not
  contrived as above)

- The crashes are always reproducible

- Taking the same steps as above when running from the Cygwin terminal
  (i.e. the one that comes bundled with Cygwin itself) does not result
  in a crash
Re: Some programs (vi, ssh) crash when screen buffer height is big
A few more things to add:

- This crashes under the regular Windows console, i.e. run cmd.exe,
  then bash, then follow the above

- It also crashes under some other emulators (I actually noticed it
  under ConEmu, see
  https://code.google.com/p/conemu-maximus5/issues/detail?id=1644),
  though this is likely due to the fact the regular Windows console is
  used underneath

- When I said "screen buffer size", I mean the option in the Windows
  console - right click on the cmd.exe taskbar, Properties, Layout tab,
  Screen Buffer Size / Height

- I could reproduce with 1000. It does not seem to be reproducible with
  low values (e.g. I tried with 100, 200, 500 and it seemed to work -
  not sure if it would have broken later, but it's not always
  reproducible as it is with 1000+)
Re: Core dump on 32-bit Cygwin if program calls dlopen
On 7/16/2014 15:02, Corinna Vinschen wrote:
> Hi JonY,
>
> On Jul 15 16:39, Corinna Vinschen wrote:
>> On Jul 15 21:55, JonY wrote:
>>> On 7/15/2014 21:08, Corinna Vinschen wrote:
>>>>> FWIW, the problem disappears if I revert gcc-core and libgcc1 to 4.8.2-2.
>>>>
>>>> JonY, do you have a chance to have a look into this issue?
>>>
>>> Sorry, I have been busy these few weeks, but I am well aware that there
>>> is a problem with one of the libgcc changes, but have yet to investigate it.
>>>
>>> I believe Jon Turney has looked into it somewhat.
>>
>> Sounds good. Thanks in advance.
>
> Yesterday I asked my colleagues to take a stab at the issue and one of
> them, DJ Delorie, came up with a libgcc patch already. It hasn't been
> sent upstream yet. Can we give it a try, perhaps by creating a new
> libgcc DLL, please?
>

Thanks, I'll get to it this weekend. Should I make the new gcc an
experimental version? Or is just the libgcc binary required?
Re: Some programs (vi, ssh) crash when screen buffer height is big
On Wed, Jul 16, 2014 at 04:29:54PM -0400, sous lesquels wrote:
>A few more things to add:
>
>- This crashes under the regular Windows console, i.e. run cmd.exe,
>then bash, then follow the above

You've discovered that Cygwin has limits. You can't run it with console
windows that are too big. Sorry.

cgf
Re: timeout in LDAP access
On 2014-07-16 15:51, Corinna Vinschen wrote:
> It occurred to me that there's another way to do that. The problem
> you're mentioning above could be alleviated if the first Cygwin process
> in a process tree fetches all POSIX offsets of all trusted domains right
> at the start, rather than fetching the POSIX offsets only on demand by
> whatever process needs it. This would slow down the startup of the
> first process slightly (one LDAP request per trusted domain, but only
> asking your primary DC), but this would have two advantages:
>
> - After fetching all POSIX offsets, we could filter out all POSIX
>   offsets which don't make sense. These would be set using the fake
>   offset setting mechanism. "No sense" would include offsets < 0x11
>   or offsets > 0xff00. If the first process in the tree
>
> - The UID/GID values would be stable throughout the process tree.
>
> - The UID/GID values would be stable systemwide when utilizing cygserver.
>
> That's a bit of work, but Cygwin 1.7.31 will still come without this
> AD integration code anyway, so we still have time to turn everything
> upside down.

I buy this of course, but I’m still not convinced that we have to
work around this. After all, since I don’t care about the other domains
in my daily work, I’m not affected at all. Most users will never be
affected, I suppose. And if Cygwin happens to circumvent a null
posixOffset by providing its own, there will be even less chance of
collisions, and of collisions being reported.

But we can consider the other way, and for that I will use a comparison:
using special characters (like ‘\n’) gratuitously in the middle of
filenames is usually considered bad practice, but it is always possible
by doing ‘char *filename = "a\nb"; fopen(filename, "w")’. Now, once this
file is created, you can use ‘ls’ in the folder. Do you think ‘ls’
should respect the user's decision and display the raw \n in its output,
or try to work around it by using some substitution character (like ‘?’)
in order not to wrap at unexpected locations? The answer is that ‘ls’
substitutes by default, but also provides a full group of related
options to change this behavior (--quoting-style=WORD,
--hide-control-chars).

Of course, adding options (e.g. in nsswitch.conf) to steer the
assignment of posixOffsets to various substitutes would be useless.
I’m not even convinced that null posixOffsets should be reassigned to
non-null values.

Denis Excoffier.
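To make the ‘ls’ comparison concrete, here is a minimal sketch of the substitution idea: control characters in a file name are shown as ‘?’ so a stray newline cannot break the listing. This is not how ‘ls’ is implemented; the function name `hide_control_chars` and its buffer-based interface are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy name into out (capacity outsz bytes, including the terminator),
   replacing every control character with '?'.  Truncates if out is too
   small.  Returns out. */
char *hide_control_chars(const char *name, char *out, size_t outsz)
{
    size_t i = 0;
    if (outsz == 0)
        return out;
    for (; name[i] != '\0' && i + 1 < outsz; i++)
        out[i] = iscntrl((unsigned char) name[i]) ? '?' : name[i];
    out[i] = '\0';
    return out;
}
```

For the file name from the example, "a\nb" would be displayed as "a?b". The file itself keeps its real name; only the displayed form is substituted, which is the analogue of Cygwin presenting a made-up offset while the AD setting stays at 0.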