Re: rtorrent and recent snapshots - apparent problem with msync()
On Jan 17 19:35, Christopher Faylor wrote:
> On Thu, Jan 17, 2013 at 04:42:36PM -0500, Chris Sutcliffe wrote:
> >On 17 January 2013 06:55, Chris Sutcliffe wrote:
> >> On 17 January 2013 01:24, Christopher Faylor wrote:
> >>> Is there a stackdump file? If not, I guess if you could make a strace
> >>> available for download that could be useful.
> >>
> >> The current snapshots do not produce a stackdump. I will execute an
> >> strace this evening when I get home and provide it.
> >
> >I've uploaded the strace for this issue here:
> >
> >http://dl.dropbox.com/u/5530441/cygwin/rtorrent.strace
> >
> >Please let me know if there is anything else I can do to help.
>
> Thanks. That helped.
>
> msync() is failing with an EACCES errno. That translates to a windows
> error: ERROR_LOCK_VIOLATION. According to the ancient wisdom of google,
> it is not uncommon for the FlushViewOfFile() function to return with
> this error in some cases.
>
> I added a retry to the function fhandler_disk_file::msync and tried
> running rtorrent to download a debian iso (which seemed to be what you
> were doing). I could duplicate your problem before adding the retry but
> I don't see it now.
>
> The command I was using:
>
> rtorrent
> http://cdimage.debian.org/debian-cd/6.0.6/i386/bt-cd/debian-6.0.6-i386-CD-1.iso.torrent
>
> I'm generating a snapshot now. Please give it a try when it shows up.
>
> And, Corinna, please if the change I made to your function is wrong or
> you just don't like my variable names or comments please feel free to
> expunge what I did with extreme prejudice.

Looks good to me. I'm just wondering. I have a similar piece of code
in the rename function in syscalls.cc, lines 2342ff. This loop also
allows signals to break the loop. Maybe we should do the same here?

I just read the Linux msync man page(*) as well as the MSDN
FlushViewOfFile man page(**). Looks like this function is missing a
bit of functionality. Right now msync only calls FlushViewOfFile.
Per MSDN this is equivalent to msync called with the MS_ASYNC flag.
If the MS_SYNC flag is given, the function should also call
FlushFileBuffers. I'll fix that.

Also, Linux msync is allowed to return with EBUSY if "MS_INVALIDATE was
specified in flags, and a memory lock exists for the specified address
range." That seems to match our situation... except that rtorrent
doesn't use the MS_INVALIDATE flag. Either way, maybe we should
translate ERROR_LOCK_VIOLATION to EBUSY?

(*) http://www.kernel.org/doc/man-pages/online/pages/man2/msync.2.html
(**) http://msdn.microsoft.com/en-us/library/windows/desktop/aa366563%28v=vs.85%29.aspx

> I can't explain what in the newer snapshots would cause a difference
> in behavior other than the fact that they were being built with a
> newer compiler and a revamped configure script.

I tried with my gcc 4.5.3 build and I can't reproduce the problem.
Still, it's just calls to OS functions. There should be no compiler
induced difference in the error values returned from OS functions.
Except your gcc produces faster code than Windows allows ;)

Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
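For readers following the flag discussion in the message above, here is a minimal sketch of the decision Corinna describes: the view is flushed for every msync call, and the file buffers are flushed as well only when MS_SYNC is set. The struct and function names are invented for this illustration and are not taken from the Cygwin sources. The sketch models the decision only; it does not call FlushViewOfFile or FlushFileBuffers.

```c
#include <stdbool.h>
#include <sys/mman.h>   /* MS_ASYNC, MS_SYNC, MS_INVALIDATE */

/* Hypothetical description of which flush steps an msync call needs. */
struct flush_plan {
    bool flush_view;          /* step corresponding to FlushViewOfFile  */
    bool flush_file_buffers;  /* step corresponding to FlushFileBuffers */
};

/* Derive the plan from the msync flags, as described in the thread:
   flushing the view alone behaves like MS_ASYNC, and MS_SYNC
   additionally requires the file buffers to be flushed. */
static struct flush_plan plan_for_flags(int flags)
{
    struct flush_plan plan;

    plan.flush_view = true;
    plan.flush_file_buffers = (flags & MS_SYNC) != 0;
    return plan;
}
```

With this model, an MS_ASYNC caller such as rtorrent never reaches the second step, which is consistent with the later remark in the thread that the MS_SYNC change cannot affect it.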
Re: Binutils objcopy bug (was Re: rebase segfault)
On 1/16/2013 1:35 PM, Corinna Vinschen wrote:
>
> As far as I can tell it's an objcopy bug.
>
> The stripped version of the DLL has a normal relocation information
> which at one point ends in a NULL IMAGE_BASE_RELOCATION record, as
> expected. After calling `objcopy --add-gnu-debuglink', the relocation
> information is supposed to be the same as before, since the relocatable
> file content didn't change.
>
> Nevertheless, when stepping through the relocator code in rebase, it
> turns out that the former NULL IMAGE_BASE_RELOCATION record does not
> contain only 0 values anymore. Rather, it has been overwritten with
> some random(?) non-0 values, which rebase correctly interprets as the
> start of the next IMAGE_BASE_RELOCATION array. So rebase blunders
> along, thus either just SEGVing, if everything goes well, or, worst
> case, overwriting formerly correct information in the file with
> arbitrary data.
>
> This is a serious bug in objcopy in the current binutils. Given that
> cygport creates the debug info automatically, we might end up with
> spuriously broken DLLs in the distro.
>
> I checked with objcopy from the older binutils 2.51.53-2, and the
> problem did not show up. I also built the latest binutils release
> 2.23.1 and the problem also doesn't show, so we probably can get away
> with just a black eye by updating binutils to 2.23.1. Chris?
>
>
> Corinna
>

Chris,
any news?

Marco
Re: rtorrent and recent snapshots - apparent problem with msync()
On Fri, Jan 18, 2013 at 10:32:05AM +0100, Corinna Vinschen wrote:
>On Jan 17 19:35, Christopher Faylor wrote:
>> And, Corinna, please if the change I made to your function is wrong or
>> you just don't like my variable names or comments please feel free to
>> expunge what I did with extreme prejudice.
>
>Looks good to me. I'm just wondering. I have a similar piece of code
>in the rename function in syscalls.cc, lines 2342ff. This loop also
>allows signals to break the loop. Maybe we should do the same here?
>
>I just read the Linux msync man page(*) as well as the MSDN
>FlushViewOfFile man page(**). Looks like this function is missing a
>bit of functionality. Right now msync only calls FlushViewOfFile.
>Per MSDN this is equivalent to msync called with the MS_ASYNC flag.
>If the MS_SYNC flag is given, the function should also call FlushFileBuffers.
>I'll fix that.
>
>Also, Linux msync is allowed to return with EBUSY if "MS_INVALIDATE was
>specified in flags, and a memory lock exists for the specified address
>range." That seems to match our situation... except that rtorrent
>doesn't use the MS_INVALIDATE flag. Either way, maybe we should
>translate ERROR_LOCK_VIOLATION to EBUSY?
>
>(*) http://www.kernel.org/doc/man-pages/online/pages/man2/msync.2.html
>(**)
>http://msdn.microsoft.com/en-us/library/windows/desktop/aa366563%28v=vs.85%29.aspx

That sounds right to me. EACCES didn't seem like the right translation
here.

>> I can't explain what in the newer snapshots would cause a difference
>> in behavior other than the fact that they were being built with a
>> newer compiler and a revamped configure script.
>
>I tried with my gcc 4.5.3 build and I can't reproduce the problem.
>Still, it's just calls to OS functions. There should be no compiler
>induced difference in the error values returned from OS functions.
>Except your gcc produces faster code than Windows allows ;)

Yeah, my compiler setup is great. Now if I could just use it to compile
my packages, I'd be very happy. So far it only seems to work right with
Cygwin. Other stuff, like gdb and binutils, is currently problematic.

I saw that you made another change to this function. Is it possible that
this might actually fix the "rtorrent problem"?

cgf
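The translation being agreed on above can be sketched as a small lookup. This is an illustration only and is not the table in Cygwin's errno.cc. The Windows error values are copied in as local constants (ERROR_ACCESS_DENIED is 5 and ERROR_LOCK_VIOLATION is 33 in the Windows headers) so that the sketch builds without windows.h. The function name and the EINVAL fallback are assumptions made for this example.

```c
#include <errno.h>

/* Local copies of two Windows error codes, so that no Windows header
   is needed. The names are prefixed to mark them as stand-ins. */
enum {
    WIN_ERROR_ACCESS_DENIED  = 5,
    WIN_ERROR_LOCK_VIOLATION = 33
};

/* Hypothetical translation helper: maps a Windows error code to an errno
   value, using the mapping proposed in the thread for lock violations. */
static int errno_from_winerror(unsigned long code)
{
    switch (code) {
    case WIN_ERROR_ACCESS_DENIED:
        return EACCES;
    case WIN_ERROR_LOCK_VIOLATION:
        return EBUSY;   /* proposed here; the thread reports EACCES before */
    default:
        return EINVAL;  /* placeholder fallback chosen for this sketch */
    }
}
```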
Re: Binutils objcopy bug (was Re: rebase segfault)
On Fri, Jan 18, 2013 at 04:34:25PM +0100, marco atzeri wrote:
>On 1/16/2013 1:35 PM, Corinna Vinschen wrote:
>>
>> As far as I can tell it's an objcopy bug.
>>
>> The stripped version of the DLL has a normal relocation information
>> which at one point ends in a NULL IMAGE_BASE_RELOCATION record, as
>> expected. After calling `objcopy --add-gnu-debuglink', the relocation
>> information is supposed to be the same as before, since the relocatable
>> file content didn't change.
>>
>> Nevertheless, when stepping through the relocator code in rebase, it
>> turns out that the former NULL IMAGE_BASE_RELOCATION record does not
>> contain only 0 values anymore. Rather, it has been overwritten with
>> some random(?) non-0 values, which rebase correctly interprets as the
>> start of the next IMAGE_BASE_RELOCATION array. So rebase blunders
>> along, thus either just SEGVing, if everything goes well, or, worst
>> case, overwriting formerly correct information in the file with
>> arbitrary data.
>>
>> This is a serious bug in objcopy in the current binutils. Given that
>> cygport creates the debug info automatically, we might end up with
>> spuriously broken DLLs in the distro.
>>
>> I checked with objcopy from the older binutils 2.51.53-2, and the
>> problem did not show up. I also built the latest binutils release
>> 2.23.1 and the problem also doesn't show, so we probably can get away
>> with just a black eye by updating binutils to 2.23.1. Chris?
>>
>>
>> Corinna
>>
>
>Chris,
>any news ?

Nope.

cgf
Re: rtorrent and recent snapshots - apparent problem with msync()
On Jan 18 10:43, Christopher Faylor wrote:
> On Fri, Jan 18, 2013 at 10:32:05AM +0100, Corinna Vinschen wrote:
> >On Jan 17 19:35, Christopher Faylor wrote:
> >> And, Corinna, please if the change I made to your function is wrong or
> >> you just don't like my variable names or comments please feel free to
> >> expunge what I did with extreme prejudice.
> >
> >Looks good to me. I'm just wondering. I have a similar piece of code
> >in the rename function in syscalls.cc, lines 2342ff. This loop also
> >allows signals to break the loop. Maybe we should do the same here?
> >
> >I just read the Linux msync man page(*) as well as the MSDN
> >FlushViewOfFile man page(**). Looks like this function is missing a
> >bit of functionality. Right now msync only calls FlushViewOfFile.
> >Per MSDN this is equivalent to msync called with the MS_ASYNC flag.
> >If the MS_SYNC flag is given, the function should also call FlushFileBuffers.
> >I'll fix that.
> >
> >Also, Linux msync is allowed to return with EBUSY if "MS_INVALIDATE was
> >specified in flags, and a memory lock exists for the specified address
> >range." That seems to match our situation... except that rtorrent
> >doesn't use the MS_INVALIDATE flag. Either way, maybe we should
> >translate ERROR_LOCK_VIOLATION to EBUSY?
> >
> >(*) http://www.kernel.org/doc/man-pages/online/pages/man2/msync.2.html
> >(**)
> >http://msdn.microsoft.com/en-us/library/windows/desktop/aa366563%28v=vs.85%29.aspx
>
> That sounds right to me. EACCES didn't seem like the right translation
> here.

Ok, I'll fix that in errno.cc.

> >> I can't explain what in the newer snapshots would cause a difference
> >> in behavior other than the fact that they were being built with a
> >> newer compiler and a revamped configure script.
> >
> >I tried with my gcc 4.5.3 build and I can't reproduce the problem.
> >Still, it's just calls to OS functions. There should be no compiler
> >induced difference in the error values returned from OS functions.
> >Except your gcc produces faster code than Windows allows ;)
>
> Yeah, my compiler setup is great. Now if I could just use it to compile
> my packages, I'd be very happy. So far it only seems to work right with
> Cygwin. Other stuff, like gdb and binutils are currently problematic.

Current binutils CVS HEAD doesn't build on Cygwin(*). The 2.23.1
version should work, though.

> I saw that you made another change to this function. Is it possible that
> this might actually fix the "rtorrent problem"?

No. It only adds the MS_SYNC handling. rtorrent uses MS_ASYNC.

I think there's basically no way around the loop. I'm just still
wondering if we shouldn't add a cygwait() call to handle signals
during the wait time.


Corinna

(*) http://sourceware.org/ml/binutils/2013-01/msg00303.html

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
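The loop discussed above, including the open question of letting a signal break it, can be sketched as follows. This is a generic model under stated assumptions and not the code in fhandler_disk_file::msync: the flush and the signal check are passed in as callbacks, every name is hypothetical, and the brief wait that a real implementation would perform between attempts (for instance through cygwait, as suggested) is left as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callbacks: flush returns true on success and false on a
   transient failure; interrupted returns true if a signal is pending. */
typedef bool (*flush_fn)(void *ctx);
typedef bool (*interrupted_fn)(void *ctx);

enum retry_result { RETRY_OK, RETRY_EXHAUSTED, RETRY_INTERRUPTED };

/* Retry a flush a bounded number of times, letting a pending signal
   break the loop between attempts. */
static enum retry_result flush_with_retry(flush_fn flush,
                                          interrupted_fn interrupted,
                                          void *ctx, int max_attempts)
{
    assert(max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (flush(ctx))
            return RETRY_OK;
        if (interrupted != NULL && interrupted(ctx))
            return RETRY_INTERRUPTED;
        /* A real implementation would wait briefly here before retrying. */
    }
    return RETRY_EXHAUSTED;
}

/* Test double: fails a fixed number of times, then succeeds. */
struct fake_state {
    int failures_left;
    bool signal_pending;
};

static bool fake_flush(void *ctx)
{
    struct fake_state *state = ctx;

    if (state->failures_left > 0) {
        state->failures_left--;
        return false;
    }
    return true;
}

static bool fake_interrupted(void *ctx)
{
    struct fake_state *state = ctx;

    return state->signal_pending;
}
```

Keeping the attempt count bounded means a persistent lock violation still surfaces as an error rather than hanging the caller.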
Re: rtorrent and recent snapshots - apparent problem with msync()
On Fri, Jan 18, 2013 at 05:07:42PM +0100, Corinna Vinschen wrote:
>Current binutils CVS HEAD doesn't build on Cygwin(*). The 2.23.1
>version should work, though.

I have a number of weird problems building binutils that will take some
time to track down.

cgf
Re: Intermittent failures with ctrl-c
On 01/16/2013 05:23 PM, Christopher Faylor wrote:
> On Wed, Jan 16, 2013 at 03:18:47PM -0500, Tom Honermann wrote:
> I managed to duplicate a hang by changing your .bat file to use "sleep
> 2" rather than false. I'm investigating now.

I noticed that you checked in some additional changes on the 16th that
look related to this, so I tested again with today's snapshot (20130118).

I was still able to produce hangs using the same test case. The
symptoms are slightly different than I had seen previously. bash hung 2
out of the ~60 times I interrupted the test. No error messages were
displayed this time. Upon pressing ctrl-c, bash hung for 60 seconds. I
was then greeted with the "Terminate batch job" prompt and responding
'Y' terminated the process tree as expected. Pressing ctrl-c while bash
was hung for that 60 seconds appeared to have no effect.

My apologies for this distraction if you don't yet expect this to be fixed.

Tom.
Re: rtorrent and recent snapshots - apparent problem with msync()
Greetings, Corinna Vinschen!

>> I saw that you made another change to this function. Is it possible that
>> this might actually fix the "rtorrent problem"?

> No. It only adds the MS_SYNC handling. rtorrent uses MS_ASYNC.

That made me think... If rtorrent uses MS_ASYNC, shouldn't *rtorrent* be
prepared for consequences? Instead of you trying to satisfy its
expectations?

> I think there's basically no way around the loop. I'm just still
> wondering if we shouldn't add a cygwait() call to handle signals
> during the wait time.


--
WBR,
Andrey Repin (anrdae...@freemail.ru) 19.01.2013, <00:15>

Sorry for my terrible english...
Re: rtorrent and recent snapshots - apparent problem with msync()
On Sat, Jan 19, 2013 at 12:18:39AM +0400, Andrey Repin wrote:
>Greetings, Corinna Vinschen!
>
>>> I saw that you made another change to this function. Is it possible that
>>> this might actually fix the "rtorrent problem"?
>
>> No. It only adds the MS_SYNC handling. rtorrent uses MS_ASYNC.
>
>That made me think... If rtorrent uses MS_ASYNC, shouldn't *rtorrent* be
>prepared for consequences? Instead of you trying to satisfy its
>expectations?

It does seem that way.

cgf
Re: rtorrent and recent snapshots - apparent problem with msync()
On Fri, Jan 18, 2013 at 04:10:25PM -0500, Christopher Faylor wrote:
>On Sat, Jan 19, 2013 at 12:18:39AM +0400, Andrey Repin wrote:
>>Greetings, Corinna Vinschen!
>>
>>>> I saw that you made another change to this function. Is it possible that
>>>> this might actually fix the "rtorrent problem"?
>>
>>> No. It only adds the MS_SYNC handling. rtorrent uses MS_ASYNC.
>>
>>That made me think... If rtorrent uses MS_ASYNC, shouldn't *rtorrent* be
>>prepared for consequences? Instead of you trying to satisfy its
>>expectations?
>
>It does seem that way.

Actually, it isn't that clear. It seems like msync is failing in the
MS_ASYNC case when it shouldn't be, i.e., rtorrent is within its rights
to expect the operation to succeed.
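For the caller-side view raised in this exchange, here is a hedged sketch of how an application could tolerate a transient msync failure if it chose to. The thread does not say how rtorrent handles the error, so this is not a description of rtorrent. The helper name is hypothetical, and treating EBUSY and EACCES as retryable is an assumption drawn only from the errno values discussed in this thread.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical application-side helper: issue an asynchronous msync and
   retry a bounded number of times if it fails with an errno that this
   sketch assumes to be transient. Any other error is returned at once.
   Returns 0 on success, or -1 with errno set by the last msync call. */
static int msync_async_tolerant(void *addr, size_t len, int max_attempts)
{
    int rc = -1;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rc = msync(addr, len, MS_ASYNC);
        if (rc == 0)
            break;
        if (errno != EBUSY && errno != EACCES)
            break;  /* not assumed transient: report it to the caller */
    }
    return rc;
}
```

As the last message points out, an MS_ASYNC caller can reasonably expect success, so a wrapper like this is a defensive measure and does not replace a fix inside msync itself.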
Re: Intermittent failures with ctrl-c
On Fri, Jan 18, 2013 at 03:11:03PM -0500, Tom Honermann wrote:
>On 01/16/2013 05:23 PM, Christopher Faylor wrote:
>> On Wed, Jan 16, 2013 at 03:18:47PM -0500, Tom Honermann wrote:
>> I managed to duplicate a hang by changing your .bat file to use "sleep
>> 2" rather than false. I'm investigating now.
>
>I noticed that you checked in some additional changes on the 16th that
>look related to this, so I tested again with today's snapshot (20130118).

I thought I sent a "try a snapshot" but I must have been hallucinating
again.

>I was still able to produce hangs using the same test case. The
>symptoms are slightly different than I had seen previously. bash hung 2
>out of the ~60 times I interrupted the test. No error messages were
>displayed this time. Upon pressing ctrl-c, bash hung for 60 seconds. I
>was then greeted with the "Terminate batch job" prompt and responding
>'Y' terminated the process tree as expected. Pressing ctrl-c while bash
>was hung for that 60 seconds appeared to have no effect.

The hang should be fixed in the upcoming snapshot.

cgf