after upgrading to go 1.9 (a 50% reduction in crashes) and finding a data race we
didn't see in testing, we're still hunting down about 1 crash per 67 million hours
of runtime.

needless to say, the economical thing to do is to ignore the problem, but it sure
bugs (!) me.  the correct number of crashes is 0.

- erik

On Tuesday, December 5, 2017 at 4:37:46 PM UTC-8, Erik Quanstrom wrote:
>
> the failure rate is high enough to be motivating.  :-)  i now have go 1.9
> working for production builds.  i will report back with results as soon as i
> have them.
>
> - erik
>
> On Monday, December 4, 2017 at 6:16:29 PM UTC-8, Ian Lance Taylor wrote:
>>
>> On Sun, Dec 3, 2017 at 6:42 PM,  <quan...@gmail.com> wrote: 
>> > 
>> > i'm running go 1.8.3 on linux.  about 1 in ~10 billion calls (spread across
>> > many machines), i get a backtrace that looks like the following:
>> > 
>> > panic: runtime error: invalid memory address or nil pointer dereference 
>> > [signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x4c3406] 
>> > goroutine 441026 [running]: 
>> > bufio.(*Reader).ReadSlice(0x0, 0xa, 0x30, 0xc420256ec8, 0xc4201b9ef0, 0xc420315580, 0x0)
>> >     /usr/lib/golang/src/bufio/bufio.go:316 +0x26
>> > bufio.(*Reader).ReadLine(0x0, 0xa1f378, 0x30, 0xc4201b9ef0, 0x30, 0xc4201b9ef0, 0x30)
>> >     /usr/lib/golang/src/bufio/bufio.go:367 +0x37
>> > fakexpkg.(*Xpkg).tableParse(0xffffffffffffffff, 0x0, 0x0)    <----- HERE
>> >     /builddir/build/BUILD/posthoc-1.1/src/fakexpkg/xpkg.go:175 +0x86
>> > created by fakexpkg.(*Xpkg).List
>> >     /builddir/build/BUILD/posthoc-1.1/src/fakexpkg/xpkg.go:225 +0x2ac
>> > 
>> > the code calling tableParse looks something like this.  no references to
>> > anything table at the end are kept in other places.
>> > 
>> > ret := make(chan []*Info, 1) 
>> > go x.tableParse(bufio.NewReader(outpipe), ret) 
>> > table := <-ret 
>> > 
>> > other c programs that i run on these same machines do not core dump at all.
>> >
>> > since there is no use of the unsafe package anywhere in this program, i'm
>> > confused as to how the Xpkg receiver could be -1 unless something has gone
>> > wrong with the runtime.  i feel like i must be missing something though,
>> > since it's never the layer below.
>> >
>> > does anyone have any idea what's going on here, or some hints on debugging
>> > this?
>>
>> Assuming that the race detector doesn't report any problems, this is a 
>> strange example of memory corruption: not only is the receiver pointer 
>> invalid, the two arguments to the function are nil even though the 
>> code fragment shows that that is not possible.  From your description 
>> the problem is very very rare.  How much time and energy do you have 
>> for experimentation?  If you have some, the first step is certainly to 
>> try using Go 1.9, as various bugs have been fixed.  If it still 
>> happens the same way, the next step is to try to reduce the program to 
>> a self-contained example that you can share.  At a guess given that 
>> three words appear to be corrupt, it may have something to do with the 
>> way that goroutine arguments are saved.  I don't see any way that 
>> could fail, but then this is a very rare problem. 
>>
>> Ian 
>>
>
