Disheartening, but not unfamiliar - very reminiscent of Java days. I would 
highly encourage removing any dependencies that use cgo or unsafe and moving 
to pure Go. That seemed to be the only way to tame this sort of error in the 
wild in Java land. 
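
A quick way to audit for that (a rough sketch using the standard "go list" 
tooling; the template fields are documented under "go help list", and note 
that some standard-library packages import unsafe too - the third-party hits 
are the interesting ones): 

  # Transitive dependencies that contain cgo files:
  go list -deps -f '{{if .CgoFiles}}{{.ImportPath}}{{end}}' ./... | grep .

  # Transitive dependencies that import "unsafe":
  go list -deps -f '{{range .Imports}}{{if eq . "unsafe"}}{{$.ImportPath}}{{end}}{{end}}' ./... | grep .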

Also, have you done stress tests under the race detector? I'm betting that the 
vast majority of these errors stem from incorrect concurrency. 
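
Concretely, something along these lines (a minimal sketch; the repeat count 
and GOMAXPROCS values are arbitrary, so tune them to your suite): 

  # Run the test suite under the race detector, repeating each test
  # and varying GOMAXPROCS to shake out rare interleavings:
  go test -race -count=10 -cpu=1,4,8 ./...

  # Optionally make the race detector abort on the first report so
  # it can't scroll past unnoticed:
  GORACE="halt_on_error=1" go test -race ./...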

> On Aug 25, 2019, at 3:08 AM, Jakob Borg <ja...@kastelo.net> wrote:
> 
> Hi all,
> 
> We develop an open source program for consumers that has a reasonably large 
> usage within its niche, on a mix of operating systems and platforms. Recently 
> we enabled crash reporting to get panic traces back from cooperating users. 
> With that we've discovered a bunch of panics of our own creation, plus a lot 
> of noise in terms of fatal errors outside of our control -- typically users 
> running out of memory or threads.
> 
> There remains a lot of "unexplained" oddness, however, some of which I'm sure 
> is attributable to hardware errors (bad RAM/CPU/etc.). It's hard to be sure 
> either way, but we get a lot of stacks. The list below is a (probably 
> non-exhaustive) selection of crashes from the last week or so that strike me 
> as odd:
> fatal error: defer on system stack
> fatal error: fatal error: unexpected signal during runtime execution
> fatal error: found bad pointer in Go heap (incorrect use of unsafe or cgo?) 
> (this could be ours; though we have no cgo, I'm sure there is unsafe deep in 
> the dependencies)
> fatal error: gc: unswept span
> fatal error: malloc deadlock
> fatal error: mSpanList.insertBack
> fatal error: non in-use span in unswept list
> fatal error: out of memory allocating heap arena metadata (I guess this is 
> just a niche case of OOM)
> fatal error: runtime: stack split at bad time
> fatal error: runtime.newosproc (out of threads?)
> fatal error: runtime·unlock: lock count
> fatal error: s.allocCount != s.nelems && freeIndex == s.nelems
> fatal error: slice bounds out of range (deep in the malloc code)
> fatal error: stopm holding locks
> fatal error: sweep increased allocation count
> fatal error: sync: inconsistent mutex state
> fatal error: wirep: invalid p state
> panic: sync: inconsistent mutex state
> I'm not going to spend any energy hunting these down or pestering anyone with 
> bug reports, especially as I have no idea who the originating user is and no 
> way to communicate with them or experiment. :) However, if any of you out 
> there thinks "Huh? That GC error should never happen, I wonder what's going 
> on?", I'd be happy to forward a bunch of crashes for that particular crash 
> or provide access to the crash database for searching.
> 
> (A limitation of our crash reporting is that output prior to the panic/fatal 
> error is trimmed as potentially sensitive user data. This means we miss the 
> description that some fatal-error crashes print before the "fatal error:" 
> line. We might fix this at some point.)
> 
> //jb
> 

