Still errors I'm afraid :/

On Thursday, 21 March 2019 21:54:59 UTC-7, Ian Lance Taylor wrote:
>
> On Thu, Mar 21, 2019 at 9:39 PM Tom <hype...@gmail.com> wrote:
> >
> > I've been stuck on this for a few days so thought I would ask the brains trust.
> >
> > TL;DR: When I have native amd64 instructions mutating (updating the len + values of a []uint64) a slice, I experience spurious & random memory corruption when under heavy load (# runnable goroutines > MAXPROCS, doing the same thing continuously), and only when the GC is enabled. Any debugging ideas or things I should look into?
> >
> > Background:
> >
> > I'm calling into go assembly with a few pointers to slices (*[]uint64), and that assembly is mutating them (reading/writing values, updating len within capacity). I'm experiencing random memory corruption, but I can only trigger it in the following scenarios:
> >
> > Heavy load - Doing a zillion things at once (specifically running all my test cases in parallel) and maxing out my machine.
> > Parallelism - A panic due to memory corruption happens faster if --parallel is set higher, and never if not in parallel.
> > GC - The panic never happens if the GC is disabled (of course, the test process eventually runs out of memory).
> >
> > The memory corruption varies, but usually results in an element of an unrelated slice being zeroed, the len of an unrelated slice being zeroed, or (less likely) a segfault.
> >
> > Tested on go1.11.2 and go1.12.1. I can only trigger this if I run all my test cases at once (with --count at 8000 or so & using t.Parallel()). Running things serially or individually yields the correct behaviour.
> >
> > The assembly in question looks like this:
> >
> > TEXT ·jitcall(SB),NOSPLIT|NOFRAME,$0-24
> >         GO_ARGS
> >         MOVQ asm+0(FP),     AX  // Load the address of the assembly section.
> >         MOVQ stack+8(FP),   R10 // Load the address of the 1st slice.
> >         MOVQ locals+16(FP), R11 // Load the address of the 2nd slice.
> >         MOVQ 0(AX),         AX  // Dereference pointer to native code.
> >         JMP AX                  // Jump to native code.
> >
> > And slice manipulation like this (this is a 'pop'):
> >
> >         MOVQ r13, [r10+8]       // Load the length of the slice.
> >         DECQ r13                // Decrement the len (I can guarantee this will never underflow).
> >         MOVQ r12, [r10]         // Load the 0th element address.
> >         LEAQ r12, [r12 + r13*8] // Compute the address of the last element.
> >         MOVQ reg, [r12]         // Load the element to reg.
> >         MOVQ [r10+8], r13       // Write the len back.
> >
> > or 'push' like this (note: cap is always large enough for any pushes) ...
> >
> >         MOVQ r12, [r10]         // Load the 0th element address.
> >         MOVQ r13, [r10+8]       // Load the len.
> >         LEAQ r12, [r12 + r13*8] // Compute the address of the last element + 1.
> >         INCQ r13                // Increment the len.
> >         MOVQ [r10+8], r13       // Save the len.
> >         MOVQ [r12], reg         // Write the new element.
> >
> >
> > I acknowledge that calling into code like this is unsupported, but I struggle to understand how such corruption can happen, and having stared at it for a few days, I am frankly stumped. I mean, even if non-cooperative preemption was in these versions of Go I would expect the GC to abort when it can't find the stack maps for my RIP value. With no GC safe points in my native assembly, I don't see how the GC could interfere (yet the issue disappears with the GC off??).
> >
> > Questions:
> >
> > Any ideas what I'm doing wrong?
> > Any ideas how I can trace this from the application side and also the runtime side? I've tried schedtrace and the like, but the output didn't appear useful or correlated to the crashes.
> > Any suggestions for assumptions I might have missed and should write tests / guards for?
>
> See whether it helps to add runtime.KeepAlive calls for the slices and
> any other pointers that you pass to the assembly code. If that fixes
> the problem, then it's a liveness problem.
>
> Ian
>
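
For anyone following the thread, the runtime.KeepAlive suggestion quoted above might be tried roughly like this. This is only a minimal sketch, not the actual code from this thread: jitcallStub is a hypothetical pure-Go stand-in for the assembly jitcall routine (so the example builds without a .s file), and run is a made-up caller. The only point is where the KeepAlive calls sit: after the call, on every pointer that was handed over.

```go
package main

import (
	"fmt"
	"runtime"
)

// jitcallStub is a hypothetical stand-in for the assembly jitcall in the
// quoted post. It pops the last element of *stack and pushes it onto
// *locals, changing only len (the caller guarantees enough capacity).
func jitcallStub(stack, locals *[]uint64) {
	s := *stack
	v := s[len(s)-1]
	*stack = s[:len(s)-1]
	*locals = append(*locals, v)
}

// run is a hypothetical caller showing the KeepAlive placement.
func run() (stackLen, localsLen int, moved uint64) {
	stack := make([]uint64, 0, 16)
	locals := make([]uint64, 0, 16)
	stack = append(stack, 7, 42)

	sp, lp := &stack, &locals
	jitcallStub(sp, lp)

	// Mark both pointers as live until at least this point, so the
	// compiler's liveness analysis cannot treat them as dead while the
	// callee is still using them.
	runtime.KeepAlive(sp)
	runtime.KeepAlive(lp)

	return len(stack), len(locals), locals[0]
}

func main() {
	fmt.Println(run())
}
```

If the corruption goes away with calls like these in the real caller, that would point at a liveness problem as described in the quoted reply; if it does not (as reported at the top of this message), the cause is presumably elsewhere.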