I'm not making any function calls in the assembly, just writing to memory addresses that represent the elements/len of the slice. I've also tried using LockOSThread() to see if that made any difference; alas, it does not.
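To be concrete, the pinning attempt looked roughly like this (a sketch only: runJIT is an invented name, and jitcall's Go declaration is inferred from the $0-24 frame in the assembly quoted below):

    import (
        "runtime"
        "unsafe"
    )

    // jitcall is implemented in assembly (quoted below): asm points at a
    // word holding the native code's entry address (hence the extra
    // dereference in the assembly), and stack/locals are pointers to the
    // two slice headers the native code mutates.
    func jitcall(asm unsafe.Pointer, stack, locals *[]uint64)

    // runJIT sketches the pinning attempt: lock the goroutine to its OS
    // thread for the duration of the native call.
    func runJIT(code unsafe.Pointer, stack, locals *[]uint64) {
        runtime.LockOSThread()
        defer runtime.UnlockOSThread()
        jitcall(code, stack, locals)
    }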
On Friday, March 22, 2019 at 4:59:30 AM UTC-7, Robert Engels wrote:

> Are you making any calls modifying the len that would allow GC to occur,
> or change stack size? You might need to pin the goroutine so that the
> operation you are performing is "atomic" with respect to those.
>
> This also sounds very scary if the Go runtime ever had a compacting
> collector.
>
> On Mar 22, 2019, at 12:27 AM, Tom <hype...@gmail.com> wrote:
>
> The allocation is in Go, and assembly never modifies the size of the
> backing array. Assembly only ever modifies len, which is the len of the
> slice and not the backing array.
>
> On Thursday, 21 March 2019 22:18:29 UTC-7, Tamás Gulácsi wrote:
>>
>> On Friday, March 22, 2019 at 6:06:06 AM UTC+1, Tom wrote:
>>>
>>> Still errors I'm afraid :/
>>>
>>> On Thursday, 21 March 2019 21:54:59 UTC-7, Ian Lance Taylor wrote:
>>>>
>>>> On Thu, Mar 21, 2019 at 9:39 PM Tom <hype...@gmail.com> wrote:
>>>> >
>>>> > I've been stuck on this for a few days, so I thought I would ask the
>>>> > brains trust.
>>>> >
>>>> > TL;DR: When I have native amd64 instructions mutating a slice
>>>> > (updating the len + values of a []uint64), I experience spurious and
>>>> > random memory corruption when under heavy load (# runnable goroutines
>>>> > > MAXPROCS, doing the same thing continuously), and only when the GC
>>>> > is enabled. Any debugging ideas or things I should look into?
>>>> >
>>>> > Background:
>>>> >
>>>> > I'm calling into Go assembly with a few pointers to slices
>>>> > (*[]uint64), and that assembly is mutating them (reading/writing
>>>> > values, updating len within capacity). I'm experiencing random memory
>>>> > corruption, but I can only trigger it in the following scenarios:
>>>> >
>>>> > - Heavy load - doing a zillion things at once (specifically running
>>>> >   all my test cases in parallel) and maxing out my machine.
>>>> > - Parallelism - a panic due to memory corruption happens faster if
>>>> >   --parallel is set higher, and never if not in parallel.
>>>> > - GC - the panic never happens if the GC is disabled (of course, the
>>>> >   test process eventually runs out of memory).
>>>> >
>>>> > The memory corruption varies, but usually results in an element of an
>>>> > unrelated slice being zeroed, the len of an unrelated slice being
>>>> > zeroed, or (less likely) a segfault.
>>>> >
>>>> > Tested on go1.11.2 and go1.12.1. I can only trigger this if I run all
>>>> > my test cases at once (with --count at 8000 or so and using
>>>> > t.Parallel()). Running things serially or individually yields the
>>>> > correct behaviour.
>>>> >
>>>> > The assembly in question looks like this:
>>>> >
>>>> > TEXT ·jitcall(SB),NOSPLIT|NOFRAME,$0-24
>>>> >     GO_ARGS
>>>> >     MOVQ asm+0(FP), AX      // Load the address of the assembly section.
>>>> >     MOVQ stack+8(FP), R10   // Load the address of the 1st slice.
>>>> >     MOVQ locals+16(FP), R11 // Load the address of the 2nd slice.
>>>> >     MOVQ 0(AX), AX          // Dereference pointer to native code.
>>>> >     JMP AX                  // Jump to native code.
>>>> >
>>>> > And slice manipulation like this (this is a 'pop'):
>>>> >
>>>> > MOVQ r13, [r10+8]       // Load the length of the slice.
>>>> > DECQ r13                // Decrement the len (I can guarantee this will never underflow).
>>>> > MOVQ r12, [r10]         // Load the 0th element address.
>>>> > LEAQ r12, [r12 + r13*8] // Compute the address of the last element.
>>>> > MOVQ reg, [r12]         // Load the element to reg.
>>>> > MOVQ [r10+8], r13       // Write the len back.
>>>> >
>>>> > or a 'push' like this (note: cap is always large enough for any
>>>> > pushes) ...
>>>> >
>>>> > MOVQ r12, [r10]         // Load the 0th element address.
>>>> > MOVQ r13, [r10+8]       // Load the len.
>>>> > LEAQ r12, [r12 + r13*8] // Compute the address of the last element + 1.
>>>> > INCQ r13                // Increment the len.
>>>> > MOVQ [r10+8], r13       // Save the len.
>>>> > MOVQ [r12], reg         // Write the new element.
>>>> >
>>>> > I acknowledge that calling into code like this is unsupported, but I
>>>> > struggle to understand how such corruption can happen, and having
>>>> > stared at it for a few days, I am frankly stumped. I mean, even if
>>>> > non-cooperative preemption were in these versions of Go, I would
>>>> > expect the GC to abort when it can't find the stack maps for my RIP
>>>> > value. With no GC safe points in my native assembly, I don't see how
>>>> > the GC could interfere (yet the issue disappears with the GC off??).
>>>> >
>>>> > Questions:
>>>> >
>>>> > - Any ideas what I'm doing wrong?
>>>> > - Any ideas how I can trace this from the application side and also
>>>> >   the runtime side? I've tried schedtrace and the like, but the
>>>> >   output didn't appear useful or correlated to the crashes.
>>>> > - Any suggestions for assumptions I might have missed and should
>>>> >   write tests / guards for?
>>
>> Do the allocation in Go, don't modify the slice's backing array's length
>> outside of Go - the runtime won't know about it and will happily
>> allocate over the grown slice.
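For readers following the offsets in the assembly above: a []uint64 reached through r10 (a *[]uint64) is three machine words on amd64, which is what [r10] and [r10+8] are addressing. A sketch of that layout (field names are illustrative; this mirrors reflect.SliceHeader):

    // What the native code sees through r10, assuming the standard
    // amd64 slice-header layout:
    type sliceHeader struct {
        data uintptr // [r10+0]:  address of the 0th element
        len  int     // [r10+8]:  the word the pop/push snippets rewrite
        cap  int     // [r10+16]: left untouched by the native code
    }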
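One way to read that advice: keep the header's len fixed and owned by Go, and have the native code report its depth through a plain integer instead of rewriting the slice header. A sketch under that assumption, with invented names (maxStack, depth):

    const maxStack = 1024 // illustrative capacity bound

    func run() []uint64 {
        // Allocate at full length in Go; the slice header itself is
        // never written outside Go.
        stack := make([]uint64, maxStack)
        var depth uint64 // the native code would bump only this counter

        // ... jitcall would run here, writing elements and updating depth ...

        return stack[:depth] // reslice in Go from the reported depth
    }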