On Friday, 22 March 2019 at 6:06:06 UTC+1, Tom wrote:
>
> Still errors I'm afraid :/
>
> On Thursday, 21 March 2019 21:54:59 UTC-7, Ian Lance Taylor wrote:
>>
>> On Thu, Mar 21, 2019 at 9:39 PM Tom <hype...@gmail.com> wrote: 
>> > 
>> > I've been stuck on this for a few days, so I thought I would ask the
>> > brains trust.
>> > 
>> > TL;DR: When native amd64 instructions mutate a slice (updating the len
>> > and values of a []uint64), I see spurious, random memory corruption
>> > under heavy load (# runnable goroutines > GOMAXPROCS, all doing the
>> > same thing continuously), and only when the GC is enabled. Any
>> > debugging ideas or things I should look into?
>> > 
>> > Background: 
>> > 
>> > I'm calling into Go assembly with a few pointers to slices (*[]uint64),
>> > and that assembly mutates them (reading/writing values, updating len
>> > within capacity). I'm experiencing random memory corruption, but I can
>> > only trigger it in the following scenarios:
>> > 
>> > Heavy load - Doing a zillion things at once (specifically, running all
>> > my test cases in parallel) and maxing out my machine.
>> > Parallelism - A panic due to memory corruption happens faster if
>> > --parallel is set higher, and never when not running in parallel.
>> > GC - The panic never happens if the GC is disabled (of course, the
>> > test process eventually runs out of memory).
>> > 
>> > The memory corruption varies, but usually results in an element of an
>> > unrelated slice being zeroed, the len of an unrelated slice being
>> > zeroed, or (less likely) a segfault.
>> > 
>> > Tested on go1.11.2 and go1.12.1. I can only trigger this if I run all
>> > my test cases at once (with --count at 8000 or so and using
>> > t.Parallel()). Running things serially or individually yields the
>> > correct behaviour.
>> > 
>> > The assembly in question looks like this: 
>> > 
>> > TEXT ·jitcall(SB),NOSPLIT|NOFRAME,$0-24 
>> >         GO_ARGS 
>> >         MOVQ asm+0(FP),     AX  // Load the address of the assembly section.
>> >         MOVQ stack+8(FP),   R10 // Load the address of the 1st slice.
>> >         MOVQ locals+16(FP), R11 // Load the address of the 2nd slice.
>> >         MOVQ 0(AX),         AX  // Dereference pointer to native code.
>> >         JMP AX                  // Jump to native code. 
>> > 
>> > And slice manipulation like this (this is a 'pop'): 
>> > 
>> >  MOVQ r13,     [r10+8]       // Load the length of the slice.
>> >  DECQ r13                    // Decrement the len (I can guarantee this will never underflow).
>> >  MOVQ r12,     [r10]         // Load the 0th element address.
>> >  LEAQ r12,     [r12 + r13*8] // Compute the address of the last element.
>> >  MOVQ reg,     [r12]         // Load the element to reg.
>> >  MOVQ [r10+8], r13           // Write the len back.
>> > 
>> > or 'push' like this (note: cap is always large enough for any
>> > pushes) ...
>> > 
>> >  MOVQ r12,     [r10]          // Load the 0th element address.
>> >  MOVQ r13,     [r10+8]        // Load the len.
>> >  LEAQ r12,     [r12 + r13*8]  // Compute the address of the last element + 1.
>> >  INCQ r13                     // Increment the len.
>> >  MOVQ [r10+8], r13            // Save the len.
>> >  MOVQ [r12],   reg            // Write the new element.
>> > 
>> > 
>> > I acknowledge that calling into code like this is unsupported, but I
>> > struggle to understand how such corruption can happen, and having
>> > stared at it for a few days, I am frankly stumped. I mean, even if
>> > non-cooperative preemption were in these versions of Go, I would expect
>> > the GC to abort when it can't find the stack maps for my RIP value.
>> > With no GC safe points in my native assembly, I don't see how the GC
>> > could interfere (yet the issue disappears with the GC off??).
>> > 
>> > Questions: 
>> > 
>> > Any ideas what I'm doing wrong?
>> > Any ideas how I can trace this from the application side and also the
>> > runtime side? I've tried schedtrace and the like, but the output didn't
>> > appear useful or correlated with the crashes.
>> > Any suggestions for assumptions I might have missed and should write
>> > tests / guards for?
>>
>>
>>
Do the allocation in Go, and don't modify the length of the slice's backing
array outside of Go - the runtime won't know about it and will happily
allocate over the grown slice.
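
Roughly what I have in mind, as a minimal sketch rather than a drop-in fix: keep every len update on the Go side and let the native code only read and write elements. The names push and pop are hypothetical helpers, not anything from your code, and I am assuming the capacity is preallocated as you describe. I can't promise this makes the corruption go away, but it removes the out-of-band len writes from the picture.

```go
package main

import "fmt"

// push stores v just past the current length and grows len by one.
// Hypothetical helper: it assumes the caller preallocated enough capacity.
// Reslicing beyond cap panics, so an overflow is caught instead of silently
// writing outside the backing array.
func push(s *[]uint64, v uint64) {
	n := len(*s)
	*s = (*s)[:n+1]
	(*s)[n] = v
}

// pop returns the last element and shrinks len by one.
// Indexing panics on an empty slice, so an underflow is caught as well.
func pop(s *[]uint64) uint64 {
	n := len(*s) - 1
	v := (*s)[n]
	*s = (*s)[:n]
	return v
}

func main() {
	// Allocate in Go with the full capacity up front; len only ever
	// changes through the helpers above.
	stack := make([]uint64, 0, 16)
	push(&stack, 7)
	push(&stack, 9)
	fmt.Println(pop(&stack), len(stack), cap(stack))
}
```

The bounds checks cost a little, but they turn a silent overwrite into an immediate panic with a stack trace, which should also help narrow down where things go wrong.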
 
