I managed to resolve the issue by allocating the batches in non-GC memory, 
using `persistentalloc`.
For anyone who wants to look into it in detail, here is the commit: 
<https://gitlab.com/AbelThar/go.batch/commit/ed813d3728781e97999013503ef81b5e6f977e9d>
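
For readers skimming the thread, here is a minimal user-space sketch of the pattern Ian describes in his reply quoted below: a `uintptr`-typed handle that the GC does not trace, backed by a pointer-typed slice that does keep the objects alive. This is not the code from the commit above and it does not use `persistentalloc` (which is internal to the runtime). The `batch` fields and the `id` parameter are hypothetical, made up for illustration; `buintptr`, `allb` and `allocb` only borrow the names used in this thread.

```go
package main

import (
	"fmt"
	"unsafe"
)

// batch stands in for the runtime's b struct. The fields are hypothetical.
type batch struct {
	id   int
	next buintptr
}

// buintptr is modelled on the runtime's guintptr/puintptr types: it stores
// an address as an integer, so the garbage collector does not trace it.
type buintptr uintptr

// ptr converts the handle back to a typed pointer. This is only safe while
// some real pointer (here, an entry in allb) keeps the batch alive.
func (bp buintptr) ptr() *batch { return (*batch)(unsafe.Pointer(bp)) }

// set stores the address of b in the handle.
func (bp *buintptr) set(b *batch) { *bp = buintptr(unsafe.Pointer(b)) }

// allb holds a typed pointer to every batch ever allocated. These are the
// references the GC can actually see.
var allb []*batch

// allocb allocates a batch and records it in allb before returning it.
func allocb(id int) *batch {
	b := &batch{id: id}
	allb = append(allb, b)
	return b
}

func main() {
	var current buintptr
	current.set(allocb(1))
	fmt.Println(current.ptr().id, len(allb))
}
```

Note that `go vet` flags the `uintptr` to `unsafe.Pointer` conversion in `ptr` as a possible misuse, which is exactly the hazard under discussion: the handle alone is never enough to keep a batch reachable.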

On Monday, 8 April 2019 20:50:48 UTC+2, Tharen Abela wrote:
>
> I am keeping an `allb` slice, and with that I did see it occasionally 
> succeed.
>
> I am using the binarytree 
> <https://gitlab.com/AbelThar/go.batch/blob/b10ef431c29b01fa7568a7bf9712a0286033266f/batching/src/runnables/binarytree.go>
>  
> test, since the issue involves the GC.
> In fact, running it with GOGC=off, or keeping a slice with pointers in 
> the program, also succeeds consistently.
>
> What I do know is that when I allocate a batch, I keep a raw pointer to it 
> in the slice, and it is never popped or removed from there at any point.
> Ps keep their own batch using a uintptr, and the others are stored 
> in either a global batch queue or a queue of empty batches, the same as 
> gQueue but for the batch type (`buintptr`), all of which are ignored by 
> the GC.
>
> Now I modified the batch allocation to show me the pointer of `allb` and 
> the new batch allocated:
> // Allocate a new batch
> //go:nosplit
> //go:yeswritebarrierrec
> func allocb() *b {
>     // Break the cycle by doing acquirem/releasem around new(b).
>     // The acquirem/releasem increments m.locks during new(b),
>     // which keeps the garbage collector from being invoked.
>     mp := acquirem()
>
>     bp := new(b)
>     allb = append(allb, bp)
>     print("allb: ", allb, ", bp:", bp, "\n")
>
>     releasem(mp)
>     return bp
> }
>
> With GOGC=off, 6 batches are created:
> GOGC=off GODEBUG=gccheckmark=1 gobatch run ./binarytree.go
> allb: [1/1]0xc000010010, bp:0xc0000160f0
> allb: [2/2]0xc000012010, bp:0xc000016100
> allb: [3/4]0xc00000e020, bp:0xc000016110
> allb: [4/4]0xc00000e020, bp:0xc000016120
> allb: [5/8]0xc000062000, bp:0xc000060000
> allb: [6/8]0xc000062000, bp:0xc000060010
>
> When it does succeed with the GC on, it consistently takes 13 batches, 
> which I find rather odd.
> GODEBUG=gccheckmark=1 gobatch run ./binarytree.go
> allb: [1/1]0xc000010010, bp:0xc0000160f0
> allb: [2/2]0xc000012010, bp:0xc000016100
> allb: [3/4]0xc00000e020, bp:0xc000016110
> allb: [4/4]0xc00000e020, bp:0xc000016120
> allb: [5/8]0xc000062000, bp:0xc000060000
> allb: [6/8]0xc000062000, bp:0xc000060010
> allb: [7/8]0xc000062000, bp:0xc000016150
> allb: [8/8]0xc000062000, bp:0xc0004b8000
> allb: [9/16]0xc000510000, bp:0xc00044a040
> allb: [10/16]0xc000510000, bp:0xc000514000
> allb: [11/16]0xc000510000, bp:0xc000540000
> allb: [12/16]0xc000510000, bp:0xc0004b8010
> allb: [13/16]0xc000510000, bp:0xc0004b8020
>
> Now when it crashes it returns the following: (full stack trace on 
> pastebin) <https://pastebin.com/40iYNQrh>
> GODEBUG=gccheckmark=1 gobatch run ./binarytree.go
> allb: [1/1]0xc000010010, bp:0xc0000160f0
> allb: [2/2]0xc000012010, bp:0xc000016100
> allb: [3/4]0xc00000e020, bp:0xc000016110
> allb: [4/4]0xc00000e020, bp:0xc000016120
> allb: [5/8]0xc00006a000, bp:0xc000068000
> allb: [6/8]0xc00006a000, bp:0xc000068010
> allb: [7/8]0xc00006a000, bp:0xc000016170
> allb: [8/8]0xc00006a000, bp:0xc0004b4000
> allb: [9/16]0xc0004ea000, bp:0xc0004b4010
> allb: [10/16]0xc0004ea000, bp:0xc0004ee000
> allb: [11/16]0xc0004ea000, bp:0xc000448040
> runtime: marking free object 0xc000448040 found at *(0xc0004ea000+0x50)
> base=0xc0004ea000 s.base()=0xc0004ea000 s.limit=0xc0004ec000 s.spanclass=18 s.elemsize=128 s.state=mSpanInUse
>  *(base+0) = 0xc0000160f0
>  *(base+8) = 0xc000016100
>  *(base+16) = 0xc000016110
>  *(base+24) = 0xc000016120
>  *(base+32) = 0xc000068000
>  *(base+40) = 0xc000068010
>  *(base+48) = 0xc000016170
>  *(base+56) = 0xc0004b4000
>  *(base+64) = 0xc0004b4010
>  *(base+72) = 0xc0004ee000
>  *(base+80) = 0xc000448040 <==
>  *(base+88) = 0x0
>  *(base+96) = 0x0
>  *(base+104) = 0x0
>  *(base+112) = 0x0
>  *(base+120) = 0x0
> obj=0xc000448040 s.base()=0xc000448000 s.limit=0xc00044a000 s.spanclass=5 s.elemsize=16 s.state=mSpanInUse
>  *(obj+0) = 0x0
>  *(obj+8) = 0xc0004ee000
> fatal error: marking free object
>
> At this point I'm assuming the error has already happened, and the trace 
> only shows when it was detected.
>
> What I do notice is that it's not always at the same level when the error 
> is noticed:
> for the full stack trace, it was when the depth was 3, ...
> goroutine 1 [runnable]:
> runtime.newobject(0x464820, 0x2)
>     /go.batch/src/runtime/malloc.go:1067 +0x51 fp=0xc000084b20 sp=0xc000084b18 pc=0x40a701
> main.bottomUpTree(0xffffffffffffbefb, 0x3, 0xc0008bb840)
>     /go.batch/batching/src/runnables/binarytree.go:33 +0x91 fp=0xc000084b60 sp=0xc000084b20 pc=0x44f011
>
> ...and in another stack trace <https://pastebin.com/AukWCxAe>, it occurred 
> when the depth was 1.
>
> I ran it some more times, and it always seemed to crash after batch 11, 
> with the depth ranging from 0 to 3.
>
> The stack trace for depth 0 <https://pastebin.com/ZMG02MM3> was more 
> interesting: for goroutine 1 it did not trigger at `newobject`.
> goroutine 1 [GC assist marking]:
> runtime.systemstack_switch()
>     /go.batch/src/runtime/asm_amd64.s:311 fp=0xc000086930 sp=0xc000086928 pc=0x446d30
> runtime.gcAssistAlloc(0xc000000180)
>     /go.batch/src/runtime/mgcmark.go:422 +0x15c fp=0xc000086990 sp=0xc000086930 pc=0x416e5c
> runtime.mallocgc(0x18, 0x464820, 0x1, 0x18)
>     /go.batch/src/runtime/malloc.go:843 +0x8e6 fp=0xc000086a30 sp=0xc000086990 pc=0x40a456
> runtime.newobject(0x464820, 0xc00045e000)
>     /go.batch/src/runtime/malloc.go:1068 +0x38 fp=0xc000086a60 sp=0xc000086a30 pc=0x40a6e8
> main.bottomUpTree(0xfffffffffffdec85, 0x0, 0x20)
>     /go.batch/batching/src/runnables/binarytree.go:29 +0xfc fp=0xc000086aa0 sp=0xc000086a60 pc=0x44f07c
>
> I think at this point I may be overthinking it a bit, and my lack of 
> experience is more apparent.
> If there is something else I should be looking into, I am open to ideas.
>
> On Monday, 8 April 2019 19:42:56 UTC+2, Ian Lance Taylor wrote:
>>
>> On Sun, Apr 7, 2019 at 12:30 PM Tharen Abela <abela...@gmail.com> wrote: 
>> > 
>> > The gist of the problem is that I am allocating an object in the
>> > runtime (which I refer to as the batch struct), and the GC is
>> > deallocating the object even though a reference is being kept in a
>> > slice (similar to allp and allm).
>> > While allocating, I call acquirem to prevent the GC from being
>> > triggered, during which I append the batch pointer to the slice.
>> >
>> > From running with `GODEBUG=gccheckmark=1` I know that the allocated
>> > batch object was being freed, yet when it crashes it says the object
>> > is being marked (hence marking a freed object).
>> >
>> > Now my intention is to keep the batch allocation until the end of the
>> > program, keeping it in an extra batch queue, so it should not be freed.
>> >
>> > Thinking about it now, I am not sure whether the deallocation occurs
>> > after the program's work is finished and it is winding down by
>> > deallocating everything, while a reference is still kept in allb, so
>> > that a double free occurs, OR,
>> > as I have been assuming so far, this takes place while work is
>> > incomplete, so the GC is incorrectly deallocating a batch object that
>> > is still in use.
>> >
>> > Another thing to take note of is that the batch in P is referenced by
>> > a uintptr; I'm not sure how that might affect it.
>>
>> That is going to be your problem.  The GC only tracks values with live 
>> pointers.  A value of type `uintptr` can not be a live pointer.  The 
>> runtime can only get away with the `guintptr`, `muintptr` and 
>> `puintptr` types because it knows that there are existing other 
>> pointers to all G and P values (in the allgs and allp slices and the 
>> allm linked list).  If there is ever any moment that your batch 
>> objects are only referenced by `uintptr` values and not by a value of 
>> pointer type, then the garbage collector can collect it. 
>>
>> Ian 
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.