This seems to be related to https://github.com/golang/go/issues/31222. Does runtime.mallocgc block preemption in Go 1.18? We see a long sweep-termination phase. The allocation size is 32K-64K. Trace image: https://github.com/NeetishPathak/GolangDebug/blob/main/gc_pause.png
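Since the long sweep termination coincides with a high rate of 32K-64K allocations, one mitigation worth trying (a minimal sketch with a hypothetical `handle` function, not the actual proprietary code path) is recycling those buffers through sync.Pool so fewer large objects pass through runtime.mallocgc on the hot path:

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool recycles 64 KiB scratch buffers. Reusing them lowers the
// allocation rate of 32K-64K objects, which reduces GC frequency and
// the pressure on the sweep-termination phase.
var bufPool = sync.Pool{
	New: func() any { return make([]byte, 64<<10) },
}

// handle is a hypothetical request handler that needs a large scratch
// buffer; it borrows one from the pool instead of allocating fresh.
func handle(payload []byte) int {
	buf := bufPool.Get().([]byte)
	defer bufPool.Put(buf)
	return copy(buf, payload)
}

func main() {
	fmt.Println(handle([]byte("hello"))) // prints 5
}
```

Note that sync.Pool may drop its cached buffers at any GC cycle, so this lowers the steady-state allocation rate rather than guaranteeing zero allocations.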
On Tuesday, 24 January 2023 at 05:34:31 UTC-8 Neetish Pathak wrote:
> Hi All,
>
> Go version: 1.18.2
>
> We have a key-value store implemented in Go. A proxy layer communicates with the storage layer, which uses cgo for data access from the storage engine. Both the proxy and the storage layer are implemented in Go. I observe occasional spikes in latency (< 0.01% of client requests time out because of this).
>
> This is affecting tail-latency performance. Traces on both the proxy and the storage layer show high latency in the STW (mark setup) phase, sometimes higher than 100 ms, as highlighted in the shared trace. I need advice to debug further and understand the potential reasons.
>
> My test setup:
>
> GCP VM: 64 vCPUs, 500 GB RAM
> OS: Linux, Ubuntu 16.04
> Go version: 1.18.2
>
> With high-payload requests, high STW pauses happen often. With low payloads, this behavior is rare.
>
> I do not have much experience with Go internals and am curious to learn. I appreciate your help.
> Sharing traces for the high-payload tests:
>
> gc 34757 @5898.241s 1%: *11+2.0+0.20 ms* clock, 747+1.2/13/0.008+12 ms cpu, 42->44->21 MB ... [remainder interleaved with the output of gc 35135 @5898.230s]
> gc 35136 @5898.365s 1%: *54+2.1+0.16* ms clock, 3507+0.13/13/0+10 ms cpu, 44->45->22 MB, 48 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 32914 @5898.272s 1%: *147+2.3+0.19* ms clock, 9465+0.27/15/0.008+12 ms cpu, 44->47->24 MB, 47 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 34758 @5898.411s 1%: *9.4+2.6+0.20 ms* clock, 606+0.26/11/0+12 ms cpu, 41->43->21 MB, 43 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35137 @5898.488s 1%: 0.22+1.5+0.13 ms clock, 14+0.22/12/0.17+8.8 ms cpu, 39->41->23 MB, 46 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35079 @5898.554s 1%: 0.11+1.4+0.18 ms clock, 7.2+0/11/0.22+11 ms cpu, 47->48->22 MB, 48 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 34759 @5898.574s 1%: 0.19+1.6+0.084 ms clock, 12+0.27/12/0.49+5.3 ms cpu, 44->44->21 MB, 44 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35138 @5898.580s 1%: 0.24+1.6+0.15 ms clock, 15+0.81/12/0.36+9.6 ms cpu, 43->44->23 MB, 47 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 32915 @5898.680s 1%: 0.12+1.7+0.099 ms clock, 8.2+0.35/12/0.093+6.3 ms cpu, 47->49->23 MB, 50 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35139 @5898.688s 1%: 0.30+1.6+0.21 ms clock, 19+0.15/11/0.10+14 ms cpu, 39->41->23 MB, 47 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35080 @5898.780s 1%: 0.10+1.8+0.14 ms clock, 7.0+0.64/13/0.60+9.2 ms cpu, 45->46->22 MB, 46 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 34760 @5898.838s 1%: *124+1.9+0.14* ms clock, 7955+0.59/12/0+9.5 ms cpu, 42->43->21 MB, 43 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35081 @5898.894s ... [line interleaved with a gcpacertrace "pacer:" line and with the output of gc 32916 @5898.818s; the individual fields are not recoverable]
> gc 35140 @5898.821s 1%: *141+2.6+0.25 ms* clock, 9049+0.061/14/0+16 ms cpu, 43->46->24 MB, 46 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35082 @5899.017s 1%: 0.18+1.9+0.10 ms clock, 11+1.9/12/0.29+6.9 ms cpu, 47->49->23 MB, 48 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 32917 @5899.058s 1%: 0.21+1.5+0.096 ms clock, 13+0.59/12/0.33+6.1 ms cpu, 45->46->23 MB, 47 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35141 @5899.062s 1%: 0.20+1.5+0.11 ms clock, 12+0.26/14/0.31+7.4 ms cpu, 47->49->23 MB, 48 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35083 @5899.138s 1%: 0.14+1.3+0.048 ms clock, 9.1+0.51/11/0.22+3.0 ms cpu, 45->46->22 MB, 47 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 34761 @5899.160s 1%: 0.18+1.7+0.053 ms clock, 11+0.32/9.4/2.1+3.4 ms cpu, 42->43->21 MB, 43 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 32918 @5899.187s 1%: 0.26+1.6+0.087 ms clock, 16+0.55/11/0.19+5.6 ms cpu, 46->47->23 MB, 47 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35142 @5899.206s 1%: 0.21+1.3+0.063 ms clock, 13+0.80/11/0.14+4.0 ms cpu, 45->46->22 MB, 48 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 32919 @5899.322s 1%: 0.11+1.3+0.080 ms clock, 7.4+0.39/11/0.34+5.1 ms cpu, 45->46->23 MB, 47 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35143 @5899.352s 1%: 0.20+1.8+0.11 ms clock, 13+0.30/12/0.63+7.6 ms cpu, 45->47->23 MB, 46 MB goal, 0 MB stacks, 0 MB globals, 64 P
> gc 35084 @5899.368s 1%: 0.21+1.5+0.094 ms clock, 13+1.7/11/0.20+6.0 ms cpu, 45->47->23 MB, 46 MB goal, 0 MB stacks, 0 MB globals, 64 P
>
> Full_Trace_high_payload:
> https://raw.githubusercontent.com/NeetishPathak/GolangDebug/main/0110_proxy_gc_high_payload
>
> Full_Trace_low_payload:
> https://raw.githubusercontent.com/NeetishPathak/GolangDebug/main/0122_proxy_gc_low_payload
>
> I
cannot share the code, as it is proprietary.
>
> Thanks

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. To view this discussion on the web, visit https://groups.google.com/d/msgid/golang-nuts/7d8b5e7a-186a-416b-b2d1-97814fa35c18n%40googlegroups.com.
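For anyone decoding the trace lines above: in a GODEBUG=gctrace=1 line, the `A+B+C ms clock` triplet is wall-clock time for STW sweep termination, concurrent mark and scan, and STW mark termination, in that order, so the highlighted 124 ms and 147 ms figures are sweep-termination stalls. A small sketch that pulls the two STW components out of a trace line (the sample is gc 34760 from the trace above):

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// clockRe matches the "A+B+C ms clock" section of a GODEBUG=gctrace=1
// line: A = STW sweep termination, B = concurrent mark and scan,
// C = STW mark termination, all in wall-clock milliseconds.
var clockRe = regexp.MustCompile(`([0-9.]+)\+([0-9.]+)\+([0-9.]+) ms clock`)

// stwMillis extracts the two stop-the-world phases of one trace line.
func stwMillis(line string) (sweepTerm, markTerm float64, ok bool) {
	m := clockRe.FindStringSubmatch(line)
	if m == nil {
		return 0, 0, false
	}
	sweepTerm, _ = strconv.ParseFloat(m[1], 64)
	markTerm, _ = strconv.ParseFloat(m[3], 64)
	return sweepTerm, markTerm, true
}

func main() {
	line := "gc 34760 @5898.838s 1%: 124+1.9+0.14 ms clock, 7955+0.59/12/0+9.5 ms cpu, 42->43->21 MB, 43 MB goal, 0 MB stacks, 0 MB globals, 64 P"
	s, mt, _ := stwMillis(line)
	fmt.Printf("sweep termination STW: %.0f ms, mark termination STW: %.2f ms\n", s, mt)
}
```

This is the reading behind the reply at the top of the thread: the long phase is sweep termination, not mark termination.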