Hi,
Sorry, I am not sure which Kafka configuration you are referring to here. Can you please point me to the right configuration responsible for retaining messages for replay? I see the following properties which might be related, but I am not sure:

queued.min.messages
queued.max.messages.kbytes
queue.buffering.max.messages
queue.buffering.max.kbytes
linger.ms ---> this is currently set to 1000
message.timeout.ms
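For context, a minimal sketch of where these properties sit in our consumer setup (the broker address, group id, topic name and the numeric values are placeholders, not our real ones). My understanding is that queued.* are client-side librdkafka prefetch-buffer knobs and queue.buffering.* / linger.ms are producer buffering knobs, while retention for replay would be a broker/topic-level setting such as retention.ms rather than any of these -- please correct me if that is wrong:

    package main

    import (
        "log"

        "gopkg.in/confluentinc/confluent-kafka-go.v1/kafka"
    )

    func main() {
        // queued.min.messages and queued.max.messages.kbytes bound how much data
        // librdkafka prefetches into C-allocated memory; that memory never shows
        // up in the Go heap profile. Values here are only for illustration.
        consumer, err := kafka.NewConsumer(&kafka.ConfigMap{
            "bootstrap.servers":          "kafka:9092",  // placeholder
            "group.id":                   "my-consumer", // placeholder
            "queued.min.messages":        1000,
            "queued.max.messages.kbytes": 16384,
        })
        if err != nil {
            log.Fatal(err)
        }
        defer consumer.Close()

        if err := consumer.Subscribe("my-topic", nil); err != nil { // placeholder topic
            log.Fatal(err)
        }
    }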
Thank you

On Thursday, March 10, 2022 at 9:50:50 PM UTC+5:30 ren...@ix.netcom.com wrote:

> You need to configure Kafka for how long it retains messages for replay -
> or some other option to store on disk.
>
> On Mar 10, 2022, at 10:07 AM, Rakesh K R <rakesh...@gmail.com> wrote:
>
> Tamas,
>
> Thank you. So any suggestion on how to make the application release this
> 900MiB of memory back to the OS so that the pod does not end up in the
> OOMKilled state?
>
> On Thursday, March 10, 2022 at 1:45:18 PM UTC+5:30 Tamás Gulácsi wrote:
>
>> gopkg.in/confluentinc/confluent-kafka-go.v1/kafka._Cfunc_GoBytes
>>
>> says it uses cgo, hiding its memory usage from Go. I bet that 900MiB of
>> memory is there...
>>
>> Rakesh K R wrote the following (Thursday, March 10, 2022, 7:26:57 UTC+1):
>>
>>> Hi,
>>> I have a microservice application deployed on a Kubernetes cluster (with
>>> a 1 GB pod memory limit). This app receives a continuous stream of
>>> messages (500 messages per second) from a producer app over a Kafka
>>> interface (the messages are encoded in protobuf format).
>>>
>>> *Basic application flow:*
>>> 1. Get the messages one by one from Kafka
>>> 2. Unmarshal the proto message
>>> 3. Apply business logic
>>> 4. Write the message to the Redis cache (in []byte format)
>>>
>>> When the pod starts, memory is around 50 MB, and it keeps increasing as
>>> traffic flows into the application. It is never released back to the OS,
>>> and as a result the pod restarts with error code *OOMKilled*.
>>> I have integrated Grafana to watch memory usage such as RSS, heap and
>>> stack. During this traffic flow the in-use heap size is 80 MB and the
>>> idle heap is 80 MB, whereas the process resident memory is at 800-1000 MB.
>>> Stopping the traffic completely for hours did not help; RSS stays around
>>> 1000 MB. I tried to analyze this with pprof and it reports only 80 MB in
>>> the in-use section, so I am wondering where the remaining 800-1000 MB of
>>> the pod's memory went. The application also allocates memory such as
>>> slices/maps/strings to perform its business logic (see the alloc_space
>>> pprof output below).
>>>
>>> I tried a couple of experiments:
>>> 1. Calling FreeOsMemory() in the app, but that did not help.
>>> 2. Invoking my app with GODEBUG=madvdontneed=1 my_app_executable; it
>>> did not help.
>>> 3. Leaving the application for 5-6 hrs without any traffic to see whether
>>> memory comes down. It did not help.
>>> 4. pprof shows only 80 MB of heap in use.
>>> 5. Tried upgrading the Go version from 1.13 to 1.16 as there were some
>>> improvements in the runtime.
>>> It did not help.
>>>
>>> pprof output for *alloc_space*:
>>>
>>> (pprof) top20
>>> Showing nodes accounting for 481.98GB, 91.57% of 526.37GB total
>>> Dropped 566 nodes (cum <= 2.63GB)
>>> Showing top 20 nodes out of 114
>>>       flat  flat%   sum%        cum   cum%
>>>    78.89GB 14.99% 14.99%    78.89GB 14.99%  github.com/go-redis/redis/v7/internal/proto.(*Reader).readStringReply
>>>    67.01GB 12.73% 27.72%   285.33GB 54.21%  airgroup/internal/wrapper/agrediswrapper.GetAllConfigurationForGroups
>>>    58.75GB 11.16% 38.88%    58.75GB 11.16%  google.golang.org/protobuf/internal/impl.(*MessageInfo).MessageOf
>>>    52.26GB  9.93% 48.81%    52.26GB  9.93%  reflect.unsafe_NewArray
>>>    45.78GB  8.70% 57.50%    46.38GB  8.81%  encoding/json.(*decodeState).literalStore
>>>    36.98GB  7.02% 64.53%    36.98GB  7.02%  reflect.New
>>>    28.20GB  5.36% 69.89%    28.20GB  5.36%  gopkg.in/confluentinc/confluent-kafka-go.v1/kafka._Cfunc_GoBytes
>>>    25.60GB  4.86% 74.75%    63.62GB 12.09%  google.golang.org/protobuf/proto.MarshalOptions.marshal
>>>    12.79GB  2.43% 77.18%   165.56GB 31.45%  encoding/json.(*decodeState).object
>>>    12.73GB  2.42% 79.60%    12.73GB  2.42%  reflect.mapassign
>>>    11.05GB  2.10% 81.70%    63.31GB 12.03%  reflect.MakeSlice
>>>    10.06GB  1.91% 83.61%    12.36GB  2.35%  filterServersForDestinationDevicesAndSendToDistributionChan
>>>     6.92GB  1.32% 84.92%   309.45GB 58.79%  groupAndSendToConfigPolicyChannel
>>>     6.79GB  1.29% 86.21%    48.85GB  9.28%  publishInternalMsgToDistributionService
>>>     6.79GB  1.29% 87.50%   174.81GB 33.21%  encoding/json.Unmarshal
>>>     6.14GB  1.17% 88.67%     6.14GB  1.17%  google.golang.org/protobuf/internal/impl.consumeBytes
>>>     4.64GB  0.88% 89.55%    14.39GB  2.73%  GetAllDevDataFromGlobalDevDataDb
>>>     4.11GB  0.78% 90.33%    18.47GB  3.51%  GetAllServersFromServerRecordDb
>>>     3.27GB  0.62% 90.95%     3.27GB  0.62%  net.HardwareAddr.String
>>>     3.23GB  0.61% 91.57%     3.23GB  0.61%  reflect.makemap
>>> (pprof)
>>>
>>> Need help from experts in analyzing this issue.
>>>
>>> Thanks in advance!!
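In the meantime, a minimal sketch of how the Go runtime's own accounting can be compared against the pod RSS, to see how much memory is held outside the Go heap (and therefore outside pprof's view, as Tamás suggested for the cgo allocations). The 30-second interval and the log format are arbitrary choices for illustration:

    package main

    import (
        "log"
        "runtime"
        "time"
    )

    func main() {
        // Periodically log the runtime's view of memory. Whatever the pod's RSS
        // shows beyond ms.Sys is memory allocated outside the Go runtime, e.g.
        // by cgo/librdkafka, and will never appear in a Go heap profile.
        for range time.Tick(30 * time.Second) {
            var ms runtime.MemStats
            runtime.ReadMemStats(&ms)
            log.Printf("heap in-use=%d MiB idle=%d MiB released=%d MiB sys=%d MiB",
                ms.HeapInuse>>20, ms.HeapIdle>>20, ms.HeapReleased>>20, ms.Sys>>20)
        }
    }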