Hi,

Ufuk wrote up an excellent document about Netty's memory allocation
inside Flink [1], and I want to add one more note after running some
large-scale jobs.

The only inaccurate part of [1] is how much memory
LengthFieldBasedFrameDecoder will use. From our observations, it can use
up to 4M for each physical connection.

Why (tl;dr): ByteToMessageDecoder, the base class of
LengthFieldBasedFrameDecoder, uses a Cumulator to hold the bytes that are
waiting to be decoded. The Cumulator tries to discard already-read bytes
to make room in the buffer when channelReadComplete is triggered. In most
cases, channelReadComplete is only triggered by AbstractNioByteChannel
after it has read "maxMessagesPerRead" times, and the default value of
maxMessagesPerRead is 16. So in the worst case the Cumulator will
accumulate up to 1M (64K * 16) of data, and due to the logic of ByteBuf's
discardSomeReadBytes, the cumulation buffer can expand to 4M.
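
To make the discardSomeReadBytes part concrete, here is a small
standalone snippet against Netty 4.x (class name and sizes are only for
illustration): the buffer is only compacted once at least half of it has
been read, so until that point the already-read bytes keep occupying
space and further reads force the cumulation buffer to grow instead.

import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;

public class DiscardSomeReadBytesDemo {
    public static void main(String[] args) {
        // A 64K buffer that is completely full.
        ByteBuf buf = Unpooled.buffer(64 * 1024, 64 * 1024);
        buf.writeZero(buf.capacity());

        // Read slightly less than half: discardSomeReadBytes() is a
        // no-op, so no room is reclaimed for further writes.
        buf.skipBytes(buf.capacity() / 2 - 1);
        buf.discardSomeReadBytes();
        System.out.println("readerIndex: " + buf.readerIndex()
                + ", writable: " + buf.writableBytes());

        // Read past the half-way mark: now the unread bytes are moved to
        // the front and the consumed space becomes writable again.
        buf.skipBytes(1);
        buf.discardSomeReadBytes();
        System.out.println("readerIndex: " + buf.readerIndex()
                + ", writable: " + buf.writableBytes());

        buf.release();
    }
}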

We added an option to tune maxMessagesPerRead, set it to 2, and
everything works fine. I know Stephan is working on network improvements,
and replacing the whole Netty pipeline with Flink's own implementation
would be a good choice. But I think we will run into similar logic when
implementing it, so please be careful about this.
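
For reference, the underlying Netty knob is
ChannelOption.MAX_MESSAGES_PER_READ. A minimal sketch of the same setting
with plain Netty 4.x (the Flink-side option name and wiring are whatever
the patch adds; the bootstrap setup, frame decoder arguments, and port
below are only for illustration):

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.LengthFieldBasedFrameDecoder;

public class MaxMessagesPerReadExample {
    public static void main(String[] args) throws Exception {
        NioEventLoopGroup boss = new NioEventLoopGroup(1);
        NioEventLoopGroup workers = new NioEventLoopGroup();
        try {
            ServerBootstrap bootstrap = new ServerBootstrap()
                    .group(boss, workers)
                    .channel(NioServerSocketChannel.class)
                    // Fire channelReadComplete (and thus the cumulation
                    // buffer compaction) after at most 2 reads per read
                    // loop instead of the default 16.
                    .childOption(ChannelOption.MAX_MESSAGES_PER_READ, 2)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            ch.pipeline().addLast(
                                    new LengthFieldBasedFrameDecoder(
                                            Integer.MAX_VALUE, 0, 4, 0, 4));
                        }
                    });
            bootstrap.bind(9000).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}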

BTW, should we open a JIRA to add this config?


[1]
https://cwiki.apache.org/confluence/display/FLINK/Netty+memory+allocation
