[
https://issues.apache.org/jira/browse/HBASE-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269052#comment-16269052
]
huaxiang sun commented on HBASE-19320:
--------------------------------------
Thanks [~anoop.hbase] for confirming.
Hi [~ashishujjain], thanks for the comments and the logs. Yes, the same log
pattern shows up in our cluster as well. The direct memory leak actually happens
in the replication sink, on the region server in the destination (dst) cluster
that forwards the batches to the target region servers in that cluster. Once the
direct memory leak happens, all replication handlers get stuck, as in the log
you posted, and then the call queue fills up. It depends on the WAL pattern. In
our case, over 3 GB of direct memory (DM) was assigned for short-circuit reads
plus NIO; still, over time, all of it was consumed and replication was blocked.
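For anyone who wants to watch direct memory on a region server while
reproducing this, here is a minimal sketch that reads the JVM's standard
"direct" buffer pool through BufferPoolMXBean. It is not HBase code: the class
name DirectMemoryProbe and its methods are made up for illustration, and it
only reports what the JVM itself accounts for in that pool.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper (not part of HBase): reports usage of the JVM "direct" buffer pool.
class DirectMemoryProbe {

    // Returns the MXBean for the "direct" pool, or null if this JVM does not expose one.
    static BufferPoolMXBean directPool() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool;
            }
        }
        return null;
    }

    // Bytes currently used by direct buffers as estimated by the JVM, or -1 if unknown.
    static long directBytesUsed() {
        BufferPoolMXBean pool = directPool();
        return pool == null ? -1L : pool.getMemoryUsed();
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // Allocate 1 MiB of direct memory and keep a reference so it stays counted.
        ByteBuffer held = ByteBuffer.allocateDirect(1 << 20);
        long after = directBytesUsed();
        System.out.println("direct bytes before: " + before);
        System.out.println("direct bytes after:  " + after);
        System.out.println("held capacity:       " + held.capacity());
    }
}
```

Logging directBytesUsed() periodically from a process that does NIO writes
should show whether the pool keeps growing toward the configured limit.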
We are experimenting with solutions and will report back.
> document the mysterious direct memory leak in hbase
> ----------------------------------------------------
>
> Key: HBASE-19320
> URL: https://issues.apache.org/jira/browse/HBASE-19320
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.0.0, 1.2.6
> Reporter: huaxiang sun
> Assignee: huaxiang sun
> Attachments: Screen Shot 2017-11-21 at 4.43.36 PM.png, Screen Shot
> 2017-11-21 at 4.44.22 PM.png
>
>
> Recently we ran into a direct memory leak case, which took some time to
> trace and debug. After discussing it internally with our [[email protected]],
> we think we have some findings and want to share them with the community.
> Basically, it is the issue described in
> http://www.evanjones.ca/java-bytebuffer-leak.html and it happened to one of
> our hbase clusters.
> Creating the jira first; will fill in more details later.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)