ConfX created HADOOP-18822:
------------------------------

             Summary: Out of Memory when mistakenly set decay-scheduler.metrics.top.user.count to a large number
                 Key: HADOOP-18822
                 URL: https://issues.apache.org/jira/browse/HADOOP-18822
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: ConfX
         Attachments: reproduce.sh

h2. What happened:

When {{decay-scheduler.metrics.top.user.count}} is set to a large number, {{DecayRpcScheduler}} in Hadoop Common throws an {{OutOfMemoryError}} because the value is not validated adequately. Hadoop Common only checks that the value is larger than 0; it enforces no upper bound.

h2. Buggy code:

In DecayRpcScheduler.java
{noformat}
public DecayRpcScheduler(int numLevels, String ns, Configuration conf) {
  ...
  // <--- topUsersCount gets the config value
  topUsersCount =
      conf.getInt(DECAYSCHEDULER_METRICS_TOP_USER_COUNT,
          DECAYSCHEDULER_METRICS_TOP_USER_COUNT_DEFAULT);
  // <--- only checks for positivity
  Preconditions.checkArgument(topUsersCount > 0,
      "the number of top users for scheduler metrics must be at least 1");
  ...
}

private void addTopNCallerSummary(MetricsRecordBuilder rb) {
  // <--- calls getTopCallers with n equal to topUsersCount
  TopN topNCallers = getTopCallers(topUsersCount);
  ...
}

private TopN getTopCallers(int n) {
  // <--- creates a PriorityQueue with initial capacity n, causing the out-of-memory error
  TopN topNCallers = new TopN(n);
  ...
}
{noformat}

h2. StackTrace:

{noformat}
java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.PriorityQueue.<init>(PriorityQueue.java:172)
	at java.base/java.util.PriorityQueue.<init>(PriorityQueue.java:139)
	at org.apache.hadoop.metrics2.util.Metrics2Util$TopN.<init>(Metrics2Util.java:80)
	at org.apache.hadoop.ipc.DecayRpcScheduler.getTopCallers(DecayRpcScheduler.java:1002)
	at org.apache.hadoop.ipc.DecayRpcScheduler.addTopNCallerSummary(DecayRpcScheduler.java:982)
	at org.apache.hadoop.ipc.DecayRpcScheduler.getMetrics(DecayRpcScheduler.java:935)
	at org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.getMetrics(DecayRpcScheduler.java:893)
	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:200)
	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:183)
	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:156)
	at java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:329)
	at java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:315)
	at java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
	at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:100)
	at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:73)
	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:222)
	at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:101)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:268)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:233)
	at org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.registerMetrics2Source(DecayRpcScheduler.java:819)
	at org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.<init>(DecayRpcScheduler.java:792)
	at org.apache.hadoop.ipc.DecayRpcScheduler$MetricsProxy.getInstance(DecayRpcScheduler.java:800)
	at org.apache.hadoop.ipc.DecayRpcScheduler.<init>(DecayRpcScheduler.java:260)
{noformat}

h2. Reproduce:

(1) Set {{decay-scheduler.metrics.top.user.count}} to a large value, e.g., 1419140791
(2) Run a simple test that exercises this parameter, e.g. {{org.apache.hadoop.ipc.TestDecayRpcScheduler#testNPEatInitialization}}

For an easy reproduction, run the reproduce.sh in the attachment.

We are happy to provide a patch if this issue is confirmed.
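
To make the failure mode concrete: {{java.util.PriorityQueue}} allocates its backing array eagerly in the constructor, so an initial capacity of 1419140791 asks for roughly 5.7 GB of references (at 4 bytes each, assuming compressed object pointers) before a single caller is recorded. The following is a minimal, standalone Java sketch of one possible mitigation, not the actual Hadoop patch. The class name, the {{MAX_INITIAL_CAPACITY}} value, and all method names are hypothetical and chosen only for illustration. The idea is to cap the initial capacity while still retaining at most n entries, since the queue grows on demand.

```java
import java.util.PriorityQueue;

public class TopNCapacitySketch {
    // Hypothetical cap on the eagerly allocated array; not a value taken from Hadoop.
    static final int MAX_INITIAL_CAPACITY = 1024;

    /** Rough size in bytes of the reference array allocated by new PriorityQueue(initialCapacity). */
    static long estimateArrayBytes(int initialCapacity, int bytesPerReference) {
        return (long) initialCapacity * bytesPerReference;
    }

    /** Validates the requested size and caps only the up-front allocation. */
    static int safeInitialCapacity(int requested) {
        if (requested < 1) {
            throw new IllegalArgumentException("top user count must be at least 1");
        }
        return Math.min(requested, MAX_INITIAL_CAPACITY);
    }

    /** Creates a min-heap whose initial array is bounded, whatever n was configured. */
    static PriorityQueue<Long> newTopNQueue(int requested) {
        return new PriorityQueue<>(safeInitialCapacity(requested));
    }

    /** Keeps the n largest values seen so far; the smallest retained value sits at the head. */
    static void offerTopN(PriorityQueue<Long> queue, int n, long value) {
        if (queue.size() < n) {
            queue.add(value);
        } else if (value > queue.peek()) {
            queue.poll();
            queue.add(value);
        }
    }

    public static void main(String[] args) {
        int configured = 1419140791; // value from the reproduction steps
        System.out.println("bytes requested up front (4-byte refs): "
            + estimateArrayBytes(configured, 4));

        PriorityQueue<Long> top = newTopNQueue(configured); // allocates only 1024 slots
        for (long calls : new long[] {5L, 1L, 9L}) {
            offerTopN(top, configured, calls);
        }
        System.out.println("entries retained: " + top.size());
    }
}
```

An alternative would be to reject values above some documented maximum in the existing {{Preconditions.checkArgument}} call; which behaviour is preferable is a decision for the maintainers.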