[ https://issues.apache.org/jira/browse/FLINK-16636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100592#comment-17100592 ]
Caizhi Weng commented on FLINK-16636:
-------------------------------------

Hi,

After some further investigation, I'm afraid I have to conclude that this is not a bug. The memory size of our testing container is simply too small.

I used the [native memory tracking tool|https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr007.html] to track the native memory usage of all test cases, and I'll post the final memory usage below. Click [here|https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr022.html] to see the explanation for each category.

{code}
RSS: 2928128 (in 1KB blocks)

Total: reserved=4679436KB, committed=3198040KB
- Java Heap (reserved=2097152KB, committed=1740800KB)
    (mmap: reserved=2097152KB, committed=1740800KB)
- Class (reserved=1257953KB, committed=248449KB)
    (classes #29687)
    (malloc=25057KB #113855)
    (mmap: reserved=1232896KB, committed=223392KB)
- Thread (reserved=56902KB, committed=56902KB)
    (thread #56)
    (stack: reserved=55456KB, committed=55456KB)
    (malloc=167KB #287)
    (arena=1279KB #110)
- Code (reserved=279028KB, committed=180816KB)
    (malloc=29428KB #38259)
    (mmap: reserved=249600KB, committed=151388KB)
- GC (reserved=139125KB, committed=125901KB)
    (malloc=28533KB #81265)
    (mmap: reserved=110592KB, committed=97368KB)
- Compiler (reserved=179KB, committed=179KB)
    (malloc=48KB #444)
    (arena=131KB #3)
- Internal (reserved=801672KB, committed=801664KB)
    (malloc=801632KB #83153)
    (mmap: reserved=40KB, committed=32KB)
- Symbol (reserved=33730KB, committed=33730KB)
    (malloc=32059KB #274943)
    (arena=1670KB #1)
- Native Memory Tracking (reserved=9283KB, committed=9283KB)
    (malloc=21KB #254)
    (tracking overhead=9262KB)
- Arena Chunk (reserved=316KB, committed=316KB)
    (malloc=316KB)
- Unknown (reserved=4096KB, committed=0KB)
    (mmap: reserved=4096KB, committed=0KB)
{code}

We can see that, besides heap memory, we have another 1GB+ of native memory usage.
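To make the "1GB+" figure concrete, here is a minimal sketch that adds up the committed size of every non-heap category in the summary above. The category lines are copied by hand from that output (sub-lines omitted), and the helper name `committed_by_category` and the regex are only illustrative; they are not part of any JVM tool. It treats 1MB as 1024KB.

```python
import re

# Category lines copied from the NMT summary above (sub-lines omitted).
NMT_SUMMARY = """\
Total: reserved=4679436KB, committed=3198040KB
- Java Heap (reserved=2097152KB, committed=1740800KB)
- Class (reserved=1257953KB, committed=248449KB)
- Thread (reserved=56902KB, committed=56902KB)
- Code (reserved=279028KB, committed=180816KB)
- GC (reserved=139125KB, committed=125901KB)
- Compiler (reserved=179KB, committed=179KB)
- Internal (reserved=801672KB, committed=801664KB)
- Symbol (reserved=33730KB, committed=33730KB)
- Native Memory Tracking (reserved=9283KB, committed=9283KB)
- Arena Chunk (reserved=316KB, committed=316KB)
- Unknown (reserved=4096KB, committed=0KB)
"""

# Matches "- <category> (reserved=<n>KB, committed=<n>KB)"; the "Total:" line
# does not start with "-" and is therefore skipped.
CATEGORY = re.compile(r"^-\s+(.+?)\s+\(reserved=(\d+)KB, committed=(\d+)KB\)$")


def committed_by_category(summary):
    """Return {category name: committed KB} for every category line."""
    result = {}
    for line in summary.splitlines():
        match = CATEGORY.match(line.strip())
        if match:
            result[match.group(1)] = int(match.group(3))
    return result


committed = committed_by_category(NMT_SUMMARY)
# Everything the JVM has committed outside the Java heap.
non_heap_kb = sum(kb for name, kb in committed.items() if name != "Java Heap")

# Print categories from largest to smallest, then the non-heap total.
for name, kb in sorted(committed.items(), key=lambda item: -item[1]):
    print(f"{name:<24}{kb / 1024:>10.1f} MB")
print(f"{'Non-heap committed':<24}{non_heap_kb / 1024:>10.1f} MB")
```

The non-heap sum comes to 1457240KB (roughly 1.4GB), which equals the reported total of 3198040KB minus the 1740800KB of committed heap, and "Internal" alone accounts for more than half of it.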
What seems most suspicious is the "Internal" memory, which uses up to 800MB of native memory, but I don't know what this "Internal" category is (it's explained only very roughly in the category documentation), and a more detailed stack trace doesn't give me any information either. Besides, this "Internal" memory drops to a small value from time to time, so I don't think there is a native memory leak here.

We also have somewhat large "Code" and "Class" memory usage, but this is also normal, as we generate lots of Java code when running SQL.

Note that besides the two surefire processes, the Maven process and other processes will also consume memory. So it seems that we should either enlarge the memory size of the container, make the heap size limit smaller, or run these test cases in a single process.

> TableEnvironmentITCase is crashing on Travis
> --------------------------------------------
>
> Key: FLINK-16636
> URL: https://issues.apache.org/jira/browse/FLINK-16636
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.11.0
> Reporter: Jark Wu
> Assignee: Caizhi Weng
> Priority: Blocker
> Labels: pull-request-available, test-stability
> Fix For: 1.11.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Here is the instance and exception stack:
> https://api.travis-ci.org/v3/job/663408376/log.txt
> But there is not too much helpful information there, maybe an accidental Maven
> problem.
> {code}
> 09:55:07.703 [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-surefire-plugin:2.22.1:test
> (integration-tests) on project flink-table-planner-blink_2.11: There are test
> failures.
> 09:55:07.703 [ERROR]
> 09:55:07.703 [ERROR] Please refer to
> /home/travis/build/apache/flink/flink-table/flink-table-planner-blink/target/surefire-reports
> for the individual test results.
> 09:55:07.703 [ERROR] Please refer to dump files (if any exist) [date].dump,
> [date]-jvmRun[N].dump and [date].dumpstream.
> 09:55:07.703 [ERROR] ExecutionException The forked VM terminated without
> properly saying goodbye. VM crash or System.exit called?
> 09:55:07.703 [ERROR] Command was /bin/sh -c cd
> /home/travis/build/apache/flink/flink-table/flink-table-planner-blink/target
> && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m
> -Dmvn.forkNumber=1 -XX:+UseG1GC -jar
> /home/travis/build/apache/flink/flink-table/flink-table-planner-blink/target/surefire/surefirebooter714252487017838305.jar
>
> /home/travis/build/apache/flink/flink-table/flink-table-planner-blink/target/surefire
> 2020-03-17T09-34-41_826-jvmRun1 surefire4625103637332937565tmp
> surefire_43192129054983363633tmp
> 09:55:07.703 [ERROR] Error occurred in starting fork, check output in log
> 09:55:07.703 [ERROR] Process Exit Code: 137
> 09:55:07.703 [ERROR] Crashed tests:
> 09:55:07.703 [ERROR] org.apache.flink.table.api.TableEnvironmentITCase
> 09:55:07.703 [ERROR]
> org.apache.maven.surefire.booter.SurefireBooterForkException:
> ExecutionException The forked VM terminated without properly saying goodbye.
> VM crash or System.exit called?
> 09:55:07.703 [ERROR] Command was /bin/sh -c cd
> /home/travis/build/apache/flink/flink-table/flink-table-planner-blink/target
> && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m
> -Dmvn.forkNumber=1 -XX:+UseG1GC -jar
> /home/travis/build/apache/flink/flink-table/flink-table-planner-blink/target/surefire/surefirebooter714252487017838305.jar
>
> /home/travis/build/apache/flink/flink-table/flink-table-planner-blink/target/surefire
> 2020-03-17T09-34-41_826-jvmRun1 surefire4625103637332937565tmp
> surefire_43192129054983363633tmp
> 09:55:07.703 [ERROR] Error occurred in starting fork, check output in log
> 09:55:07.703 [ERROR] Process Exit Code: 137
> 09:55:07.703 [ERROR] Crashed tests:
> 09:55:07.703 [ERROR] org.apache.flink.table.api.TableEnvironmentITCase
> 09:55:07.703 [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:510)
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkOnceMultiple(ForkStarter.java:382)
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:297)
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:246)
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1183)
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1011)
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:857)
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
> 09:55:07.704 [ERROR] at
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
> 09:55:07.704 [ERROR] at
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
> 09:55:07.704 [ERROR] at
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
> 09:55:07.704 [ERROR] at
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
> 09:55:07.704 [ERROR] at
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
> 09:55:07.704 [ERROR] at
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
> 09:55:07.704 [ERROR] at
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)
> 09:55:07.704 [ERROR] at
> org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)
> 09:55:07.704 [ERROR] at
> org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155)
> 09:55:07.704 [ERROR] at
> org.apache.maven.cli.MavenCli.execute(MavenCli.java:584)
> 09:55:07.704 [ERROR] at
> org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216)
> 09:55:07.704 [ERROR] at org.apache.maven.cli.MavenCli.main(MavenCli.java:160)
> 09:55:07.704 [ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> 09:55:07.704 [ERROR] at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 09:55:07.704 [ERROR] at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 09:55:07.704 [ERROR] at java.lang.reflect.Method.invoke(Method.java:498)
> 09:55:07.704 [ERROR] at
> org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
> 09:55:07.704 [ERROR] at
> org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
> 09:55:07.704 [ERROR] at
> org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
> 09:55:07.704 [ERROR] at
> org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
> 09:55:07.704 [ERROR] Caused by:
> org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM
> terminated without properly saying goodbye. VM crash or System.exit called?
> 09:55:07.704 [ERROR] Command was /bin/sh -c cd
> /home/travis/build/apache/flink/flink-table/flink-table-planner-blink/target
> && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m
> -Dmvn.forkNumber=1 -XX:+UseG1GC -jar
> /home/travis/build/apache/flink/flink-table/flink-table-planner-blink/target/surefire/surefirebooter714252487017838305.jar
>
> /home/travis/build/apache/flink/flink-table/flink-table-planner-blink/target/surefire
> 2020-03-17T09-34-41_826-jvmRun1 surefire4625103637332937565tmp
> surefire_43192129054983363633tmp
> 09:55:07.704 [ERROR] Error occurred in starting fork, check output in log
> 09:55:07.704 [ERROR] Process Exit Code: 137
> 09:55:07.704 [ERROR] Crashed tests:
> 09:55:07.704 [ERROR] org.apache.flink.table.api.TableEnvironmentITCase
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:669)
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.access$600(ForkStarter.java:115)
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter$1.call(ForkStarter.java:371)
> 09:55:07.704 [ERROR] at
> org.apache.maven.plugin.surefire.booterclient.ForkStarter$1.call(ForkStarter.java:347)
> 09:55:07.704 [ERROR] at
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 09:55:07.704 [ERROR] at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 09:55:07.704 [ERROR] at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 09:55:07.704 [ERROR] at java.lang.Thread.run(Thread.java:748)
> 09:55:07.704 [ERROR] -> [Help 1]
> 09:55:07.704 [ERROR]
> 09:55:07.704 [ERROR] To see the full stack trace of the errors, re-run Maven
> with the -e switch.
> 09:55:07.704 [ERROR] Re-run Maven using the -X switch to enable full debug
> logging.
> 09:55:07.704 [ERROR]
> 09:55:07.704 [ERROR] For more information about the errors and possible
> solutions, please read the following articles:
> 09:55:07.704 [ERROR] [Help 1]
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> 09:55:07.704 [ERROR]
> 09:55:07.704 [ERROR] After correcting the problems, you can resume the build
> with the command
> 09:55:07.704 [ERROR] mvn <goals> -rf :flink-table-planner-blink_2.11
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)