Hello,
I was wondering whether anyone has successfully run Spark unit tests without recompiling every time, or knows how to do so. Because of limitations of our test systems and of behavior on big-endian / EBCDIC-encoded systems, we are forced to run unit tests in several phases. Currently we have the following pipeline:

    Build Spark: 45-60 min
    Initial Unit Testing: 2 hrs
    Missing Unit Tests: 3 hrs
    Failed Unit Tests: 1 hr

Obviously, building Spark requires compiling everything, but each subsequent stage then recompiles everything as well. The following occurs in each stage:

Build Spark
Build Spark without unit tests and create a package via make-distribution.

    mvn -e -Dhive -Dhive-thriftserver -Dhadoop-2.10 -DskipTests clean package

Initial Unit Testing
Run the unit tests.

    mvn -e -fn -Dhive -Dhive-thriftserver -Dhadoop-2.10 test

Missing Unit Testing
Compare the executed tests (determined from surefire-reports) against the available tests. Tests are missed when a unit test causes a JVM error, possibly due to an OOM error, and the remaining unit tests of that project are skipped. The missing tests are then run collectively against the specific project; for example, any missing ScalaTest suites under core are run.

    mvn -pl external/flume -am -e -fn -Phive -Phive-thriftserver -Phadoop-2.10 -DwildcardSuites=none -Dtest=org.apache.spark.streaming.flume.JavaFlumePollingStreamSuite,org.apache.spark.streaming.flume.JavaFlumeStreamSuite test

Failed Unit Testing
Look at the test results (determined from surefire-reports) and re-run any failing tests by themselves. This resolved a large number of failures that came from flaky tests.

    mvn -pl core -am -e -fn -Phive -Phive-thriftserver -Phadoop-2.10 -DwildcardSuites=org.apache.spark.util.UtilsSuite -Dtest=none test

As shown above, every run of mvn recompiles everything, even though the code hasn't changed. I'd like to compile the tests once during the Build Spark stage and simply run them in the unit testing stages. This would speed up our pipeline drastically.
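To make the comparison steps in the Missing and Failed stages concrete, here is a minimal sketch in Python of how the surefire-reports directories could be scanned. It is illustrative only and not our actual tooling: the function names read_reports and classify, and the idea of passing in a list of expected suite names, are assumptions made for the example. It also assumes JUnit-style TEST-*.xml report files with testsuite/testcase elements, where a failure or error child marks a failed test.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def read_reports(report_dir):
    """Map each reported suite name to True if any of its test cases failed.

    Assumes JUnit-style XML files named TEST-*.xml in report_dir.
    """
    results = {}
    for path in Path(report_dir).glob("TEST-*.xml"):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            # A JVM crash can leave a truncated report; treat the suite
            # as not reported so that it ends up in the "missing" list.
            continue
        # iter() also yields the root itself, so this handles both a bare
        # <testsuite> root and a <testsuites> wrapper.
        for suite in root.iter("testsuite"):
            name = suite.get("name")
            if name is None:
                continue
            failed = any(
                case.find("failure") is not None or case.find("error") is not None
                for case in suite.iter("testcase")
            )
            results[name] = results.get(name, False) or failed
    return results


def classify(expected_suites, results):
    """Return (missing, failed) suite names, each sorted alphabetically.

    expected_suites is a hypothetical list of all suites that should have run.
    """
    missing = sorted(set(expected_suites) - set(results))
    failed = sorted(name for name, bad in results.items() if bad)
    return missing, failed
```

The two lists could then be joined with commas and passed to -Dtest= or -DwildcardSuites= in the Missing and Failed stages, as in the commands above.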
Any suggestions are appreciated.

Additional information:

Java options are set to: -Dfile.encoding=UTF8 -Xmx4g -Xss1024k -Dconsole.encoding=IBM-1047 -XX:MaxPermSize=512m -XX:ReservedCodeCacheSize=512m

We have updated Maven over the years, but are unaware of any new features that would help: 3.3.9, 3.5.4, 3.6.3, and currently 3.8.1.

We attempted multi-core compilation years ago, to no avail (but are willing to try again if it is suggested).

Zinc was also attempted years ago, but we weren't able to port it over to our system at the time.

Thanks again!

Sincerely,
Nicholas T. Marion
AI and Analytics Development Lead | IzODA CPO
Mobile: 1 845 649 3592
E-mail: nmar...@us.ibm.com
IBM