It may be related to the KNN test case that was merged yesterday. I'll look into the ML test.

Regards,
Chiwan Park

> On May 31, 2016, at 5:38 PM, Ufuk Celebi <u...@apache.org> wrote:
>
> Currently, an ML test is failing reliably, and some HA tests fail
> occasionally. Is someone looking into the ML test?
>
> For HA, I will revert a commit that might be causing the HA
> instabilities. Till is working on a proper fix as far as I know.
>
> On Tue, May 31, 2016 at 3:50 AM, Chiwan Park <chiwanp...@apache.org> wrote:
>> Thanks for the great work! :-)
>>
>> Regards,
>> Chiwan Park
>>
>>> On May 31, 2016, at 7:47 AM, Flavio Pompermaier <pomperma...@okkam.it> wrote:
>>>
>>> Awesome work guys!
>>> And even more thanks for the detailed report. This troubleshooting summary
>>> will undoubtedly be useful for all our Maven projects!
>>>
>>> Best,
>>> Flavio
>>>
>>> On 30 May 2016 23:47, "Ufuk Celebi" <u...@apache.org> wrote:
>>>
>>>> Thanks for the effort, Max and Stephan! Happy to see the green light again.
>>>>
>>>> On Mon, May 30, 2016 at 11:03 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>> Hi all!
>>>>>
>>>>> After a few weeks of terrible build issues, I am happy to announce that
>>>>> the build works properly again, and we actually get meaningful CI results.
>>>>>
>>>>> Here is a story in many acts, from builds deep red to bright green joy.
>>>>> Kudos to Max, who did most of this troubleshooting. This evening, Max and
>>>>> I debugged the final issue and got the build back on track.
>>>>>
>>>>> ------------------
>>>>> The Journey
>>>>> ------------------
>>>>>
>>>>> (1) Failsafe Plugin
>>>>>
>>>>> The Maven Failsafe Build Plugin had a critical bug due to which failed
>>>>> tests did not result in a failed build.
>>>>>
>>>>> That is a pretty bad bug for a plugin whose only task is to run tests and
>>>>> fail the build if a test fails.
>>>>>
>>>>> After we recognized that, we upgraded the Failsafe Plugin.
>>>>>
>>>>>
>>>>> (2) Failsafe Plugin Dependency Issues
>>>>>
>>>>> After the upgrade, the Failsafe Plugin behaved differently and did not
>>>>> interoperate with Dependency Shading any more.
>>>>>
>>>>> Because of that, we switched to the Surefire Plugin.
>>>>>
>>>>>
>>>>> (3) Fixing all the issues introduced in the meantime
>>>>>
>>>>> Naturally, a number of test instabilities had been introduced, which
>>>>> needed to be fixed.
>>>>>
>>>>>
>>>>> (4) Yarn Tests and Test Scope Refactoring
>>>>>
>>>>> In the meantime, a Pull Request was merged that moved the Yarn Tests to
>>>>> the test scope.
>>>>> Because the configuration searched for tests in the "main" scope, no Yarn
>>>>> tests were executed for a while, until the scope was fixed.
>>>>>
>>>>>
>>>>> (5) Yarn Tests and JMX Metrics
>>>>>
>>>>> After the Yarn Tests were re-activated, we saw them fail due to warnings
>>>>> created by the newly introduced metrics code. We could fix that by
>>>>> updating the metrics code and temporarily not registering JMX beans for
>>>>> all metrics.
>>>>>
>>>>>
>>>>> (6) Yarn / Surefire Deadlock
>>>>>
>>>>> Finally, some Yarn tests failed reliably in Maven (though not in the
>>>>> IDE). It turned out that those tests exercise a command line interface
>>>>> that interacts with the standard input stream.
>>>>>
>>>>> The newly deployed Surefire Plugin uses standard input as well, for
>>>>> communication with forked JVMs. Since Surefire internally locks the
>>>>> standard input stream, the Yarn CLI cannot poll the standard input stream
>>>>> without locking up and stalling the tests.
>>>>>
>>>>> We adjusted the tests and now the build happily builds again.
>>>>>
>>>>> -----------------
>>>>> Conclusions:
>>>>> -----------------
>>>>>
>>>>> - CI is terribly crucial. It took us weeks to deal with the fallout of
>>>>> a period of unreliable CI.
>>>>>
>>>>> - Maven could do a better job. A bug as crucial as the one that started
>>>>> our problem should not occur in a test plugin like Surefire. Also, the
>>>>> constant change of semantics and dependency scopes is annoying. The
>>>>> semantic changes are subtle, but for a build as complex as Flink, they
>>>>> make a difference.
>>>>>
>>>>> - File-based communication is rarely a good idea. The bug in the
>>>>> Failsafe Plugin was caused by improper file-based communication, and so
>>>>> were some of the instabilities we discovered.
>>>>>
>>>>> Greetings,
>>>>> Stephan
>>>>>
>>>>>
>>>>> PS: Some issues and mysteries remain for us to solve: When we allow our
>>>>> metrics subsystem to register JMX beans, we see some tests failing due to
>>>>> spontaneous JVM process kills. Whoever has a pointer there, please ping
>>>>> us!
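
The deadlock in step (6) arises because the CLI under test and Surefire both use the process-wide standard input stream. One way to keep such a test off `System.in`, sketched here as an assumption and not as the change that was actually made in Flink, is to let the CLI read from an injected `InputStream`. The class `InjectableCli` and its `stop` command are hypothetical and exist only for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical interactive CLI; NOT Flink's actual Yarn CLI. */
public class InjectableCli {
    private final BufferedReader reader;

    /** Production code would pass System.in; tests pass an in-memory stream. */
    public InjectableCli(InputStream in) {
        this.reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    /** Reads commands until "stop" or end of stream; returns how many commands preceded the stop. */
    public int run() {
        int handled = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().equals("stop")) {
                    break;
                }
                handled++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return handled;
    }

    public static void main(String[] args) {
        // The test never touches System.in, so it cannot contend with the test runner for it.
        InputStream fake = new ByteArrayInputStream(
                "help\nstatus\nstop\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(new InjectableCli(fake).run());
    }
}
```

With this shape, a test drives the CLI from a byte array, and only the production entry point wires in `System.in`.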