Hi Ken, regarding the failed tests: - cascading.JoinFieldedPipesPlatformTest$testJoinMergeGroupBy is expected to fail due to restrictions in the MR/Tez engines. If I remember correctly, this is about deadlocks that need to be resolved by splitting a job. Flink's optimizer detects such situations and places a dam breaker to resolve such a situation within a single job and is hence able to execute the job correctly. - cascading.ComparePlatformsTest$CompareTestCase I think you are right on this one. When I implemented the runner, I did not find a way to make this tests pass. It looked like an issue with the test itself as you assumed as well.
Btw. I ported the runner to Flink 1.0 and bumped the Cascading 3.1 WIP version already, but haven't done an "official" release yet. You find the code in the flink-1.0 branch [1]. With Flink 1.0, we also extended the support for outer joins. It might be possible to get rid of some of the HashJoin restrictions, but I have to take a closer look at how outer hash joins are done with Cascading MR/Tez. Anyway, I can do a Cascading-Flink release for Flink 1.0 soon and extend HashJoin support later. Best, Fabian [1] https://github.com/dataartisans/cascading-flink/tree/flink-1.0 2016-03-30 6:08 GMT+02:00 Ken Krugler <kkrugler_li...@transpac.com>: > Hi Fabian, > > > From: Fabian Hueske > > Sent: March 29, 2016 3:51:08pm PDT > > To: dev@flink.apache.org > > Subject: Re: Expected duration for cascading-flink tests? > > > > Hi Ken, > > > > no, this is definitely not expected. The tests complete in about 30 mins > on > > my machine. > > Is it possible that you have another Flink process running on your > machine > > (maybe a debug thread in your IDE)? That could explain the "Address > already > > in use" exceptions. > > Good call - I'd run "bin/stop-local.sh" previously, but I see that there's > still the Flink process running. > > Re-running bin/stop-local.sh displays "No jobmanager daemon to stop on > host Kens-MacBook-Air.local.", but still doesn't kill off the Flink process. > > What might cause that situation? > > In any case, I manually killed the process and started the build again, > and it finished in about 20 minutes, which is great. > > I see the expected errors, e.g. > > HashJoin does only support InnerJoin and LeftJoin but is > cascading.pipe.joiner.OuterJoin > > though this one seems odd: > > > testJoinMergeGroupBy(cascading.JoinFieldedPipesPlatformTest) Time > elapsed: 0.048 sec <<< FAILURE! > > junit.framework.AssertionFailedError: planner should throw error on plan > > FlinkTestPlatform needs to return true from supportsGroupByAfterMerge() - > assuming that this is actually the case (seems reasonable for Flink) > > Though making that change requires cascading-wip-56 to avoid a compilation > error on the @Override. > > There's also this one: > > > Running cascading.ComparePlatformsTest$CompareTestCase > > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.053 > sec <<< FAILURE! - in cascading.ComparePlatformsTest$CompareTestCase > > warning(junit.framework.TestSuite$1) Time elapsed: 0.009 sec <<< > FAILURE! > > junit.framework.AssertionFailedError: Class > cascading.ComparePlatformsTest$CompareTestCase has no public constructor > TestCase(String name) or TestCase() > > at junit.framework.Assert.fail(Assert.java:57) > > at junit.framework.TestCase.fail(TestCase.java:227) > > at junit.framework.TestSuite$1.runTest(TestSuite.java:100) > > > But that seems like an issue with the Cascading test code. I'll check > w/Chris and see what he says. > > Anyway, the build worked with the update to cascading-wip-56. > > I also tried updating to Flink 1.0.0 (from 0.10.0), but so far I've run > into some compilation errors, e.g. in FlinkFlowStep.java it can't find the > JavaPlan class. > > Thanks again for the help, > > -- Ken > > > > > " > > Best, Fabian > > > > 2016-03-29 20:36 GMT+02:00 Ken Krugler <kkrugler_li...@transpac.com>: > > > >> An update (and a nudge)… > >> > >> So far it's been more than 20 hours, and the tests are still running. > >> > >> Most tests seem to fail with one of two different errors… > >> > >> 1. Address already in use > >> > >> cascading.flow.FlowException: [test] unhandled exception > >> at cascading.flow.BaseFlow.complete(BaseFlow.java:977) > >> at > >> > cascading.flow.FlowStrategiesPlatformTest.testSkipStrategiesReplace(FlowStrategiesPlatformTest.java:67) > >> Caused by: org.jboss.netty.channel.ChannelException: Failed to bind to: > / > >> 127.0.0.1:6123 > >> at > >> org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) > >> … > >> Caused by: java.net.BindException: Address already in use > >> … > >> > >> 2. FlowStepJob.blockOnJob throws a cascading.flow.FlowException > >> > >> All caused by a 100 second timeout > >> > >> Is the above expected? > >> > >> Thanks, > >> > >> -- Ken > >> > >>> From: Ken Krugler > >>> Sent: March 28, 2016 3:39:12pm PDT > >>> To: dev@flink.apache.org > >>> Subject: Expected duration for cascading-flink tests? > >>> > >>> Hi all, > >>> > >>> I'm curious how long the tests are expected to take for > cascading-flink. > >>> > >>> I know that https://github.com/dataArtisans/cascading-flink recommends > >> running mvn clean install with -DskipTests, but I was going to try > updating > >> to flink 1.0.0 (currently using 0.10.0) and cascading 3.1.0-wip-56 > >> (currently on wip-39), so I wanted to first verify that all tests passed > >> before updating and then running the tests again. > >>> > >>> In any case, the tests have been running for about 2.5 hours now. From > >> what I can tell, it's legit - most of the time is tied to > >> cascading.flow.planner.rul.RuleSetExec's call() method. > >>> > >>> Maybe this is a sign that it's time for a new Mac :) > >>> > >>> Thanks, > >>> > >>> -- Ken > > -------------------------- > Ken Krugler > +1 530-210-6378 > http://www.scaleunlimited.com > custom big data solutions & training > Hadoop, Cascading, Cassandra & Solr > > > > > >