Hi Ken, I'm currently on vacation and will be back in a week. Would you like to open an issue at the cascading-flink Github project a describe the Scheme.setNumSinkParts() problem? I'll try to fix it when I'm back.
Thanks for checking with Chris the ComparePlatformsTest issue. I'll exclude that test case. Thanks, Fabian 2016-03-30 21:46 GMT+02:00 Ken Krugler <kkrugler_li...@transpac.com>: > Hi Fabian, > > I've been trying out the cascading-flink 1.0 branch (updated to > cascading-3.1-wip-56) with our cascading.utils project. > > I ran into one initial challenge, where older Kryo versions don't work > with Flink - it seems like it has to be 2.24.0, otherwise you get a > no-such-method error (2.19) or an odd hang while Kryo is trying to read > (2.21). So there was a bit of version management required. I noticed that > Flink has a dependency on Chill 0.7.4, which depends on Kryo 2.21. > > After that change, our tests run, but it looks like the Flink planner is > ignore the Scheme.setNumSinkParts() call. > > E.g. Scheme.setNumSinkParts(1) should result in a single part-00000 file, > and thus the upstream grouping should implicitly have a parallelism of 1. > > This is described as a suggestion (e.g. if your Flow only has maps then no > such parallelism can be guaranteed) but it does wind up being relied upon > by many workflows, when generating a small output file that has to be > globally sorted. > > Thanks, > > -- Ken > > PS - Chris Wensel responded to the > cascading.ComparePlatformsTest$CompareTestCase issue, and said: > > > make sure you ‘exclude’ *TestCase from your unit test pattern. > > > > all Cascading tests are *PlatformTest and *Test. there are no tests in > *TestCase > > > > > > From: Fabian Hueske > > Sent: March 30, 2016 2:04:15am PDT > > To: dev@flink.apache.org > > Subject: Re: Expected duration for cascading-flink tests? > > > > Hi Ken, > > > > regarding the failed tests: > > - cascading.JoinFieldedPipesPlatformTest$testJoinMergeGroupBy is expected > > to fail due to restrictions in the MR/Tez engines. If I remember > correctly, > > this is about deadlocks that need to be resolved by splitting a job. > > Flink's optimizer detects such situations and places a dam breaker to > > resolve such a situation within a single job and is hence able to execute > > the job correctly. > > - cascading.ComparePlatformsTest$CompareTestCase I think you are right on > > this one. When I implemented the runner, I did not find a way to make > this > > tests pass. It looked like an issue with the test itself as you assumed > as > > well. > > > > Btw. I ported the runner to Flink 1.0 and bumped the Cascading 3.1 > > WIP version already, but haven't done an "official" release yet. You find > > the code in the flink-1.0 branch [1]. With Flink 1.0, we also extended > the > > support for outer joins. It might be possible to get rid of some of the > > HashJoin restrictions, but I have to take a closer look at how outer hash > > joins are done with Cascading MR/Tez. > > Anyway, I can do a Cascading-Flink release for Flink 1.0 soon and extend > > HashJoin support later. > > > > Best, Fabian > > > > [1] https://github.com/dataartisans/cascading-flink/tree/flink-1.0 > > > > 2016-03-30 6:08 GMT+02:00 Ken Krugler <kkrugler_li...@transpac.com>: > > > >> Hi Fabian, > >> > >>> From: Fabian Hueske > >>> Sent: March 29, 2016 3:51:08pm PDT > >>> To: dev@flink.apache.org > >>> Subject: Re: Expected duration for cascading-flink tests? > >>> > >>> Hi Ken, > >>> > >>> no, this is definitely not expected. The tests complete in about 30 > mins > >> on > >>> my machine. > >>> Is it possible that you have another Flink process running on your > >> machine > >>> (maybe a debug thread in your IDE)? That could explain the "Address > >> already > >>> in use" exceptions. > >> > >> Good call - I'd run "bin/stop-local.sh" previously, but I see that > there's > >> still the Flink process running. > >> > >> Re-running bin/stop-local.sh displays "No jobmanager daemon to stop on > >> host Kens-MacBook-Air.local.", but still doesn't kill off the Flink > process. > >> > >> What might cause that situation? > >> > >> In any case, I manually killed the process and started the build again, > >> and it finished in about 20 minutes, which is great. > >> > >> I see the expected errors, e.g. > >> > >> HashJoin does only support InnerJoin and LeftJoin but is > >> cascading.pipe.joiner.OuterJoin > >> > >> though this one seems odd: > >> > >>> testJoinMergeGroupBy(cascading.JoinFieldedPipesPlatformTest) Time > >> elapsed: 0.048 sec <<< FAILURE! > >>> junit.framework.AssertionFailedError: planner should throw error on > plan > >> > >> FlinkTestPlatform needs to return true from supportsGroupByAfterMerge() > - > >> assuming that this is actually the case (seems reasonable for Flink) > >> > >> Though making that change requires cascading-wip-56 to avoid a > compilation > >> error on the @Override. > >> > >> There's also this one: > >> > >>> Running cascading.ComparePlatformsTest$CompareTestCase > >>> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.053 > >> sec <<< FAILURE! - in cascading.ComparePlatformsTest$CompareTestCase > >>> warning(junit.framework.TestSuite$1) Time elapsed: 0.009 sec <<< > >> FAILURE! > >>> junit.framework.AssertionFailedError: Class > >> cascading.ComparePlatformsTest$CompareTestCase has no public constructor > >> TestCase(String name) or TestCase() > >>> at junit.framework.Assert.fail(Assert.java:57) > >>> at junit.framework.TestCase.fail(TestCase.java:227) > >>> at junit.framework.TestSuite$1.runTest(TestSuite.java:100) > >> > >> > >> But that seems like an issue with the Cascading test code. I'll check > >> w/Chris and see what he says. > >> > >> Anyway, the build worked with the update to cascading-wip-56. > >> > >> I also tried updating to Flink 1.0.0 (from 0.10.0), but so far I've run > >> into some compilation errors, e.g. in FlinkFlowStep.java it can't find > the > >> JavaPlan class. > >> > >> Thanks again for the help, > >> > >> -- Ken > >> > >> > >> > >>> " > >>> Best, Fabian > >>> > >>> 2016-03-29 20:36 GMT+02:00 Ken Krugler <kkrugler_li...@transpac.com>: > >>> > >>>> An update (and a nudge)… > >>>> > >>>> So far it's been more than 20 hours, and the tests are still running. > >>>> > >>>> Most tests seem to fail with one of two different errors… > >>>> > >>>> 1. Address already in use > >>>> > >>>> cascading.flow.FlowException: [test] unhandled exception > >>>> at cascading.flow.BaseFlow.complete(BaseFlow.java:977) > >>>> at > >>>> > >> > cascading.flow.FlowStrategiesPlatformTest.testSkipStrategiesReplace(FlowStrategiesPlatformTest.java:67) > >>>> Caused by: org.jboss.netty.channel.ChannelException: Failed to bind > to: > >> / > >>>> 127.0.0.1:6123 > >>>> at > >>>> > org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) > >>>> … > >>>> Caused by: java.net.BindException: Address already in use > >>>> … > >>>> > >>>> 2. FlowStepJob.blockOnJob throws a cascading.flow.FlowException > >>>> > >>>> All caused by a 100 second timeout > >>>> > >>>> Is the above expected? > >>>> > >>>> Thanks, > >>>> > >>>> -- Ken > >>>> > >>>>> From: Ken Krugler > >>>>> Sent: March 28, 2016 3:39:12pm PDT > >>>>> To: dev@flink.apache.org > >>>>> Subject: Expected duration for cascading-flink tests? > >>>>> > >>>>> Hi all, > >>>>> > >>>>> I'm curious how long the tests are expected to take for > >> cascading-flink. > >>>>> > >>>>> I know that https://github.com/dataArtisans/cascading-flink > recommends > >>>> running mvn clean install with -DskipTests, but I was going to try > >> updating > >>>> to flink 1.0.0 (currently using 0.10.0) and cascading 3.1.0-wip-56 > >>>> (currently on wip-39), so I wanted to first verify that all tests > passed > >>>> before updating and then running the tests again. > >>>>> > >>>>> In any case, the tests have been running for about 2.5 hours now. > From > >>>> what I can tell, it's legit - most of the time is tied to > >>>> cascading.flow.planner.rul.RuleSetExec's call() method. > >>>>> > >>>>> Maybe this is a sign that it's time for a new Mac :) > >>>>> > >>>>> Thanks, > >>>>> > >>>>> -- Ken > > > > > > -------------------------- > Ken Krugler > +1 530-210-6378 > http://www.scaleunlimited.com > custom big data solutions & training > Hadoop, Cascading, Cassandra & Solr > > > > > >