An alternative would be to contribute to the Gradle-based build and
configure a test set with the specific dependencies/configurations that are
needed on a per-runner basis, without having to activate a mixed bag of
profiles that leads to runner-specific dependency conflicts and resolution
problems.
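
A per-runner test set in Gradle could look roughly like the sketch below. All names here (the `flinkIT` source set, the task name, the artifact coordinates) are hypothetical, not actual Beam build entries:

```groovy
// Sketch only: a dedicated source set per runner, so each runner's
// integration tests resolve their own dependency graph in isolation.
sourceSets {
    flinkIT {
        java.srcDir 'src/flinkIT/java'
        compileClasspath += sourceSets.main.output
        runtimeClasspath += sourceSets.main.output
    }
}

dependencies {
    // Only the Flink runner's (Scala 2.11) dependencies are visible on this
    // classpath; no other runner profile has to be active at the same time.
    flinkITCompile project(':examples:java')
    flinkITCompile 'org.apache.beam:beam-runners-flink_2.11:2.2.0' // hypothetical coordinate
}

task flinkIntegrationTest(type: Test) {
    testClassesDirs = sourceSets.flinkIT.output.classesDirs
    classpath = sourceSets.flinkIT.runtimeClasspath
}
```

Each runner would get its own source set and test task, so dependency resolution never has to reconcile two runners' transitive graphs.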

On Tue, Nov 7, 2017 at 9:16 AM, Kenneth Knowles <k...@google.com.invalid>
wrote:

> Is there a JIRA for this issue? I've just cut
> https://github.com/apache/beam/pull/4093 since I don't think we actually
> need them in some of those.
>
> I believe the precommit is the only place where we actually need both, to
> run the examples' integration tests with each runner (albeit currently in
> local mode for most runners).
>
> The only solutions I can think of are two executions/jobs or separate
> modules that set things up explicitly. I believe the latter is probably
> more robust to changes in the build, like so:
>
> runners/flink/examples-integration-tests
>     -> runners/flink
>     -> examples/java
>
> Now this module should be fairly flexible in doing what it needs to
> do. Our dependency graph won't be intuitive from the directory structure,
> so maybe it should have a different home (maybe all ITs by their nature
> should live alongside the other modules).
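
Sketching the module layout above as a Maven POM (the artifact IDs are illustrative, not actual Beam coordinates):

```xml
<!-- Hypothetical runners/flink/examples-integration-tests/pom.xml -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <artifactId>beam-runners-flink-examples-integration-tests</artifactId>
  <dependencies>
    <!-- The module pins exactly one runner plus the examples, so no
         combination of runner profiles ever has to be active together. -->
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-runners-flink_2.11</artifactId>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-examples-java</artifactId>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
```
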
>
> On Tue, Nov 7, 2017 at 7:38 AM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
> > I'd like to do it, yes, but I've been swamped lately with Flink work.
> >
> > Also, the situation with Jenkins/Maven (especially the Pipeline Job
> > changes) makes it somewhat unclear how I should proceed because I don't
> > want to add expensive pre-commit hooks. The number of profiles that would
> > need splitting is also quite high:
> >
> > ~/D/i/.test-infra (master|✔) $ ag flink-runner
> > jenkins/job_beam_PreCommit_Java_MavenInstall.groovy
> > 47:    '--activate-profiles release,jenkins-precommit,direct-runner,dataflow-runner,spark-runner,flink-runner,apex-runner',
> >
> > jenkins/job_beam_PreCommit_Python_MavenInstall.groovy
> > 47:    --activate-profiles release,jenkins-precommit,direct-runner,dataflow-runner,spark-runner,flink-runner,apex-runner \
> >
> > jenkins/job_beam_Java_UnitTest.groovy
> > 35:    'flink-runner',
> >
> > jenkins/job_beam_PreCommit_Go_MavenInstall.groovy
> > 47:    --activate-profiles release,jenkins-precommit,direct-runner,dataflow-runner,spark-runner,flink-runner,apex-runner \
> >
> > jenkins/job_beam_Java_Build.groovy
> > 51:    'flink-runner',
> >
> > jenkins/job_beam_Java_IntegrationTest.groovy
> > 36:    'flink-runner',
> > 48:    'flink-runner-integration-tests',
> >
> > These are all jobs where we build with several runner profiles active
> > at the same time.
> >
> > Unfortunately, this is becoming somewhat pressing now since Flink 1.4 will
> > drop support for Scala 2.10, meaning we have to update the Flink
> > Runner to 2.11 deps.
> >
> > > On 27. Oct 2017, at 09:52, Kenneth Knowles <k...@google.com.INVALID>
> > > wrote:
> > >
> > > You are entirely correct in how you would pull this off - groovy files
> > > and tweaking the profiles. Seeding is done daily, or also by commenting
> > > "Run Seed Job" on a pull request. One thing to consider, in light of
> > > recent conversations, is making new jobs that run post-commit or by
> > > request only, or that are multi-step, in order to avoid running lots of
> > > extra tests, etc.
> > >
> > > Do you think you might have time to work on this goal of splitting apart
> > > jobs that require splitting?
> > >
> > >
> > > On Wed, Oct 11, 2017 at 2:08 AM, Aljoscha Krettek <aljos...@apache.org>
> > > wrote:
> > >
> > >> I also like option 2 (allowing differing dependencies for runners)
> > >> better. With the current situation this would mean splitting
> > >> PreCommit_Java_MavenInstall (and possibly also
> > >> PreCommit_Python_MavenInstall and PreCommit_Go_MavenInstall) into
> > >> separate jobs. For my goals splitting into one job for
> > >> "direct-runner,dataflow-runner,spark-runner,apex-runner" and one for
> > >> "flink-runner" would be enough, so we should probably go with that
> > >> until we have the "custom make" solution.
> > >>
> > >> What do you think?
> > >>
> > >> @Jason For pulling this off I would copy the groovy files in
> > >> .test-infra and change the --activate-profiles line, right? Are there
> > >> still manual steps required for "re-seeding" the jobs?
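
A rough sketch of what the copied groovy files could converge to, assuming the Jenkins Job DSL that the existing .test-infra jobs use (the job names and profile groupings below are hypothetical):

```groovy
// Hypothetical Job DSL sketch: generate one precommit job per runner
// grouping, so each job activates only the profiles it actually needs.
def runnerGroups = [
    'flink' : 'direct-runner,flink-runner',
    'others': 'direct-runner,dataflow-runner,spark-runner,apex-runner',
]

runnerGroups.each { name, profiles ->
    mavenJob("beam_PreCommit_Java_MavenInstall_${name}") {
        goals("clean install --batch-mode --activate-profiles release,jenkins-precommit,${profiles}")
    }
}
```

Re-seeding would then regenerate all the per-runner jobs from this one loop instead of requiring a hand-edited copy per runner.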
> > >>
> > >>
> > >>
> > >>> On 9. Oct 2017, at 18:06, Kenneth Knowles <k...@google.com.INVALID>
> > >>> wrote:
> > >>>
> > >>> +1 to the goal, and replying inline on details.
> > >>>
> > >>> On Mon, Oct 9, 2017 at 8:06 AM, Aljoscha Krettek <aljos...@apache.org>
> > >>> wrote:
> > >>>
> > >>>>
> > >>>> - We want to update the Flink dependencies to _2.11 dependencies
> > >>>> because 2.10 is quite outdated
> > >>>
> > >>>> - This doesn't work well because some modules (examples, for example)
> > >>>> depend on all Runners and at least the Spark Runner has _2.10
> > >>>> dependencies
> > >>>>
> > >>>
> > >>> Let's expedite making this possible, and circle back to getting the
> > >>> build to an ideal state after unblocking this very reasonable change.
> > >>>
> > >>> It is not reasonable for any runner dependencies in Beam to be coupled,
> > >>> certainly not Scala version. We've been monolithic so far because it is
> > >>> easier to manage, but it was never a long-term solution. It will mean
> > >>> that the examples cannot have -P flink-runner and -P spark-runner at
> > >>> the same time. But my position is that we should never expect to be
> > >>> able to use two such profiles at the same time.
> > >>>
> > >>> Of course, if it is technically feasible to transition both runners (I
> > >>> don't really know about Spark's coupling with Scala versions) that is
> > >>> even easier and defers the larger issue for a bit.
> > >>>
> > >>>> I see two solutions for this:
> > >>>> - Introducing a project-wide Scala version property
> > >>>> - Allowing differing Scala versions for different runners, ensuring
> > >>>> that we never have a situation where we have several Runners as a
> > >>>> dependency. For the "Maven Install" pre-commit hook this could mean
> > >>>> splitting it up per runner.
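
The first of those two options could, as a sketch, be a single property in the parent POM; the property name below is an assumption, not an existing Beam property:

```xml
<!-- Hypothetical parent pom.xml fragment: bump the Scala binary version in
     one place for every runner module that is coupled to it. -->
<properties>
  <scala.binary.version>2.11</scala.binary.version>
</properties>

<!-- A runner module would then reference the shared property: -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
  <version>${flink.version}</version>
</dependency>
```

This keeps all runners on one Scala version, which is exactly the coupling the second option avoids.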
> > >>>>
> > >>>
> > >>> I support the latter regardless. I want separate configurations for
> > >>> separate runners, fully embracing the fact that they can diverge.
> > >>>
> > >>> We already intend to split up the build to be much more fine-grained.
> > >>> The only reason we have a monolithic precommit is Maven's extraordinary
> > >>> lack of support for any other way of building. We have essentially
> > >>> started to build something akin to a particular invocation of "make" in
> > >>> the form of interrelated Jenkins jobs to work around Maven's
> > >>> limitations [1]. Until we can get that approach working at 100% I am
> > >>> fine with splitting builds aggressively. You will find that it is quite
> > >>> hard not to duplicate a lot of work when splitting them, unless we
> > >>> abandon the Jenkins plugin and use a sequence of carefully crafted
> > >>> shell commands.
> > >>>
> > >>> Kenn
> > >>>
> > >>> [1]
> > >>> https://github.com/apache/beam/blob/master/.test-infra/jenkins/PreCommit_Pipeline.groovy
> > >>
> > >>
> >
> >
>
