Is there a JIRA for this issue? I've just cut
https://github.com/apache/beam/pull/4093, since I don't think we actually
need them in some of those modules.

I believe the precommit is the only place where we actually need both, to
run the examples' integration tests with each runner (albeit currently in
local mode for most runners).
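For background, the coupling shows up because the examples pull each runner in
through a Maven profile; a rough sketch of what such a profile looks like in
examples/java (the artifact coordinates and Scala suffix here are illustrative,
not checked against the actual pom):

```xml
<!-- Illustrative sketch of a runner profile in examples/java/pom.xml.  -->
<!-- The artifactId and its _2.10 Scala suffix are assumptions; the     -->
<!-- point is that activating two such profiles at once pins both       -->
<!-- runners' (possibly conflicting) Scala versions into one build.     -->
<profile>
  <id>flink-runner</id>
  <dependencies>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-runners-flink_2.10</artifactId>
      <scope>runtime</scope>
    </dependency>
  </dependencies>
</profile>
```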

The only solutions I can think of are two executions/jobs or separate
modules that set things up explicitly. I believe the latter is probably
more robust to changes in the build, like so:

runners/flink/examples-integration-tests
    -> runners/flink
    -> examples/java

This module should then be flexible enough to do whatever it needs to do.
Our dependency graph won't be intuitive from the directory structure, so
maybe it should have a different home (maybe all ITs, by their nature,
should live alongside the other modules).
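A minimal sketch of what the pom for such a module could look like (the module
path, artifact names, and classifier are hypothetical, made up for
illustration):

```xml
<!-- Hypothetical runners/flink/examples-integration-tests/pom.xml.  -->
<!-- All names here are illustrative. The design point: the module   -->
<!-- depends on exactly one runner plus the examples, so no build    -->
<!-- ever needs two runner profiles active at the same time.         -->
<project>
  <artifactId>beam-runners-flink-examples-integration-tests</artifactId>
  <dependencies>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-runners-flink_2.11</artifactId>
      <scope>runtime</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-examples-java</artifactId>
      <classifier>tests</classifier>
      <scope>test</scope>
    </dependency>
  </dependencies>
  <!-- The ITs themselves would run via failsafe in the verify phase. -->
</project>
```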

On Tue, Nov 7, 2017 at 7:38 AM, Aljoscha Krettek <aljos...@apache.org>
wrote:

> I'd like to do it, yes, but I've been swamped lately with Flink work.
>
> Also, the situation with Jenkins/Maven (especially the Pipeline Job
> changes) makes it somewhat unclear how I should proceed, because I don't
> want to add expensive pre-commit hooks. The number of profiles that would
> need splitting is also quite high:
>
> ~/D/i/.test-infra (master|✔) $ ag flink-runner
> jenkins/job_beam_PreCommit_Java_MavenInstall.groovy
> 47:    '--activate-profiles release,jenkins-precommit,direct-runner,dataflow-runner,spark-runner,flink-runner,apex-runner',
>
> jenkins/job_beam_PreCommit_Python_MavenInstall.groovy
> 47:    --activate-profiles release,jenkins-precommit,direct-runner,dataflow-runner,spark-runner,flink-runner,apex-runner \
>
> jenkins/job_beam_Java_UnitTest.groovy
> 35:    'flink-runner',
>
> jenkins/job_beam_PreCommit_Go_MavenInstall.groovy
> 47:    --activate-profiles release,jenkins-precommit,direct-runner,dataflow-runner,spark-runner,flink-runner,apex-runner \
>
> jenkins/job_beam_Java_Build.groovy
> 51:    'flink-runner',
>
> jenkins/job_beam_Java_IntegrationTest.groovy
> 36:    'flink-runner',
> 48:    'flink-runner-integration-tests',
>
> These are all profiles where we build with several runner profiles active
> at the same time.
>
> Unfortunately, this is becoming somewhat pressing now, since Flink 1.4
> will drop support for Scala 2.10, meaning we have to update the Flink
> Runner to Scala 2.11 dependencies.
>
> > On 27. Oct 2017, at 09:52, Kenneth Knowles <k...@google.com.INVALID>
> > wrote:
> >
> > You are entirely correct in how you would pull this off - groovy files
> and
> > tweaking the profiles. Seeding is done daily, or also by commenting "Run
> > Seed Job" on a pull request. One thing to consider, in light of recent
> > conversations, is making the new jobs post-commit or by-request only, or
> > multi-step, in order to avoid running lots of extra tests, etc.
> >
> > Do you think you might have time to work on this goal of splitting apart
> > jobs that require splitting?
> >
> >
> > On Wed, Oct 11, 2017 at 2:08 AM, Aljoscha Krettek <aljos...@apache.org>
> > wrote:
> >
> >> I also like option 2 (allowing differing dependencies for runners)
> >> better. With the current situation this would mean splitting
> >> PreCommit_Java_MavenInstall (and possibly also
> >> PreCommit_Python_MavenInstall and PreCommit_Go_MavenInstall) into
> >> separate jobs. For my goals, splitting into one job for
> >> "direct-runner,dataflow-runner,spark-runner,apex-runner" and one for
> >> "flink-runner" would be enough, so we should probably go with that until
> >> we have the "custom make" solution.
> >>
> >> What do you think?
> >>
> >> @Jason For pulling this off I would copy the groovy files in .test-infra
> >> and change the --activate-profiles line, right? Are there still manual
> >> steps required for "re-seeding" the jobs?
> >>
> >>
> >>
> >>> On 9. Oct 2017, at 18:06, Kenneth Knowles <k...@google.com.INVALID>
> >>> wrote:
> >>>
> >>> +1 to the goal, and replying inline on details.
> >>>
> >>> On Mon, Oct 9, 2017 at 8:06 AM, Aljoscha Krettek <aljos...@apache.org>
> >>> wrote:
> >>>
> >>>>
> >>>> - We want to update the Flink dependencies to _2.11 dependencies
> >>>> because 2.10 is quite outdated
> >>>
> >>>> - This doesn't work well because some modules (examples, for example)
> >>>> depend on all Runners and at least the Spark Runner has _2.10
> >>>> dependencies
> >>>>
> >>>
> >>> Let's expedite making this possible, and circle back to getting the
> >>> build to an ideal state after unblocking this very reasonable change.
> >>>
> >>> It is not reasonable for any runner dependencies in Beam to be coupled,
> >>> certainly not the Scala version. We've been monolithic so far because it
> >>> is easier to manage, but it was never a long-term solution. It will mean
> >>> that the examples cannot have -P flink-runner and -P spark-runner at the
> >>> same time. But my position is that we should never expect to be able to
> >>> use two such profiles at the same time.
> >>>
> >>> Of course, if it is technically feasible to transition both runners (I
> >>> don't really know about Spark's coupling with Scala versions), that is
> >>> even easier and defers the larger issue for a bit.
> >>>
> >>>> I see two solutions for this:
> >>>> - Introducing a project-wide Scala version property
> >>>> - Allowing differing Scala versions for different runners, ensuring
> >>>> that we never have a situation where we have several Runners as a
> >>>> dependency. For the "Maven Install" pre-commit hook this could mean
> >>>> splitting it up per runner.
> >>>>
> >>>
> >>> I support the latter regardless. I want separate configurations for
> >>> separate runners, fully embracing the fact that they can diverge.
> >>>
> >>> We already intend to split up the build to be much more fine-grained.
> >>> The only reason we have a monolithic precommit is Maven's extraordinary
> >>> lack of support for any other way of building. We have essentially
> >>> started to build something akin to a particular invocation of "make" in
> >>> the form of interrelated Jenkins jobs to work around Maven's limitations
> >>> [1]. Until we can get that approach working at 100%, I am fine with
> >>> splitting builds aggressively. You will find that it is quite hard not
> >>> to duplicate a lot of work when splitting them, unless we abandon the
> >>> Jenkins plugin and use a sequence of carefully crafted shell commands.
> >>>
> >>> Kenn
> >>>
> >>> [1]
> >>> https://github.com/apache/beam/blob/master/.test-infra/jenkins/PreCommit_Pipeline.groovy
> >>
> >>
>
>
