So what's next? Shall we schedule Nexmark runs and add a BigQuery sink for the
Nexmark output?
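
A minimal sketch of the sink part, using BigQueryIO from the GCP IO module;
the project, dataset, table name, and schema fields below are made up for
illustration:

import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import java.util.Arrays;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.values.PCollection;

public class NexmarkResultSink {
  // Writes one row per Nexmark run: query name, running time, output size.
  public static void writeResults(PCollection<TableRow> results) {
    TableSchema schema = new TableSchema().setFields(Arrays.asList(
        new TableFieldSchema().setName("query").setType("STRING"),
        new TableFieldSchema().setName("runtime_ms").setType("INTEGER"),
        new TableFieldSchema().setName("output_size").setType("INTEGER")));

    results.apply("WriteNexmarkResults",
        BigQueryIO.writeTableRows()
            .to("my-project:nexmark.perf_results")  // hypothetical table
            .withSchema(schema)
            .withCreateDisposition(
                BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
            .withWriteDisposition(
                BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
  }
}

WRITE_APPEND keeps one row per run, so we can chart the history over time.
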
On Monday, March 12, 2018 at 10:30 +0100, Etienne Chauchot wrote:
> Thanks everyone for your comments and support.
>
> On Friday, March 9, 2018 at 21:28 +0000, Alan Myrvold wrote:
> > Great ideas. I want to see a daily signal for anything that could prevent a
> > release from happening, and precommits
> > that are fast and reliable for areas that are commonly broken by code
> > changes.
> >
> > We are now running the java quickstarts daily on a cron schedule, using
> > direct, dataflow, and local spark and flink
> > in the beam_PostRelease_NightlySnapshot job, see
> > https://github.com/apache/beam/blob/master/release/build.gradle
> > This should provide a good signal for the examples integration tests
> > against these runners.
> >
> > As Kenn noted, the java_maveninstall job also runs lots of tests. It would
> > be good to be clearer and more intentional about which tests run when, and
> > to consider implementing additional "always up" environments for use by
> > the tests.
> >
> > Having the Nexmark smoke tests run regularly and their results stored in a
> > database would really enhance our efforts, perhaps starting with the
> > DirectRunner for the performance tests.
> Yes
>
> >
> > What area would have the most immediate impact? Nexmark smoke tests?
> Yes, IMHO the Nexmark smoke tests would have a great return on investment. By
> just scheduling some of them (at first), we build deep confidence in the
> runners on realistic user pipelines. In the past, Nexmark has allowed us to
> discover performance regressions before a release and also to find bugs in
> some runners. But please note that Nexmark is currently limited in that last
> regard: it only detects failures when an exception is thrown. There is no
> check of the correctness of the output PCollection, because the aim was
> performance testing and there is no point in adding a slow correctness test.
> Nevertheless, if we store the output size (as I suggested in this thread), we
> can get a hint of a failure when the output size differs from the previously
> stored output sizes.
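>
> To make that hint concrete, here is a rough sketch of the check I have in
> mind; PerfStore and its lastOutputSizes method are hypothetical placeholders
> for whatever storage backend we pick:
>
> import java.util.List;
>
> public class OutputSizeCheck {
>   // SMOKE runs are deterministic for a given query and configuration, so
>   // any deviation from previously recorded output sizes hints at a bug
>   // rather than at a performance change.
>   public static boolean outputSizeLooksSuspicious(String query, long current) {
>     List<Long> previous = PerfStore.lastOutputSizes(query, 10);  // hypothetical
>     if (previous.isEmpty()) {
>       return false;  // nothing to compare against yet
>     }
>     return previous.stream().anyMatch(size -> size != current);
>   }
> }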
>
> Etienne
>
> >
> > On Fri, Mar 9, 2018 at 12:57 PM Kenneth Knowles <k...@google.com> wrote:
> > > On Fri, Mar 9, 2018 at 3:08 AM Etienne Chauchot <echauc...@apache.org>
> > > wrote:
> > > > Hi guys,
> > > >
> > > > I was looking at the various Jenkins jobs and I wanted to make a
> > > > proposal:
> > > >
> > > > - ValidatesRunner tests: currently run at PostCommit for all the
> > > > runners. I think this is the quickest way to see regressions, so let's
> > > > keep it that way.
> > > We've also toyed with precommit for runners where it is fast.
> > >
> > > > - Integration tests: AFAIK we only run the ones in the examples
> > > > module, and only on demand. What about running all the ITs (in
> > > > particular the IO ITs) as a daily cron job with the direct runner?
> > > > Please note that this will require some always-up backend
> > > > infrastructure.
> > > I like this idea. We actually run more, but in postcommit. You can see
> > > the goal here:
> > > https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PostCommit_Java_MavenInstall.groovy#L47
> > >
> > > There's no infrastructure set up that I see. It is only DirectRunner and
> > > DataflowRunner currently, as they are
> > > "always up". But so could be local Flink and Spark. Do the ITs spin up
> > > local versions of what they are connecting
> > > to?
> > >
> > > If we have adequate resources, I also think ValidatesRunner on a real
> > > cluster would add value, once we have cluster setup/teardown automation
> > > or "always up" clusters.
> > >
> > > > - Performance tests: what about running the Nexmark SMOKE test suite
> > > > in batch and streaming modes with all the runners on a daily basis,
> > > > and storing the running times in an RRD database (to catch performance
> > > > regressions)?
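> > > >
> > > > As an illustration, a minimal sketch of recording one running time
> > > > with rrd4j (assuming its classic RrdDef/RrdDb API; the file path,
> > > > step, and archive sizes are made up):
> > > >
> > > > import org.rrd4j.ConsolFun;
> > > > import org.rrd4j.DsType;
> > > > import org.rrd4j.core.RrdDb;
> > > > import org.rrd4j.core.RrdDef;
> > > > import org.rrd4j.core.Sample;
> > > > import org.rrd4j.core.Util;
> > > >
> > > > public class NexmarkRrdWriter {
> > > >   public static void recordRuntime(String rrdPath, double runtimeMs)
> > > >       throws Exception {
> > > >     // One GAUGE datasource sampled daily, averaged, kept for a year.
> > > >     RrdDef def = new RrdDef(rrdPath, 86400);
> > > >     def.addDatasource("runtime_ms", DsType.GAUGE, 2 * 86400, 0, Double.NaN);
> > > >     def.addArchive(ConsolFun.AVERAGE, 0.5, 1, 365);
> > > >
> > > >     // Creates the RRD file; a real job would create it once and then
> > > >     // open the existing file for later samples.
> > > >     RrdDb db = new RrdDb(def);
> > > >     try {
> > > >       Sample sample = db.createSample();
> > > >       sample.setTime(Util.getTime());
> > > >       sample.setValue("runtime_ms", runtimeMs);
> > > >       sample.update();
> > > >     } finally {
> > > >       db.close();
> > > >     }
> > > >   }
> > > > }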
> > > I like this idea, too. I think we could do DirectRunner (and probably
> > > local Flink) as postcommit without being too
> > > expensive.
> > >
> > > Kenn
> > >
> > >
> > > > Please note that not all the queries run on all the runners in all
> > > > the modes right now. Also, we have some streaming pipeline termination
> > > > issues (see https://issues.apache.org/jira/browse/BEAM-2847).
> > > >
> > > > I know that Stephen Sisk used to work on these topics. I also talked
> > > > to the folks from Polidea, but as I understood it, they mainly run
> > > > integration tests on the Dataflow runner.
> > > >
> > > > WDYT?
> > > >
> > > > Etienne
> > > >