Thank you Zoltan for all this work. I see many PRs are merged based on the new workflow already. The old workflow generates many reports like ASF license/findbugs/checkstyle etc. I don't see these in the new Github PR workflow. I am concerned the codebase is going to suffer from lack of these reports very quickly. Do these checks still happen but are not visible?
On Tue, Jun 2, 2020 at 4:41 AM Zoltan Haindrich <k...@rxd.hu> wrote: > Hello, > > I would like to note that you may login to the jenkins instance - to > start/kill builds (or create new jobs). > I've configured github oauth - but since team membership can't be queried > from the "apache organization" - it's harder to configure all "hive > committers". > However...I think I've made it available for most of us - if you can't > start builds/etc just let me know your github user and I'll add it. > > cheers, > Zoltan > > > > On 5/29/20 2:32 PM, Zoltan Haindrich wrote: > > Hey all! > > > > The patch is now in master - so every new PR or a push on it will > trigger a new run. > > > > Please decide which one would you like to use - open a PR to see the new > one work...or upload a patch file to the jira - but please don't do both; > because in that case 2 > > execution will happen. > > > > The job execution time(2-4 hours) of a single run is a bit higher than > the usual on the ptest server - this is mostly to increase throughput. > > > > The patch also disabled a set of tests; I will send the full list of > skipped tests shortly. > > > > cheers, > > Zoltan > > > > > > On 5/27/20 1:50 PM, Zoltan Haindrich wrote: > >> Hello all! > >> > >> The new stuff is ready to be switched on-to. It needs to be merged into > master - and after that anyone who opens a PR will get a run by the new > HiveQA infra. > >> I propose to run the 2 systems side-by-side for some time - the regular > master builds will start; and we will see how frequently that is polluted > by flaky tests. > >> > >> Note that the current patch also disables around ~25 more tests to > increase stability - to get a better overview about the disabled tests I > think the "direction of the > >> information flow" should be altered; what I mean by that is: instead of > just throwing in a jira for "disable test x" and opening a new one like > "fix test x"; only open > >> the latter and place the jira reference into the ignore message; > meanwhile also add a regular report about the actually disabled tests - so > people who do know about the > >> importance of a particular test can get involved. > >> > >> Note: the builds.apache.org instance will be shutdown somewhere in the > future as well...but I think the new one is a good-enough alternative to > not have to migrate the > >> Hive-precommit job over to https://ci-hadoop.apache.org/. > >> > >> http://34.66.156.144:8080/job/hive-precommit/job/PR-948/5/ > >> https://issues.apache.org/jira/browse/HIVE-22942 > >> https://github.com/apache/hive/pull/948/files > >> > >> cheers, > >> Zoltan > >> > >> On 5/18/20 1:42 PM, Zoltan Haindrich wrote: > >>> Hey! > >>> > >>> On 5/18/20 11:51 AM, Zoltan Chovan wrote: > >>>> Thank you for all of your efforts, this looks really promising. With > moving > >>>> to github PRs, would that also mean that we move away from the > reviewboard > >>>> for code review? > >>> I didn't thinked about that. I think using github's review interface > will remain optional, because both review systems has there own strong > points - I wouldn't force > >>> anyone to use one over the other. (For some patches reviewboard is > much better; because it's able to track content moves a bit better than > github. - meanwhile github has > >>> a small feature that enables to mark files as reviewed) > >>> As a matter of fact we had sometimes patches on the jira's which never > had neither an RB or a PR to review them - having a PR there at least will > make it easier for > >>> reviewers to comment. > >>> > >>>> Also, what happens if a PR is updated? Will the tests run for both or > just > >>>> for the latest version? > >>> It will trigger a new build - if there is already a build in progress > that will prevent a new build from starting until it finishes...and there > is also a 5 builds/day > >>> limit; which might induce some wait. > >>> > >>> cheers, > >>> Zoltan > >>> > >>>> > >>>> Regards, > >>>> Zoltan > >>>> > >>>> On Sun, May 17, 2020 at 10:51 PM Zoltan Haindrich <k...@rxd.hu> > wrote: > >>>> > >>>>> Hello all! > >>>>> > >>>>> The proposed system have become more stable lately - and I think I've > >>>>> solved a few sources of flakiness. > >>>>> To be really usable I also wanted to add a way to dynamically > >>>>> enable/disable a set of tests (for example the replication tests > take ~7 > >>>>> hours to execute from the total of 24 > >>>>> hours - and they are also a bit unstable, so not running them when > not > >>>>> neccesary would be beneficial in multiple ways) - but to do this the > best > >>>>> would be to throw in > >>>>> junit5; unfortunately the current ptest installation uses maven 3.0.5 > >>>>> which doesn't like these kind of things - so instead of hacking a > fix for > >>>>> that ....I've removed it > >>>>> from the dev branch for now. > >>>>> > >>>>> I would like to propose to start an evaluation phase of the new test > >>>>> procedures(INFRA-20269) > >>>>> The process would look something like this: > >>>>> * someone opens a PR - the tests will be run on the changes > >>>>> * on every active branches the tests will run from time to time > >>>>> * this will produce a bunch of test runs on the master branch as > well ; > >>>>> which will show how well the tests behave on the master branch > without any > >>>>> patches > >>>>> * runs on branches (PRs or active development branches(eg:master)) > will be > >>>>> rate limited to 5 builds/day > >>>>> * at most ~4 builds at a time - to maximize resource usage > >>>>> * turnaround time for a build is right now 2 hours - which I feel > like a > >>>>> balanced choice between speed/response time > >>>>> > >>>>> Possible future benefits: > >>>>> * toggle features using github tags > >>>>> * optional testgroups (metastore/replication) tests > >>>>> * ability to run the metastore verification tests > >>>>> * possibility to add smoke tests > >>>>> > >>>>> To enable this I will have to finish the HIVE-22942 ticket - beyond > the > >>>>> new Jenkinsfile which defines the full logic; > >>>>> although I've sinked a lot of time into fixing all kind of flaky > tests I > >>>>> would would like to disable around ~25 tests. > >>>>> > >>>>> I also would like to propose a method to verify the stability of a > single > >>>>> test: run it a 100 times in series at the same place where the > precommit > >>>>> tests are running. > >>>>> This will put the bar high enough that only totally stable tests > could > >>>>> satisfy it (a 99% stable test has 36% chance to pass this without > being > >>>>> caught :D) > >>>>> After this will be in service it could be used to: validate that an > >>>>> existing test is unstable (before disabling it) - and then used > again to > >>>>> prove that it got fixed during > >>>>> re-enabling it. > >>>>> > >>>>> Please let me know what you think! > >>>>> > >>>>> cheers, > >>>>> Zoltan > >>>>> > >>>>> > >>>>> > >>>>> On 4/29/20 4:28 PM, Zoltan Haindrich wrote: > >>>>>> Hey All! > >>>>>> > >>>>>> I was planning to replace the ptest stuff with something less > complex > >>>>> for a while now - I see that we struggle a lot because of ptest is > more > >>>>> complicated than it should be... > >>>>>> It would be much better if it would be constructed from well made > >>>>> existing CI piece. - because of that I've started working on [1] a > few > >>>>> months ago. > >>>>>> > >>>>>> It has it's pros and cons...but it's not the same as the existing > ptest > >>>>> stuff. > >>>>>> I've collected some infos about how it compares against the > existing one > >>>>> - but it became too long so I've moved it into a google docs > document at > >>>>> [3]. > >>>>>> > >>>>>> It's not yet ready... I still have some remaining > problems/concerns/etc > >>>>>> * what do you think about changing to a github PR based workflow? > >>>>>> * it will not support at all things like "isolation" - so we will > have > >>>>> to make our tests work with eachother without bending the rules... > >>>>>> * I've tried to overcommit the cpu resources which creates a more > noisy > >>>>> environment for the actual tests - this squeezes out some new > problems > >>>>> which should be fixed before > >>>>>> this could be enabled. > >>>>>> * for every PR the first run is somewhat sub-optimal...there are > some > >>>>> reasons for this - the actually used resources are the same; but the > >>>>> overall execution time is not > >>>>>> optimal; I could accept this as a compromise because right now I > wait > >>>>>> 24 hours for a precommit run. > >>>>>> > >>>>>> It's deployed at [2] and anyone can start a testrun on it: > >>>>>> * merge my HIVE-22942-ptest-alt branch from [4] into your branch > >>>>>> * open a PR against my hive repo on github [5] > >>>>>> > >>>>>> cheers, > >>>>>> Zoltan > >>>>>> > >>>>>> > >>>>>> [1] https://issues.apache.org/jira/browse/HIVE-22942 > >>>>>> [2] http://34.66.156.144:8080/job/hive-precommit > >>>>>> [3] > >>>>> > https://docs.google.com/document/d/1dhL5B-eBvYNKEsNV3kE6RrkV5w-LtDgw5CtHV5pdoX4/edit?usp=sharing > >>>>>> [4] https://github.com/kgyrtkirk/hive/tree/HIVE-22942-ptest-alt > >>>>>> [5] https://github.com/kgyrtkirk/hive/ > >>>>> > >>>> >