Hi team,

Thanks Vihang for looking into this. I have commented on the JIRA you created.

Just to bring this to everyone's notice: there have been a couple of
pushes to branch-3, which have led to 5 new test failures. The
failures are in orc_merge1, orc_merge2, orc_merge3, orc_merge4 and orc_merge10.
These tests did not fail before. I would sincerely urge the community to
raise a PR against branch-3 so that the Jenkins pipeline can run, and only then
merge changes to branch-3. We had 2900+ failures when we started 2 months back
and had brought that down to fewer than 15; these new failures have pushed
us back in this effort.

I would like to thank everyone who has participated in this effort and brought
it to this stage. Also, it would be a great help if the contributors could take
ownership of these new test failures and fix them.

Thanks,
Aman.
________________________________
From: vihang karajgaonkar <vihan...@apache.org>
Sent: Friday, February 17, 2023 6:10 AM
To: dev@hive.apache.org <dev@hive.apache.org>
Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability

Hi Aman,

I created https://issues.apache.org/jira/browse/HIVE-27087 to look into
TestMiniSparkOnYarnCliDriver failures. I have a working theory of what
might be going on there. I am still investigating what is the right way to
fix it though.

Thanks,
Vihang

On Fri, Feb 10, 2023 at 10:26 AM Aman Raj <raja...@microsoft.com.invalid>
wrote:

> Hi Vihang,
>
> Yes, the tests are failing locally as well with the same issue.
>
> Thanks,
> Aman.
>
> ________________________________
> From: Vihang Karajgaonkar <vihang.karajgaon...@databricks.com.INVALID>
> Sent: Friday, February 10, 2023 11:22:15 PM
> To: dev@hive.apache.org <dev@hive.apache.org>
> Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
>
> Thanks a lot Stamatis for starting this thread. I really appreciate all the
> efforts to stabilize branch-3 to get it to a releasable state and I agree
> that we should get it to a green state before opening it for PRs not
> related to test failures. I can help with the effort as well.
>
> If we want to get the branch back to green state soon, have we considered
> disabling the tests which are clearly flaky (e.g., tests that pass on some
> builds and fail on others with no new code changes)? If we don't do that, we
> will keep playing whack-a-mole with those tests. I propose that we disable
> such tests and create tickets to unflake them separately. This
> will help us get back to a green state faster.
>
> Hi Aman,
> For TestMiniSparkOnYarnCliDriver failures, you probably should also look
> into the spark driver/application logs and see if there are infrastructure
> errors (e.g., OOMs). Are these tests failing when you run them locally?
>
> Thanks,
> Vihang
>
> On Tue, Feb 7, 2023 at 10:05 PM Aman Raj <raja...@microsoft.com.invalid>
> wrote:
>
> > +1,
> > Thanks Stamatis and Laszlo for helping with the test case fixes so far.
> >
> > Team,
> > I need help fixing the following test in Hive. I have tried different
> > approaches but have had no luck so far:
> > org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
> >
> > Issue :
> > PREHOOK: Input: default@src
> > PREHOOK: Output: default@src
> > Failed to monitor Job[-1] with exception
> > 'java.lang.IllegalStateException(Connection to remote Spark driver was
> > lost)' Last known state = SENT
> > Failed to execute spark task, with exception
> > 'java.lang.IllegalStateException(RPC channel is closed.)'
> > FAILED: Execution Error, return code 1 from
> > org.apache.hadoop.hive.ql.exec.spark.SparkTask. RPC channel is closed.
> >
> > History :
> > Initially the tests had failed with errors which I fixed in the following
> > task:
> > https://issues.apache.org/jira/browse/HIVE-26940
> >
> > Does anyone know what the issue is here? There are 6-7 failures because
> > of this test case. Link to the failed test cases for the stack trace:
> >
> > http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3949/2/tests/
> >
> > Thanks,
> > Aman.
> >
> > ________________________________
> > From: László Bodor <bodorlaszlo0...@gmail.com>
> > Sent: Tuesday, February 7, 2023 4:46 PM
> > To: dev@hive.apache.org <dev@hive.apache.org>
> > Subject: [EXTERNAL] Re: Branch-3 backports and build stability
> >
> > +1
> > also, if I merged something that I thought was for test stability (but
> > instead it was a feature), excuse me :)
> > for reference, the whole green test initiative is tracked under this
> > umbrella:
> >
> > https://issues.apache.org/jira/browse/HIVE-26836
> >
> > Stamatis Zampetakis <zabe...@gmail.com> wrote on Tue, Feb 7, 2023, 12:09:
> >
> > > Hi all,
> > >
> > > The build in branch-3 is not yet green; there are ~25 test failures. It is
> > > a common practice that we shouldn't push changes on top of a broken build
> > > unless they are addressing test failures.
> > >
> > > Some people (mainly Aman Raj, Chris Nauroth, and Laszlo Bodor) are working
> > > hard to stabilize the build for quite some time now. If you want to help
> > > out then start by reviewing, merging, and fixing things around test
> > > failures.
> > >
> > > It's not yet the time to bring new features, upgrades, bugs, etc., in
> > > branch-3. I would encourage committers to not approve such changes till we
> > > get back to a stable branch.
> > >
> > > Best,
> > > Stamatis
> > >
> >
>
