Thank you again, Emil and Bjorn. FYI, SPARK-44678 has landed on branch-3.5 as follows.
https://github.com/apache/spark/pull/42345
[SPARK-44678][BUILD][3.5] Downgrade Hadoop to 3.3.4

Dongjoon.

On 2023/08/02 18:58:51 Bjørn Jørgensen wrote:
> @Dongjoon Hyun <dongjoon.h...@gmail.com> FYI
> [image: image.png]
>
> We better ask common-...@hadoop.apache.org.
>
> On Wed, Aug 2, 2023 at 18:03, Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>
> > Oh, I got it, Emil and Bjorn.
> >
> > Dongjoon.
> >
> > On Wed, Aug 2, 2023 at 12:32 AM Bjørn Jørgensen <bjornjorgen...@gmail.com>
> > wrote:
> >
> >> "*As far as I can tell this makes both 3.3.5 and 3.3.6 unusable with s3
> >> without providing an alternative committer code.*"
> >>
> >> https://github.com/apache/hadoop/pull/5706#issuecomment-1619927992
> >>
> >> On Wed, Aug 2, 2023 at 08:05, Emil Ejbyfeldt
> >> <eejbyfe...@liveintent.com.invalid> wrote:
> >>
> >>> > Apache Spark is not affected by HADOOP-18757 because it is not a part of
> >>> > both Apache Hadoop 3.3.5 and 3.3.6.
> >>>
> >>> I am not sure I follow what you are trying to say here. Is it that
> >>> the Jira says only 3.3.5 is affected? Here I think the Jira is
> >>> just incorrect. The Jira (and the PR with the fix) was created
> >>> before 3.3.6 was released, and I think the Jira has simply not been
> >>> updated to reflect the fact that 3.3.6 is also affected.
> >>>
> >>> > HADOOP-18757 seems to be merged just two weeks ago and there is no
> >>> > Apache Hadoop release with it, isn't it?
> >>>
> >>> That is correct, there is no Hadoop release containing the fix.
> >>> Therefore 3.3.6 would also be affected by the regression.
> >>>
> >>> Best,
> >>> Emil
> >>>
> >>> On 02/08/2023 07:51, Dongjoon Hyun wrote:
> >>> > It's still invalid information, Emil.
> >>> >
> >>> > Apache Spark is not affected by HADOOP-18757 because it is not a part of
> >>> > both Apache Hadoop 3.3.5 and 3.3.6.
> >>> >
> >>> > HADOOP-18757 seems to be merged just two weeks ago and there is no
> >>> > Apache Hadoop release with it, isn't it?
> >>> >
> >>> > Could you check your local branch once more, please?
> >>> >
> >>> > Dongjoon.
> >>> >
> >>> > On Tue, Aug 1, 2023 at 9:46 PM Emil Ejbyfeldt
> >>> > <eejbyfe...@liveintent.com> wrote:
> >>> >
> >>> > Hi,
> >>> >
> >>> > Yes, sorry about that, I seem to have messed up the link. It should
> >>> > have been https://issues.apache.org/jira/browse/HADOOP-18757
> >>> >
> >>> > Best,
> >>> > Emil
> >>> >
> >>> > On 01/08/2023 19:08, Dongjoon Hyun wrote:
> >>> > > Hi, Emil.
> >>> > >
> >>> > > HADOOP-18568 is still open and it seems to be never a part of the
> >>> > > Hadoop trunk branch.
> >>> > >
> >>> > > Do you mean another JIRA?
> >>> > >
> >>> > > Dongjoon.
> >>> > >
> >>> > > On Tue, Aug 1, 2023 at 2:59 AM Emil Ejbyfeldt
> >>> > > <eejbyfe...@liveintent.com.invalid> wrote:
> >>> > >
> >>> > > Hi,
> >>> > >
> >>> > > We previously ran some experiments on builds from the 3.5 branch and
> >>> > > noticed that Hadoop had a regression
> >>> > > (https://issues.apache.org/jira/browse/HADOOP-18568) in their s3a
> >>> > > committer affecting 3.3.5 and 3.3.6 (Spark 3.4 uses Hadoop 3.3.4). The
> >>> > > fix has been merged into Hadoop and will be part of the next release
> >>> > > of Hadoop.
> >>> > >
> >>> > > From our testing, the regression when writing data to S3 with a large
> >>> > > number of tasks is severe enough that we would need to revert to
> >>> > > Hadoop 3.3.4 in order to use the Spark 3.5 release.
> >>> > >
> >>> > > Since it only affects S3, I am not sure it warrants changes in Spark
> >>> > > (e.g. rolling back Hadoop to 3.3.4). But it is probably something
> >>> > > people testing the RC against S3 should be aware of.
> >>> > >
> >>> > > Best,
> >>> > > Emil
> >>> > >
> >>> > > On 29/07/2023 10:29, Yuanjian Li wrote:
> >>> > > > Hi everyone,
> >>> > > >
> >>> > > > Following the release timeline, I will cut the RC on *Tuesday, Aug
> >>> > > > 1st at 1 pm PST* as scheduled.
> >>> > > >
> >>> > > > Date            Event
> >>> > > > July 17th 2023  Code freeze. Release branch cut.
> >>> > > > Late July 2023  QA period. Focus on bug fixes, tests, stability and docs.
> >>> > > >                 Generally, no new features merged.
> >>> > > > August 2023     Release candidates (RC), voting, etc. until final
> >>> > > >                 release passes
> >>> > > >
> >>> > > > Best,
> >>> > > > Yuanjian
> >>> > >
> >>> > > ---------------------------------------------------------------------
> >>> > > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >>> > >
> >>> >
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >>>
> >>
> >> --
> >> Bjørn Jørgensen
> >> Vestre Aspehaug 4, 6010 Ålesund
> >> Norge
> >>
> >> +47 480 94 297
> >
>
> --
> Bjørn Jørgensen
> Vestre Aspehaug 4, 6010 Ålesund
> Norge
>
> +47 480 94 297
>

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org