Re: [Apache Spark Jenkins] build system shutting down Dec 23th, 2021
> Will you be nuking all the Jenkins-related code in the repo after the 23rd?

probably not right away... but soon after jenkins is shut down. bits of the docs and spark website will need to be updated as well.

shane
--
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
Re: Time for Spark 3.2.1?
+1 for new releases.

Dongjoon.

On Mon, Dec 6, 2021 at 8:51 PM Wenchen Fan wrote:

> +1 to make new maintenance releases for all 3.x branches.
>
> On Tue, Dec 7, 2021 at 8:57 AM Sean Owen wrote:
>
>> Always fine by me if someone wants to roll a release.
>>
>> It's been ~6 months since the last 3.0.x and 3.1.x releases, too; a new release of those wouldn't hurt either, if any of our release managers have the time or inclination. 3.0.x is reaching unofficial end-of-life around now anyway.
>>
>> On Mon, Dec 6, 2021 at 6:55 PM Hyukjin Kwon wrote:
>>
>>> Hi all,
>>>
>>> It's been two months since Spark 3.2.0 release, and we have resolved many bug fixes and regressions. What do you guys think about rolling Spark 3.2.1 release?
>>>
>>> cc @huaxin gao FYI who I happened to overhear that is interested in rolling the maintenance release :-).
Re: Time for Spark 3.2.1?
Oh BTW, I realised that it's a holiday season soon this month including Christmas and new year.
Shall we maybe start rolling the release around next January? I would leave it to @huaxin gao :-).

On Wed, 8 Dec 2021 at 06:19, Dongjoon Hyun wrote:

> +1 for new releases.
>
> Dongjoon.
>
> On Mon, Dec 6, 2021 at 8:51 PM Wenchen Fan wrote:
>
>> +1 to make new maintenance releases for all 3.x branches.
>>
>> On Tue, Dec 7, 2021 at 8:57 AM Sean Owen wrote:
>>
>>> Always fine by me if someone wants to roll a release.
>>>
>>> It's been ~6 months since the last 3.0.x and 3.1.x releases, too; a new release of those wouldn't hurt either, if any of our release managers have the time or inclination. 3.0.x is reaching unofficial end-of-life around now anyway.
>>>
>>> On Mon, Dec 6, 2021 at 6:55 PM Hyukjin Kwon wrote:
>>>
>>>> Hi all,
>>>>
>>>> It's been two months since Spark 3.2.0 release, and we have resolved many bug fixes and regressions. What do you guys think about rolling Spark 3.2.1 release?
>>>>
>>>> cc @huaxin gao FYI who I happened to overhear that is interested in rolling the maintenance release :-).
Re: Time for Spark 3.2.1?
I prefer to start rolling the release in January if there is no need to publish it sooner :)

On Tue, Dec 7, 2021 at 3:59 PM Hyukjin Kwon wrote:

> Oh BTW, I realised that it's a holiday season soon this month including Christmas and new year.
> Shall we maybe start rolling the release around next January? I would leave it to @huaxin gao :-).
>
> On Wed, 8 Dec 2021 at 06:19, Dongjoon Hyun wrote:
>
>> +1 for new releases.
>>
>> Dongjoon.
>>
>> On Mon, Dec 6, 2021 at 8:51 PM Wenchen Fan wrote:
>>
>>> +1 to make new maintenance releases for all 3.x branches.
>>>
>>> On Tue, Dec 7, 2021 at 8:57 AM Sean Owen wrote:
>>>
>>>> Always fine by me if someone wants to roll a release.
>>>>
>>>> It's been ~6 months since the last 3.0.x and 3.1.x releases, too; a new release of those wouldn't hurt either, if any of our release managers have the time or inclination. 3.0.x is reaching unofficial end-of-life around now anyway.
>>>>
>>>> On Mon, Dec 6, 2021 at 6:55 PM Hyukjin Kwon wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> It's been two months since Spark 3.2.0 release, and we have resolved many bug fixes and regressions. What do you guys think about rolling Spark 3.2.1 release?
>>>>>
>>>>> cc @huaxin gao FYI who I happened to overhear that is interested in rolling the maintenance release :-).
Re: Time for Spark 3.2.1?
SGTM!

On Wed, 8 Dec 2021 at 09:07, huaxin gao wrote:

> I prefer to start rolling the release in January if there is no need to publish it sooner :)
>
> On Tue, Dec 7, 2021 at 3:59 PM Hyukjin Kwon wrote:
>
>> Oh BTW, I realised that it's a holiday season soon this month including Christmas and new year.
>> Shall we maybe start rolling the release around next January? I would leave it to @huaxin gao :-).
>>
>> On Wed, 8 Dec 2021 at 06:19, Dongjoon Hyun wrote:
>>
>>> +1 for new releases.
>>>
>>> Dongjoon.
>>>
>>> On Mon, Dec 6, 2021 at 8:51 PM Wenchen Fan wrote:
>>>
>>>> +1 to make new maintenance releases for all 3.x branches.
>>>>
>>>> On Tue, Dec 7, 2021 at 8:57 AM Sean Owen wrote:
>>>>
>>>>> Always fine by me if someone wants to roll a release.
>>>>>
>>>>> It's been ~6 months since the last 3.0.x and 3.1.x releases, too; a new release of those wouldn't hurt either, if any of our release managers have the time or inclination. 3.0.x is reaching unofficial end-of-life around now anyway.
>>>>>
>>>>> On Mon, Dec 6, 2021 at 6:55 PM Hyukjin Kwon wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> It's been two months since Spark 3.2.0 release, and we have resolved many bug fixes and regressions. What do you guys think about rolling Spark 3.2.1 release?
>>>>>>
>>>>>> cc @huaxin gao FYI who I happened to overhear that is interested in rolling the maintenance release :-).
Re: [Apache Spark Jenkins] build system shutting down Dec 23th, 2021
created an issue to track stuff:

https://issues.apache.org/jira/browse/SPARK-37571

On Tue, Dec 7, 2021 at 8:25 AM shane knapp ☠ wrote:

>> Will you be nuking all the Jenkins-related code in the repo after the 23rd?
>
> probably not right away... but soon after jenkins is shut down. bits of the docs and spark website will need to be updated as well.
>
> shane
> --
> Shane Knapp
> Computer Guy / Voice of Reason
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu

--
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
Re: Time for Spark 3.2.1?
+1 for new maintenance releases for all 3.x branches as well.

On Wed, Dec 8, 2021 at 8:19 AM Hyukjin Kwon wrote:

> SGTM!
>
> On Wed, 8 Dec 2021 at 09:07, huaxin gao wrote:
>
>> I prefer to start rolling the release in January if there is no need to publish it sooner :)
>>
>> On Tue, Dec 7, 2021 at 3:59 PM Hyukjin Kwon wrote:
>>
>>> Oh BTW, I realised that it's a holiday season soon this month including Christmas and new year.
>>> Shall we maybe start rolling the release around next January? I would leave it to @huaxin gao :-).
>>>
>>> On Wed, 8 Dec 2021 at 06:19, Dongjoon Hyun wrote:
>>>
>>>> +1 for new releases.
>>>>
>>>> Dongjoon.
>>>>
>>>> On Mon, Dec 6, 2021 at 8:51 PM Wenchen Fan wrote:
>>>>
>>>>> +1 to make new maintenance releases for all 3.x branches.
>>>>>
>>>>> On Tue, Dec 7, 2021 at 8:57 AM Sean Owen wrote:
>>>>>
>>>>>> Always fine by me if someone wants to roll a release.
>>>>>>
>>>>>> It's been ~6 months since the last 3.0.x and 3.1.x releases, too; a new release of those wouldn't hurt either, if any of our release managers have the time or inclination. 3.0.x is reaching unofficial end-of-life around now anyway.
>>>>>>
>>>>>> On Mon, Dec 6, 2021 at 6:55 PM Hyukjin Kwon wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> It's been two months since Spark 3.2.0 release, and we have resolved many bug fixes and regressions. What do you guys think about rolling Spark 3.2.1 release?
>>>>>>>
>>>>>>> cc @huaxin gao FYI who I happened to overhear that is interested in rolling the maintenance release :-).
Re: [Apache Spark Jenkins] build system shutting down Dec 23th, 2021
Thanks for the work, Shane!

On Wed, Dec 8, 2021 at 9:19 AM shane knapp ☠ wrote:

> created an issue to track stuff:
>
> https://issues.apache.org/jira/browse/SPARK-37571
>
> On Tue, Dec 7, 2021 at 8:25 AM shane knapp ☠ wrote:
>
>>> Will you be nuking all the Jenkins-related code in the repo after the 23rd?
>>
>> probably not right away... but soon after jenkins is shut down. bits of the docs and spark website will need to be updated as well.
>>
>> shane
>> --
>> Shane Knapp
>> Computer Guy / Voice of Reason
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>
> --
> Shane Knapp
> Computer Guy / Voice of Reason
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
Re: Time for Spark 3.2.1?
+1 for maintenance release, and also +1 for doing this in Jan !

Thanks,
Mridul

On Tue, Dec 7, 2021 at 11:41 PM Gengliang Wang wrote:

> +1 for new maintenance releases for all 3.x branches as well.
>
> On Wed, Dec 8, 2021 at 8:19 AM Hyukjin Kwon wrote:
>
>> SGTM!
>>
>> On Wed, 8 Dec 2021 at 09:07, huaxin gao wrote:
>>
>>> I prefer to start rolling the release in January if there is no need to publish it sooner :)
>>>
>>> On Tue, Dec 7, 2021 at 3:59 PM Hyukjin Kwon wrote:
>>>
>>>> Oh BTW, I realised that it's a holiday season soon this month including Christmas and new year.
>>>> Shall we maybe start rolling the release around next January? I would leave it to @huaxin gao :-).
>>>>
>>>> On Wed, 8 Dec 2021 at 06:19, Dongjoon Hyun wrote:
>>>>
>>>>> +1 for new releases.
>>>>>
>>>>> Dongjoon.
>>>>>
>>>>> On Mon, Dec 6, 2021 at 8:51 PM Wenchen Fan wrote:
>>>>>
>>>>>> +1 to make new maintenance releases for all 3.x branches.
>>>>>>
>>>>>> On Tue, Dec 7, 2021 at 8:57 AM Sean Owen wrote:
>>>>>>
>>>>>>> Always fine by me if someone wants to roll a release.
>>>>>>>
>>>>>>> It's been ~6 months since the last 3.0.x and 3.1.x releases, too; a new release of those wouldn't hurt either, if any of our release managers have the time or inclination. 3.0.x is reaching unofficial end-of-life around now anyway.
>>>>>>>
>>>>>>> On Mon, Dec 6, 2021 at 6:55 PM Hyukjin Kwon wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> It's been two months since Spark 3.2.0 release, and we have resolved many bug fixes and regressions. What do you guys think about rolling Spark 3.2.1 release?
>>>>>>>>
>>>>>>>> cc @huaxin gao FYI who I happened to overhear that is interested in rolling the maintenance release :-).
Re: Time for Spark 3.2.1?
+1 for both releases and the time!

On Wed, Dec 8, 2021 at 3:46 PM Mridul Muralidharan wrote:

> +1 for maintenance release, and also +1 for doing this in Jan !
>
> Thanks,
> Mridul
>
> On Tue, Dec 7, 2021 at 11:41 PM Gengliang Wang wrote:
>
>> +1 for new maintenance releases for all 3.x branches as well.
>>
>> On Wed, Dec 8, 2021 at 8:19 AM Hyukjin Kwon wrote:
>>
>>> SGTM!
>>>
>>> On Wed, 8 Dec 2021 at 09:07, huaxin gao wrote:
>>>
>>>> I prefer to start rolling the release in January if there is no need to publish it sooner :)
>>>>
>>>> On Tue, Dec 7, 2021 at 3:59 PM Hyukjin Kwon wrote:
>>>>
>>>>> Oh BTW, I realised that it's a holiday season soon this month including Christmas and new year.
>>>>> Shall we maybe start rolling the release around next January? I would leave it to @huaxin gao :-).
>>>>>
>>>>> On Wed, 8 Dec 2021 at 06:19, Dongjoon Hyun wrote:
>>>>>
>>>>>> +1 for new releases.
>>>>>>
>>>>>> Dongjoon.
>>>>>>
>>>>>> On Mon, Dec 6, 2021 at 8:51 PM Wenchen Fan wrote:
>>>>>>
>>>>>>> +1 to make new maintenance releases for all 3.x branches.
>>>>>>>
>>>>>>> On Tue, Dec 7, 2021 at 8:57 AM Sean Owen wrote:
>>>>>>>
>>>>>>>> Always fine by me if someone wants to roll a release.
>>>>>>>>
>>>>>>>> It's been ~6 months since the last 3.0.x and 3.1.x releases, too; a new release of those wouldn't hurt either, if any of our release managers have the time or inclination. 3.0.x is reaching unofficial end-of-life around now anyway.
>>>>>>>>
>>>>>>>> On Mon, Dec 6, 2021 at 6:55 PM Hyukjin Kwon wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> It's been two months since Spark 3.2.0 release, and we have resolved many bug fixes and regressions. What do you guys think about rolling Spark 3.2.1 release?
>>>>>>>>>
>>>>>>>>> cc @huaxin gao FYI who I happened to overhear that is interested in rolling the maintenance release :-).
[Proposal] Deprecate Trigger.Once and replace with Trigger.AvailableNow
Hi dev,

I would like to hear your thoughts on deprecating Trigger.Once and replacing it with Trigger.AvailableNow [1] in Structured Streaming.

Rationale:

Trigger.Once is expected to read all data that has become available since the last trigger and process it. This holds when the last run terminated gracefully, but streaming queries do not always terminate gracefully. The last run may have written the offset (WAL) for a new batch before terminating; in that case a new run of Trigger.Once processes only the data planned for that latest unfinished batch and does not process newer data.

From the users' point of view the behavior is not deterministic, because end users cannot know whether the last run wrote the offset unless they inspect the query's checkpoint themselves.

Trigger.AvailableNow was introduced to solve the scalability issue of Trigger.Once, but it also tries to process all data available at the point in time it is triggered, so it consistently delivers the behavior expected of Trigger.Once.

Proposed Plan:

- Deprecate Trigger.Once in Apache Spark 3.3
- Add guidance on migrating to Trigger.AvailableNow to the migration guide
- Replace all usages of Trigger.Once with Trigger.AvailableNow, except in the test cases for Trigger.Once itself

Please review the proposal and share your thoughts.

Thanks!
Jungtaek Lim

1. https://issues.apache.org/jira/browse/SPARK-36533
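
To make the rationale above concrete, here is a toy Python model of the two triggers. It is only a sketch under stated assumptions: `Checkpoint`, `run_once`, and `run_available_now` are hypothetical names invented for illustration, offsets are plain integers, and none of this mirrors Spark's actual checkpoint layout or engine code. It only reproduces the behavior described in the proposal: after a non-graceful termination, a single-batch run replays the logged batch and stops, while an available-now style run continues until the data available at start is consumed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Batch = Tuple[int, int]  # (start_offset, end_offset), end exclusive


@dataclass
class Checkpoint:
    """Toy stand-in for a query checkpoint; not Spark's on-disk layout."""
    committed_end: int = 0            # everything below this offset is processed
    logged_end: Optional[int] = None  # end offset logged (WAL) for an unfinished batch


def run_once(cp: Checkpoint, available: int) -> List[Batch]:
    """Toy Trigger.Once: run exactly one batch, then stop."""
    # An unfinished batch in the log wins; newer data is left for a later run.
    end = cp.logged_end if cp.logged_end is not None else available
    batch = (cp.committed_end, end)
    cp.committed_end, cp.logged_end = end, None
    return [batch]


def run_available_now(cp: Checkpoint, available: int) -> List[Batch]:
    """Toy Trigger.AvailableNow: run batches until the data available at start is processed."""
    batches: List[Batch] = []
    if cp.logged_end is not None:
        # Finish the batch left over from the previous run first.
        batches.append((cp.committed_end, cp.logged_end))
        cp.committed_end, cp.logged_end = cp.logged_end, None
    if cp.committed_end < available:
        # Then process everything else that was available when the run started.
        batches.append((cp.committed_end, available))
        cp.committed_end = available
    return batches


# Previous run logged batch [10, 15) and died; the source now holds 25 records.
print(run_once(Checkpoint(10, 15), available=25))           # [(10, 15)]
print(run_available_now(Checkpoint(10, 15), available=25))  # [(10, 15), (15, 25)]
```

In this model the single-batch run leaves offsets 15 to 25 unprocessed whenever a batch was logged but not finished, which is the non-determinism described in the rationale; with a clean checkpoint (`logged_end=None`) both functions process the same range.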