Also throwing my hat in for two of my PRs that should be ready and just need final reviews/approval:

Removing shuffles from deallocated executors using the shuffle service: https://github.com/apache/spark/pull/35085. This has been requested for several years across many issues.

Configurable memory overhead factor: https://github.com/apache/spark/pull/35504

Adam

On Wed, Mar 16, 2022 at 8:53 AM Wenchen Fan <cloud0...@gmail.com> wrote:

> +1 to define an allowlist of features that we want to backport to branch
> 3.3. I also have a few in my mind
> complex type support in vectorized parquet reader:
> https://github.com/apache/spark/pull/34659
> refine the DS v2 filter API for JDBC v2:
> https://github.com/apache/spark/pull/35768
> a few new SQL functions that have been in development for a while:
> to_char, split_part, percentile_disc, try_sum, etc.
>
> On Wed, Mar 16, 2022 at 2:41 PM Maxim Gekk
> <maxim.g...@databricks.com.invalid> wrote:
>
>> Hi All,
>>
>> I have created the branch for Spark 3.3:
>> https://github.com/apache/spark/commits/branch-3.3
>>
>> Please, backport important fixes to it, and if you have some doubts, ping
>> me in the PR. Regarding new features, we are still building the allow list
>> for branch-3.3.
>>
>> Best regards,
>> Max Gekk
>>
>>
>> On Wed, Mar 16, 2022 at 5:51 AM Dongjoon Hyun <dongjoon.h...@gmail.com>
>> wrote:
>>
>>> Yes, I agree with you for your whitelist approach for backporting. :)
>>> Thank you for summarizing.
>>>
>>> Thanks,
>>> Dongjoon.
>>>
>>>
>>> On Tue, Mar 15, 2022 at 4:20 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>>
>>>> I think I finally got your point. What you want to keep unchanged is
>>>> the branch cut date of Spark 3.3. Today? or this Friday? This is not a big
>>>> deal.
>>>>
>>>> My major concern is whether we should keep merging the feature work or
>>>> the dependency upgrade after the branch cut. To make our release time more
>>>> predictable, I am suggesting we should finalize the exception PR list
>>>> first, instead of merging them in an ad hoc way. In the past, we spent a
>>>> lot of time on the revert of the PRs that were merged after the branch cut.
>>>> I hope we can minimize unnecessary arguments in this release. Do you agree,
>>>> Dongjoon?
>>>>
>>>>
>>>>
>>>> On Tue, Mar 15, 2022 at 3:55 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>>
>>>>> That is not totally fine, Xiao. It sounds like you are asking a change
>>>>> of plan without a proper reason.
>>>>>
>>>>> Although we cut the branch Today according to our plan, you still can
>>>>> collect the list and make a list of exceptions. I'm not blocking what you
>>>>> want to do.
>>>>>
>>>>> Please let the community start to ramp down as we agreed before.
>>>>>
>>>>> Dongjoon
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Mar 15, 2022 at 3:07 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>>>>
>>>>>> Please do not get me wrong. If we don't cut a branch, we are allowing
>>>>>> all patches to land Apache Spark 3.3. That is totally fine. After we cut
>>>>>> the branch, we should avoid merging the feature work. In the next three
>>>>>> days, let us collect the actively developed PRs that we want to make an
>>>>>> exception (i.e., merged to 3.3 after the upcoming branch cut). Does that
>>>>>> make sense?
>>>>>>
>>>>>> On Tue, Mar 15, 2022 at 2:54 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>>>>
>>>>>>> Xiao. You are working against what you are saying.
>>>>>>> If you don't cut a branch, it means you are allowing all patches to
>>>>>>> land Apache Spark 3.3. No?
>>>>>>>
>>>>>>> > we need to avoid backporting the feature work that are not being well discussed.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 15, 2022 at 12:12 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Cutting the branch is simple, but we need to avoid backporting the
>>>>>>>> feature work that are not being well discussed. Not all the members are
>>>>>>>> actively following the dev list. I think we should wait 3 more days for
>>>>>>>> collecting the PR list before cutting the branch.
>>>>>>>>
>>>>>>>> BTW, there are very few 3.4-only feature work that will be affected.
>>>>>>>>
>>>>>>>> Xiao
>>>>>>>>
>>>>>>>> On Tue, Mar 15, 2022 at 11:49 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi, Max, Chao, Xiao, Holden and all.
>>>>>>>>>
>>>>>>>>> I have a different idea.
>>>>>>>>>
>>>>>>>>> Given the situation and small patch list, I don't think we need to
>>>>>>>>> postpone the branch cut for those patches. It's easier to cut a
>>>>>>>>> branch-3.3 and allow backporting.
>>>>>>>>>
>>>>>>>>> As of today, we already have an obvious Apache Spark 3.4 patch in
>>>>>>>>> the branch together. This situation only becomes worse and worse
>>>>>>>>> because there is no way to block the other patches from landing
>>>>>>>>> unintentionally if we don't cut a branch.
>>>>>>>>>
>>>>>>>>> [SPARK-38335][SQL] Implement parser support for DEFAULT column values
>>>>>>>>>
>>>>>>>>> Let's cut `branch-3.3` Today for Apache Spark 3.3.0 preparation.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Dongjoon.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Mar 15, 2022 at 10:17 AM Chao Sun <sunc...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> Cool, thanks for clarifying!
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 15, 2022 at 10:11 AM Xiao Li <gatorsm...@gmail.com> wrote:
>>>>>>>>>> >>
>>>>>>>>>> >> For the following list:
>>>>>>>>>> >> #35789 [SPARK-32268][SQL] Row-level Runtime Filtering
>>>>>>>>>> >> #34659 [SPARK-34863][SQL] Support complex types for Parquet vectorized reader
>>>>>>>>>> >> #35848 [SPARK-38548][SQL] New SQL function: try_sum
>>>>>>>>>> >> Do you mean we should include them, or exclude them from 3.3?
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> > If possible, I hope these features can be shipped with Spark 3.3.
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> > On Tue, Mar 15, 2022 at 10:06 AM Chao Sun <sunc...@apache.org> wrote:
>>>>>>>>>> >>
>>>>>>>>>> >> Hi Xiao,
>>>>>>>>>> >>
>>>>>>>>>> >> For the following list:
>>>>>>>>>> >>
>>>>>>>>>> >> #35789 [SPARK-32268][SQL] Row-level Runtime Filtering
>>>>>>>>>> >> #34659 [SPARK-34863][SQL] Support complex types for Parquet vectorized reader
>>>>>>>>>> >> #35848 [SPARK-38548][SQL] New SQL function: try_sum
>>>>>>>>>> >>
>>>>>>>>>> >> Do you mean we should include them, or exclude them from 3.3?
>>>>>>>>>> >>
>>>>>>>>>> >> Thanks,
>>>>>>>>>> >> Chao
>>>>>>>>>> >>
>>>>>>>>>> >> On Tue, Mar 15, 2022 at 9:56 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>>>>>>>> >> >
>>>>>>>>>> >> > The following was tested and merged a few minutes ago. So, we can remove it from the list.
>>>>>>>>>> >> >
>>>>>>>>>> >> > #35819 [SPARK-38524][SPARK-38553][K8S] Bump Volcano to v1.5.1
>>>>>>>>>> >> >
>>>>>>>>>> >> > Thanks,
>>>>>>>>>> >> > Dongjoon.
>>>>>>>>>> >> >
>>>>>>>>>> >> > On Tue, Mar 15, 2022 at 9:48 AM Xiao Li <gatorsm...@gmail.com> wrote:
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> Let me clarify my above suggestion. Maybe we can wait 3 more days to collect the list of actively developed PRs that we want to merge to 3.3 after the branch cut?
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> Please do not rush to merge the PRs that are not fully reviewed. We can cut the branch this Friday and continue merging the PRs that have been discussed in this thread. Does that make sense?
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> Xiao
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> On Tue, Mar 15, 2022 at 9:10 AM Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>> May I suggest we push out one week (22nd) just to give everyone a bit of breathing space? Rushed software development more often results in bugs.
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>> On Tue, Mar 15, 2022 at 6:23 AM Yikun Jiang <yikunk...@gmail.com> wrote:
>>>>>>>>>> >> >>>>
>>>>>>>>>> >> >>>> > To make our release time more predictable, let us collect the PRs and wait three more days before the branch cut?
>>>>>>>>>> >> >>>>
>>>>>>>>>> >> >>>> For SPIP: Support Customized Kubernetes Schedulers:
>>>>>>>>>> >> >>>> #35819 [SPARK-38524][SPARK-38553][K8S] Bump Volcano to v1.5.1
>>>>>>>>>> >> >>>>
>>>>>>>>>> >> >>>> Three more days are OK for this from my view.
>>>>>>>>>> >> >>>>
>>>>>>>>>> >> >>>> Regards,
>>>>>>>>>> >> >>>> Yikun
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>> --
>>>>>>>>>> >> >>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>>> >> >>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>>>>>>>>> >> >>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>>>>>
>>>>>>>>>

-- Adam Binford