Re: Spark 3.1 branch cut 4th Dec?
Hi, Xiao.

I agree.

> Merging the feature work after the branch cut should not be encouraged in
> general, although some committers did make some exceptions based on their
> own judgement. We should try to avoid merging the feature work after the
> branch cut.

So, the Apache Spark community already accepted your request for a delay (from early November to early December):

- https://github.com/apache/spark-website/commit/0cd0bdc80503882b4737db7e77cc8f9d17ec12ca

I don't think the branch cut should be delayed again. We don't need to wait two weeks after Hyukjin's email.

Given the delay, I'd strongly recommend cutting the branch on 1st December.

I'll create `branch-3.1` on 1st December if Hyukjin is busy, so that we can start to stabilize.

Again, it will not block you if you have an exceptional request.

However, it would be helpful for all of us if you made clear which features you are waiting for now.

We are creating Apache Spark together.

Bests,
Dongjoon.

On Thu, Nov 19, 2020 at 11:38 PM Xiao Li wrote:

> Correction:
>
> Merging the feature work after the branch cut should not be encouraged in
> general, although some committers did make some exceptions based on their
> own judgement. We should try to avoid merging the feature work after the
> branch cut.
>
> This email is a good reminder message. At least, we have two weeks
> ahead of the proposed branch cut date. I hope each feature owner might
> hurry up and try to finish it before the branch cut.
>
> Xiao
>
> On Thu, Nov 19, 2020 at 11:36 PM Xiao Li wrote:
>
>> We should try to merge the feature work after the branch cut. This should
>> not be encouraged in general, although some committers did make some
>> exceptions based on their own judgement.
>>
>> This email is a good reminder message. At least, we have two weeks
>> ahead of the proposed branch cut date. I hope each feature owner might
>> hurry up and try to finish it before the branch cut.
>>
>> Xiao
>>
>> On Thu, Nov 19, 2020 at 4:02 PM Dongjoon Hyun wrote:
>>
>>> Thank you for volunteering!
>>>
>>> Since the previous branch cuts were always soft code freezes, which
>>> still allowed committers to merge to the new branches for a while, I
>>> believe 1st December will be better for stabilization.
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Thu, Nov 19, 2020 at 3:50 PM Hyukjin Kwon wrote:
>>>
>>>> Hi all,
>>>>
>>>> I think we haven't yet decided the exact branch cut, code freeze and
>>>> release manager.
>>>>
>>>> As we planned in https://spark.apache.org/versioning-policy.html
>>>>
>>>> Early Dec 2020: Code freeze. Release branch cut
>>>>
>>>> Code freeze and branch cutting are coming.
>>>>
>>>> Therefore, we should finish any remaining work for Spark 3.1 and
>>>> switch to QA mode soon.
>>>> I think it's time to set these to keep it on track, and I would like to
>>>> volunteer to help drive this process.
>>>>
>>>> I am currently thinking of 4th Dec as the branch-cut date.
>>>>
>>>> Any thoughts?
>>>>
>>>> Thanks all.
Re: Spark 3.1 branch cut 4th Dec?
Hi, Dongjoon,

Thank you for your feedback. I think *Early December* does not mean we will cut the branch on Dec 1st. I do not think the difference between Dec 1st and Dec 4th is a big deal. Normally, it would be nice to leave enough buffer. Based on my understanding, this email is just a *proposal* and a *reminder*. In the past, we often got mixed feedback.

Anyway, we are collecting feedback from the whole community. Input from everyone else is welcome.

Thanks,

Xiao

On Fri, Nov 20, 2020 at 8:33 AM Dongjoon Hyun wrote:

> [...]
Re: Spark 3.1 branch cut 4th Dec?
https://github.com/apache/spark/pull/28026 is the major feature I am tracking. It is painful to keep two sets of CREATE TABLE DDLs with different behaviors. Based on what I have heard, this hurts usability for our SQL users. Unfortunately, this PR missed the Spark 3.0 release. Now, I think we should try our best to address it in 3.1.

Thanks,

Xiao

On Fri, Nov 20, 2020 at 8:52 AM Xiao Li wrote:

> [...]
Re: Spark 3.1 branch cut 4th Dec?
Thank you for sharing, Xiao.

I hope we are able to reach some agreement on the CREATE TABLE DDLs, too.

Bests,
Dongjoon.

On Fri, Nov 20, 2020 at 9:01 AM Xiao Li wrote:

> [...]
Re: Spark 3.1 branch cut 4th Dec?
I think we should be able to get the CREATE TABLE changes in. Now that the main blocker (EXTERNAL) has been decided, it's just a matter of normal review comments.

On Fri, Nov 20, 2020 at 9:05 AM Dongjoon Hyun wrote:

> [...]

--
Ryan Blue
Software Engineer
Netflix
Re: Spark 3.1 branch cut 4th Dec?
It sounds great! :)

Thanks, Ryan.

On Fri, Nov 20, 2020 at 9:19 AM Ryan Blue wrote:

> [...]
Re: Spark 3.1 branch cut 4th Dec?
Thank you, Ryan!

Xiao

On Fri, Nov 20, 2020 at 9:20 AM Dongjoon Hyun wrote:

> [...]
[SS] full outer stream-stream join
Hi,

Stream-stream join in Spark Structured Streaming currently supports the INNER, LEFT OUTER, RIGHT OUTER and LEFT SEMI join types. It does not support FULL OUTER join, and we are working on adding it in https://github.com/apache/spark/pull/30395 .

Given that LEFT OUTER and RIGHT OUTER stream-stream joins are supported, the code needed for FULL OUTER join is actually quite straightforward:

* For a left side input row, check if there's a match in the right side state store. If there's a match, output the joined row; otherwise output nothing. Put the row in the left side state store.
* For a right side input row, check if there's a match in the left side state store. If there's a match, output the joined row; otherwise output nothing. Put the row in the right side state store.
* State store eviction: evict rows below the watermark from the left/right side state stores, and output the rows that were never matched (a combination of left outer and right outer join).

FULL OUTER join consumes the same amount of space in the state store as INNER/LEFT OUTER/RIGHT OUTER join and is pretty easy to add, so I don't see any reason from a system perspective why it should not be added.

I am wondering whether there is any major blocker to adding FULL OUTER stream-stream join? I am asking on the dev mailing list in case we are missing anything beyond PR review participation. Thanks.

Cheng Su
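
The three steps in the message above can be illustrated with a small, self-contained Python sketch. This is not Spark code and not the implementation in the linked PR: the class `FullOuterStreamJoin`, its methods, and the in-memory dictionaries standing in for the state stores are all hypothetical names chosen for illustration. The sketch also assumes a plain equality join on a key and an explicit `evict(watermark)` call, whereas a real streaming engine advances the watermark itself and drops late rows.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class StoredRow:
    """A buffered input row plus a flag recording whether it ever matched."""
    value: Any
    event_time: int
    matched: bool = False


class FullOuterStreamJoin:
    """Toy symmetric join; hypothetical, for illustration only."""

    def __init__(self) -> None:
        # One in-memory "state store" per side: join key -> buffered rows.
        self.left_state: Dict[Any, List[StoredRow]] = {}
        self.right_state: Dict[Any, List[StoredRow]] = {}

    def _process(self, key, value, event_time, own, other, is_left) -> List[Tuple]:
        row = StoredRow(value, event_time)
        out = []
        # Probe the other side's state; emit one joined row per match.
        for other_row in other.get(key, []):
            other_row.matched = True
            row.matched = True
            if is_left:
                out.append((key, value, other_row.value))
            else:
                out.append((key, other_row.value, value))
        # Always buffer the new row in this side's state.
        own.setdefault(key, []).append(row)
        return out

    def on_left(self, key, value, event_time) -> List[Tuple]:
        return self._process(key, value, event_time,
                             self.left_state, self.right_state, True)

    def on_right(self, key, value, event_time) -> List[Tuple]:
        return self._process(key, value, event_time,
                             self.right_state, self.left_state, False)

    def evict(self, watermark: int) -> List[Tuple]:
        """Drop rows older than the watermark; emit never-matched ones
        padded with None on the missing side."""
        out = []
        for state, is_left in ((self.left_state, True), (self.right_state, False)):
            for key in list(state):
                kept = []
                for row in state[key]:
                    if row.event_time >= watermark:
                        kept.append(row)
                    elif not row.matched:
                        if is_left:
                            out.append((key, row.value, None))
                        else:
                            out.append((key, None, row.value))
                if kept:
                    state[key] = kept
                else:
                    del state[key]
        return out
```

In this sketch the per-row work is the same as for an inner join; only the eviction step differs, which is consistent with the point that FULL OUTER needs no extra state beyond what LEFT OUTER and RIGHT OUTER already keep (here, a single `matched` flag per buffered row).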
Re: Spark 3.1 branch cut 4th Dec?
Just for the record, I'll stick to the date we documented at https://spark.apache.org/versioning-policy.html

It is best to stick to what we wrote there, given that we have already delayed once.

On Sat, 21 Nov 2020, 02:28 Xiao Li wrote:

> [...]