Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Xiao Li
+1 Yuming Wang wrote on Thu, May 29, 2025 at 02:22: > +1. > > On Thu, May 29, 2025 at 3:36 PM DB Tsai wrote: > >> +1 >> Sent from my iPhone >> >> On May 29, 2025, at 12:15 AM, John Zhuge wrote: >> >> >> +1 Nice feature >> >> On Wed, May 28, 2025 at 9:53 PM Yuanjian Li >> wrote: >> >>> +1 >>> >>> Kent Yao

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
Hi all, A big thanks to everyone who provided feedback on the SPIP! My co-authors and I really appreciate it. I am excited to see this amount of interest in the proposal. I am also glad to see all the support this initiative is getting from the community. Let me summarize some of the common qu

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Yang Jie
+1 On 2025/05/29 16:25:19 Xiao Li wrote: > +1 > > Yuming Wang wrote on Thu, May 29, 2025 at 02:22: > > > +1. > > > > On Thu, May 29, 2025 at 3:36 PM DB Tsai wrote: > > > >> +1 > >> Sent from my iPhone > >> > >> On May 29, 2025, at 12:15 AM, John Zhuge wrote: > >> > >> > >> +1 Nice feature > >> > >> On We

Re: [PSA] GitHub Actions for releasing Apache Spark

2025-05-29 Thread Dongjoon Hyun
Thank you, Hyukjin! It's really great. Dongjoon. On Thu, May 29, 2025 at 00:41 Hyukjin Kwon wrote: > Yup I'll write it down 👍 > > On Thu, May 29, 2025 at 4:39 PM Jungtaek Lim > wrote: > >> Awesome work! I also can't see the screenshots. Maybe we'd need to have >> this in the release guide pag

Re: [PSA] GitHub Actions for releasing Apache Spark

2025-05-29 Thread Xinrong Meng
That will significantly reduce release time and make the process less error-prone. Thank you Hyukjin! On Thu, May 29, 2025 at 2:07 PM Dongjoon Hyun wrote: > Thank you, Hyukjin! > > It's really great. > > Dongjoon. > > On Thu, May 29, 2025 at 00:41 Hyukjin Kwon wrote: > >> Yup I'll write it dow

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread John Zhuge
+1 Nice feature On Wed, May 28, 2025 at 9:53 PM Yuanjian Li wrote: > +1 > > Kent Yao wrote on Wed, May 28, 2025 at 19:31: > >> +1, LGTM. >> >> Kent >> >> On Thursday, May 29, 2025, Chao Sun wrote: >> >>> +1. Super excited by this initiative! >>> >>> On Wed, May 28, 2025 at 1:54 PM Yanbo Liang wrote: >>> +1 >>>

Re: [PSA] GitHub Actions for releasing Apache Spark

2025-05-29 Thread Hyukjin Kwon
I am resending the images to make sure: 2. [image: Screenshot 2025-05-29 at 3.25.26 PM.png] 3. [image: Screenshot 2025-05-29 at 3.25.32 PM.png] On Thu, 29 May 2025 at 15:33, Hyukjin Kwon wrote: > Hi all, > > I would like to share that the GitHub Actions workflow to release Apache Spark > is

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread DB Tsai
+1 Sent from my iPhone On May 29, 2025, at 12:15 AM, John Zhuge wrote: +1 Nice feature On Wed, May 28, 2025 at 9:53 PM Yuanjian Li wrote: +1 Kent Yao wrote on Wed, May 28, 2025 at 19:31: +1, LGTM. Kent On Thursday, May 29, 2025, Chao Sun wrote: +1. Super excited by this

Re: [PSA] GitHub Actions for releasing Apache Spark

2025-05-29 Thread Jungtaek Lim
Awesome work! I also can't see the screenshots. Maybe we'd need to have this in the release guide page in Apache Spark website anyway? On Thu, May 29, 2025 at 4:24 PM Yuanjian Li wrote: > Thanks, Hyukjin—this made the release 10 times easier (maybe even more)! > > (Not sure if it’s just me, but

Re: [PSA] GitHub Actions for releasing Apache Spark

2025-05-29 Thread Yuanjian Li
Thanks, Hyukjin. This made the release 10 times easier (maybe even more)! (Not sure if it’s just me, but I can’t see the screenshot you sent.) Hyukjin Kwon wrote on Wed, May 28, 2025 at 23:34: > Hi all, > > I would like to share that the GitHub Actions workflow to release Apache Spark > is now available. > The wo

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Yuming Wang
+1. On Thu, May 29, 2025 at 3:36 PM DB Tsai wrote: > +1 > Sent from my iPhone > > On May 29, 2025, at 12:15 AM, John Zhuge wrote: > > > +1 Nice feature > > On Wed, May 28, 2025 at 9:53 PM Yuanjian Li > wrote: > >> +1 >> >> Kent Yao wrote on Wed, May 28, 2025 at 19:31: >> >>> +1, LGTM. >>> >>> Kent >>> >>

Re: [PSA] GitHub Actions for releasing Apache Spark

2025-05-29 Thread Hyukjin Kwon
Yup I'll write it down 👍 On Thu, May 29, 2025 at 4:39 PM Jungtaek Lim wrote: > Awesome work! I also can't see the screenshots. Maybe we'd need to have > this in the release guide page in Apache Spark website anyway? > > On Thu, May 29, 2025 at 4:24 PM Yuanjian Li > wrote: > >> Thanks, Hyukjin—t

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Mark Hamstra
It should not be assumed. In something called "real-time", it should be very explicit what clock-time constraints are and are not guaranteed. On Thu, May 29, 2025 at 10:00 PM Jerry Peng wrote: > It was kind of hard to see what Mich's point was in the plethora of > emails he sent :) > > In embed

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
Mark, As an example of my point, if you go to the Apache Storm (another stream processing engine) website: https://storm.apache.org/ It describes Storm as: "Apache Storm is a free and open source distributed *realtime* computation system" If you go to Apache Flink: https://flink.apache.org/2

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
Mich, Thank you for chiming in and providing insights into the importance of not only getting correct results but also timely results. You are absolutely right that the reason why something like Real-time Mode is valuable is its ability to provide timely results for certain use cases that require

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Mich Talebzadeh
I think from what I have seen there are a good number of +1 responses as opposed to quantitative discussions (based on my observations only). Given the objectives of the thread, we ought to focus on what is meant by real time compared to continuous modes. To be fair, it is a common point of confus

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
Mich, If I understood your last email correctly, I think you also wanted to have a discussion about naming? Why are we calling this new execution mode described in the SPIP "Real-time Mode"? Here are my two cents. Firstly, "continuous mode" is taken and we want another name to describe an execu

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Mark Hamstra
Referencing other misuse of "real-time" is not persuasive. A SPIP is an engineering document, not a marketing document. Technical clarity and accuracy should be non-negotiable. On Thu, May 29, 2025 at 10:27 PM Jerry Peng wrote: > Mark, > > As an example of my point if you go to the Apache Stor

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
Mark, I thought we were simply discussing the naming of the mode? Like I mentioned, if you think simply calling this mode "real-time" mode may cause confusion because "real-time" can mean other things in other fields, I can clarify what we mean by "real-time" explicitly in the SPIP document and an

[DISCUSS][MINOR] Fix broken link in spark-website for SS Programming Guide

2025-05-29 Thread Anish Shrigondekar
Hi, We have a broken link for the latest docs for the 4.0 release. This page: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html has a hyperlink that points to the contents of the Structured Streaming guide. But it seems this link is broken and points back to the mai

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Mark Hamstra
Clarifying what is meant by "real-time" and explicitly differentiating it from actual real-time computing should be a bare minimum. I still don't like the use of marketing-speak "real-time" that isn't really real-time in engineering documents or API namespaces. On Thu, May 29, 2025 at 10:43 PM Jer

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
Mark, For real-time systems there is a concept of "soft" real-time and "hard" real-time systems. These concepts exist in textbooks. Here is a document by Intel that explains it: https://www.intel.com/content/www/us/en/learn/what-is-a-real-time-system.html "In a soft real-time system, computers