Just FWIW, Spark remains Spark. We just refer to "Spark without Spark Connect" as "Spark Classic" in the documentation, for clarity. I don't think the term will be used excessively.
On Thu, 25 Jul 2024 at 03:59, Holden Karau <holden.ka...@gmail.com> wrote:

> I'm concerned about the term "Classic" bringing a negative connotation to it.
>
> On Mon, Jul 22, 2024 at 5:11 PM Hyukjin Kwon <gurwls...@apache.org> wrote:
>
>> Yeah, that's what I intended. Thanks for the clarification.
>>
>> Let me start the vote.
>>
>> On Tue, 23 Jul 2024 at 08:14, Sadha Chilukoori <sage.quoti...@gmail.com> wrote:
>>
>>> Hi Dongjoon,
>>>
>>> *To be clear, is the proposal aiming to have us say A instead of B in our documentation?*
>>>
>>> *A. Since `Spark Connect` mode has no RDD API, we need to use `Spark Classic` mode instead.*
>>> *B. Since `Spark Connect` mode has no RDD API, we need to use `Spark without Spark Connect` mode instead.*
>>>
>>> Correct, the thread is recommending option A, used consistently across all the documentation.
>>>
>>> -Sadha
>>>
>>> On Mon, Jul 22, 2024, 10:25 AM Dongjoon Hyun <dongj...@apache.org> wrote:
>>>
>>>> Thank you for opening this thread, Hyukjin.
>>>>
>>>> In this discussion thread, we have three terminologies, (1) ~ (3).
>>>>
>>>> > Spark Classic (vs. Spark Connect)
>>>>
>>>> 1. Spark
>>>> 2. Spark Classic (= a proposal for Spark without Spark Connect)
>>>> 3. Spark Connect
>>>>
>>>> As Holden and Jungtaek mentioned,
>>>>
>>>> - (1) is definitely the existing code base, which includes everything (the RDD API, Spark Thrift Server, Spark Connect, and so on).
>>>>
>>>> - (3) is a very specific use case, applying when a Spark binary distribution is used with the `--remote` option (or with the related features enabled). Like Spark Thrift Server, after the query planning steps there is no fundamental difference on the execution side in Spark clusters or Spark jobs.
>>>>
>>>> - (2) By the proposed definition, `Spark Classic` is not (1) `Spark`. Like `--remote`, it's one of the runnable modes.
>>>> To be clear, is the proposal aiming to have us say A instead of B in our documentation?
>>>>
>>>> A. Since `Spark Connect` mode has no RDD API, we need to use `Spark Classic` mode instead.
>>>> B. Since `Spark Connect` mode has no RDD API, we need to use `Spark without Spark Connect` mode instead.
>>>>
>>>> Dongjoon.
>>>>
>>>> On 2024/07/22 12:59:54 Sadha Chilukoori wrote:
>>>> > +1 (non-binding) for Classic.
>>>> >
>>>> > On Mon, Jul 22, 2024 at 3:59 AM Martin Grund <mar...@databricks.com.invalid> wrote:
>>>> >
>>>> > > +1 for Classic. It's simple, easy to understand, and it doesn't carry negative meanings the way "legacy" does, for example.
>>>> > >
>>>> > > On Sun, Jul 21, 2024 at 23:48 Wenchen Fan <cloud0...@gmail.com> wrote:
>>>> > >
>>>> > >> Classic SGTM.
>>>> > >>
>>>> > >> On Mon, Jul 22, 2024 at 1:12 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>>> > >>
>>>> > >>> I'd propose not to change the name of "Spark Connect" - the name represents the characteristic of the mode (the separation of layers between client and server). Trying to remove the "Connect" part would just cause confusion.
>>>> > >>>
>>>> > >>> +1 for Classic for the existing mode, until someone comes up with a better alternative.
>>>> > >>>
>>>> > >>> On Mon, Jul 22, 2024 at 8:50 AM Hyukjin Kwon <gurwls...@apache.org> wrote:
>>>> > >>>
>>>> > >>>> I was thinking about a similar option too, but I ended up giving it up. It's quite unlikely at this moment, but suppose we have another Spark Connect-ish component in the far future; it would be challenging to come up with yet another name. Another case is that we might have to cope with situations like Spark Connect vs. Spark (with Spark Connect) vs. Spark (without Spark Connect).
>>>> > >>>> On Sun, 21 Jul 2024 at 09:59, Holden Karau <holden.ka...@gmail.com> wrote:
>>>> > >>>>
>>>> > >>>>> I think perhaps Spark Connect could be phrased as "Basic* Spark" and existing Spark could be "Full Spark", given the API limitations of Spark Connect.
>>>> > >>>>>
>>>> > >>>>> *I was also thinking "Core" here, but we've used "core" to refer to the RDD APIs for too long to reuse it here.
>>>> > >>>>>
>>>> > >>>>> Twitter: https://twitter.com/holdenkarau
>>>> > >>>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>>> > >>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>> > >>>>>
>>>> > >>>>> On Sat, Jul 20, 2024 at 8:02 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>>> > >>>>>
>>>> > >>>>>> Classic is much better than Legacy. :)
>>>> > >>>>>>
>>>> > >>>>>> Hyukjin Kwon <gurwls...@apache.org> wrote on Thu, 18 Jul 2024 at 16:58:
>>>> > >>>>>>
>>>> > >>>>>>> Hi all,
>>>> > >>>>>>>
>>>> > >>>>>>> I noticed that we need to standardize our terminology before moving forward. For instance, when documenting, 'Spark without Spark Connect' is too long and verbose. Additionally, I've observed that we use various names for Spark without Spark Connect: Spark Classic, Classic Spark, Legacy Spark, etc.
>>>> > >>>>>>>
>>>> > >>>>>>> I propose that we consistently refer to it as Spark Classic (vs. Spark Connect).
>>>> > >>>>>>>
>>>> > >>>>>>> Please share your thoughts on this. Thanks!
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
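[Editor's note] For readers following the thread, the distinction being named here is visible at launch time. A minimal CLI sketch, not from the thread itself; the `sc://` host and port are illustrative (15002 is the default Spark Connect server port):

```shell
# Spark Connect mode: start a Spark Connect server, then attach a thin client
# with the --remote option mentioned by Dongjoon above.
./sbin/start-connect-server.sh
./bin/pyspark --remote "sc://localhost:15002"

# "Spark Classic" mode (the name proposed in this thread): the traditional
# in-process driver, where the full API surface, including RDDs, is available.
./bin/pyspark --master "local[*]"
```

The example in option A of the proposal follows from this: an RDD call such as `spark.sparkContext.parallelize(...)` works in the second session but is not part of the Spark Connect client API.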