Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-07 Thread Martin Grund
On Tue, Jun 7, 2022 at 3:54 PM Steve Loughran wrote: > > > On Fri, 3 Jun 2022 at 18:46, Martin Grund > wrote: > >> Hi Everyone, >> >> We would like to start a discussion on the "Spark Connect" proposal. >> Please find the links below: >> >> *JIRA* - https://issues.apache.org/jira/browse/SPARK-39

Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-07 Thread Steve Loughran
On Fri, 3 Jun 2022 at 18:46, Martin Grund wrote: > Hi Everyone, > > We would like to start a discussion on the "Spark Connect" proposal. > Please find the links below: > > *JIRA* - https://issues.apache.org/jira/browse/SPARK-39375 > *SPIP Document* - > https://docs.google.com/document/d/1Mnl6jmGs

Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-06 Thread Hyukjin Kwon
The things I like most about this SPIP are: 1. We could leverage this SPIP to dispatch the driver to the cluster (e.g., yarn-cluster or K8S cluster mode) with an interactive shell, which Spark currently doesn't support. 2. Makes it easier for other languages to support, especially given that we talked abo

Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-06 Thread Martin Grund
Hi Mich, I think I must not have been clear enough in the document. The proposal is not for connecting Spark to other engines but for connecting to Spark from other clients remotely (without using SQL). Please let me know if that clarifies things or if I can provide additional context. Thanks Martin

Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-05 Thread Mich Talebzadeh
Hi, Whilst I concur that there is a need for a client-server architecture, that technology has been around for over 30 years. Moreover, current Spark has very efficient connections via JDBC to various databases. In some cases the API to various databases, for example Google BigQuery, is very efficient.

Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-04 Thread Martin Grund
Support for UDFs would work in the same way as they work today. The closures are serialized on the client and sent via the driver to the worker. While there is no difference in the execution of the UDF, there can be potential challenges with the dependencies required for execution. This is true bo
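
To make the "closures are serialized on the client and sent ... to the worker" step concrete, here is a toy sketch in plain Python. It is not Spark's or Spark Connect's actual mechanism; the helper names (`make_udf`, `serialize_closure`, `deserialize_closure`) are hypothetical and it uses only the standard library. It assumes both sides run the same Python version, since `marshal` output is not portable across versions, which is one small instance of the dependency challenge mentioned above.

```python
import builtins
import marshal
import types


def make_udf(offset):
    """Build a closure that captures `offset` (stands in for a user UDF)."""
    def add_offset(x):
        return x + offset
    return add_offset


def serialize_closure(fn):
    """'Client' side: turn the function's bytecode and captured values into bytes."""
    captured = tuple(cell.cell_contents for cell in (fn.__closure__ or ()))
    return marshal.dumps((fn.__code__, captured))


def deserialize_closure(payload):
    """'Worker' side: rebuild a callable from the received bytes."""
    code, captured = marshal.loads(payload)
    closure = tuple(types.CellType(value) for value in captured)
    # Hypothetical minimal globals: the rebuilt function only sees builtins,
    # so any module it referenced would have to exist on the worker as well.
    return types.FunctionType(
        code, {"__builtins__": builtins}, code.co_name, None, closure
    )


payload = serialize_closure(make_udf(10))   # bytes that could cross the wire
rebuilt = deserialize_closure(payload)      # callable again on the other side
result = rebuilt(5)
```

The rebuilt function runs exactly like the original, but only because everything it needs (here, a single integer) travelled inside the payload; anything it imported would have to be available wherever it executes.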

Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-03 Thread Koert Kuipers
How would Scala UDFs be supported in this? On Fri, Jun 3, 2022 at 1:52 PM Martin Grund wrote: > Hi Everyone, > > We would like to start a discussion on the "Spark Connect" proposal. > Please find the links below: > > *JIRA* - https://issues.apache.org/jira/browse/SPARK-39375 > *SPIP Document* -

[DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-03 Thread Martin Grund
Hi Everyone, We would like to start a discussion on the "Spark Connect" proposal. Please find the links below: *JIRA* - https://issues.apache.org/jira/browse/SPARK-39375 *SPIP Document* - https://docs.google.com/document/d/1Mnl6jmGszixLW4KcJU5j9IgpG9-UabS0dcM6PM2XGDc/edit#heading=h.wmsrrfealhrj