Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-26 Thread Ángel
Hi, I'd also like to include this other one I opened last summer: https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-49288. Regards, Ángel. On Mon, Jan 27, 2025, 6:17, Wenchen Fan wrote: > Hi all, > > Thanks for sharing the progress of ongoing projects! Let me summarize them

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-26 Thread Dongjoon Hyun
Thank you, Wenchen, for the summarization and management. Dongjoon. On Sun, Jan 26, 2025 at 9:17 PM Wenchen Fan wrote: > Hi all, > > Thanks for sharing the progress of ongoing projects! Let me summarize them > here: > - Add Spark Connect config to allow simple switch [PR >

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-26 Thread Wenchen Fan
Hi all, Thanks for sharing the progress of ongoing projects! Let me summarize them here: - Add Spark Connect config to allow simple switch [PR] - ML algorithms on Spark Connect (doesn't block 4.0) [JIRA

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-25 Thread Ángel
I've just opened the ticket SPARK-50992. Could someone please review it? I'm proposing a new explain mode for converting plans to strings. Currently, explaining plans with AQE enabled is resource-intensive and can lead to memory accumulation in the

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-22 Thread Ángel
Hi, I’m working on a performance issue that ends up throwing an OutOfMemoryError when AQE is enabled. This problem was first identified by Russel Jurney while running GraphFrames unit tests, as detailed in his gist. The issue was a

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-22 Thread Mich Talebzadeh
Interesting points: client-server architecture has been around since the days of Sybase. A client written in any language, say Python or Scala, makes a request to the Spark cluster. This remote access model inherently creates a level of isolation between the client application and the internal workings of

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-22 Thread David Milicevic
Hi all, Together with my team, I'm working on adding support for SQL Scripting (JIRA, Ref Spec). The feature is guarded b

RE: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-22 Thread Stefan Kandic
Hi, I am working on adding collation support (https://issues.apache.org/jira/projects/SPARK/issues/SPARK-46830). Right now, collations are enabled by default as we have finished almost everything we planned to add. However, there are still some smaller things and improvements left that have on

RE: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-22 Thread Milan Cupac
I am working on recursive CTEs. Two final PRs should be merged soon: https://github.com/apache/spark/pull/49518 https://github.com/apache/spark/pull/49571 2025/01/15 13:41:07 Wenchen Fan wrote: > Hi all, > > We have cut the "branch-4.0" and I'm sending this email to collect the > information

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-16 Thread Anish Shrigondekar
Hi, We are working on the new arbitrary state API support for streaming called transformWithState (epic here: https://issues.apache.org/jira/browse/SPARK-46815). We have a few PRs left and we would like to get these into Spark 4.0. However, if we can't get them in within the deadline, we will tar

RE: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-16 Thread Akhil Gudesa
Hi all, I am working on adding an example project that demonstrates the concept of "Spark Server Libraries" in Spark Connect (https://issues.apache.org/jira/browse/SPARK-50848). There will be a few PRs in the upcoming days, and I hope to have it included in Spark 4.0. Thanks! Akhil On 2025/01/15

RE: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-16 Thread Paddy Xu
I am working on a project https://issues.apache.org/jira/browse/SPARK-48918 which targets Spark 4.0. Some items on the list would need to go into this new branch. Now three PRs are ongoing: https://github.com/apache/spark/pull/49373 https://github.com/apache/spark/pull/49339 https://github.com/a

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-15 Thread Dongjoon Hyun
Although it's not a project, I just want to give a heads-up for the following stuff: [SPARK-50807][BUILD] Upgrade Scala to 2.13.16 https://github.com/apache/spark/pull/49478 It depends on the 3rd party library release schedule for now. Although I hope we can ship it with Spark 4.0.0, Spark 4.1.0

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-15 Thread Holden Karau
I think I’ll need to go in and mark GraphX as deprecated. But that can happen post branch cut.

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-15 Thread Amanda Liu
Hi all, I am working on Describe Table as JSON (SPARK-50541). The major support is complete but a few more PRs are left, such as adding support for v2 tables. Thanks! Best, A

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-15 Thread Allison Wang
I am working on SQL user-defined functions (SPARK-46057). There are a few major PRs left, and I’d like to have this feature included in Spark 4.0. On Wed, Jan 15, 2025 at 5:27 PM Bobby wrote: > I also have one: https://github.com/apache/spark/p

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-15 Thread Takuya UESHIN
https://github.com/apache/spark/pull/49424 should be in Spark 4, too. Its classic support is already done, so it should support Spark Connect. Thanks. On Wed, Jan 15, 2025 at 5:27 PM Bobby wrote: > I also have one: https://github.com/apache/spark/pull/49503 > > I would like to support plugin fo

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-15 Thread Bobby
I also have one: https://github.com/apache/spark/pull/49503 I would like to add plugin support for Connect ML in 4.0. Thanks. On Thu, Jan 16, 2025 at 09:10, Ruifeng Zheng wrote: > This one: SPARK-50812 > We want to support more ML algorithms on Connect in 4.0. > But

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-15 Thread Ruifeng Zheng
This one: SPARK-50812. We want to support more ML algorithms on Connect in 4.0. But we don't have to support all of them in 4.0; after the code freeze we will continue working on it for 4.1. On Thu, Jan 16, 2025 at 7:50 AM Hyukjin Kwon wrote: > I have one

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-15 Thread Hyukjin Kwon
I have one PR: https://github.com/apache/spark/pull/49107 On Wed, 15 Jan 2025 at 22:41, Wenchen Fan wrote: > Hi all, > > We have cut the "branch-4.0" and I'm sending this email to collect the > information for ongoing projects targeting Spark 4.0. Please reply to this > email to share the projec