Re: [ANNOUNCE] Apache Spark 3.5.3 released

2024-09-25 Thread Haejoon Lee
> > - https://github.com/apache/spark-docker/pull/64 > (Publish 3.5.2 to docker registry) > > Dongjoon. > > > On Tue, Sep 24, 2024 at 10:29 PM Haejoon Lee > wrote: > >> Hi, Yang! >> >> And thanks Dongjoon for answering the question! >> >> For

Re: [ANNOUNCE] Apache Spark 3.5.3 released

2024-09-24 Thread Haejoon Lee
mmits that do not exist in >> tag/v3.5.3, e.g. `[SPARK-49628]` exists in branch-3.5 but not in tag/v3.5.3, >> would you help explain more about that? >> >> Thank you >> >> On 2024/09/25 01:05:47 Haejoon Lee wrote: >> > We are happy to announce the ava

[ANNOUNCE] Apache Spark 3.5.3 released

2024-09-24 Thread Haejoon Lee
possible without you. Haejoon Lee

Re: [VOTE][RESULT] Release Apache Spark 3.5.3 (RC3)

2024-09-12 Thread Haejoon Lee
(8 binding +1s) -> (9 binding +1s) On Fri, Sep 13, 2024 at 1:41 PM Haejoon Lee wrote: > The vote passes with 11 +1s (8 binding +1s). > Thanks to all who helped with the release! > > (* = binding) > +1: > - Hyukjin Kwon * > - Rui Wang > - Wenchen Fan * > -

[VOTE][RESULT] Release Apache Spark 3.5.3 (RC3)

2024-09-12 Thread Haejoon Lee
The vote passes with 11 +1s (8 binding +1s). Thanks to all who helped with the release! (* = binding) +1: - Hyukjin Kwon * - Rui Wang - Wenchen Fan * - Gengliang Wang * - Kent Yao * - Herman van Hovell * - Dongjoon Hyun * - L. C. Hsieh * - Ruifeng Zheng * - Zhou Jiang - Xinrong Meng * +0: None -
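
The tally in the result message can be checked mechanically. A minimal sketch, with the voter names copied from the +1 list above and a trailing `*` marking a binding vote (the variable names are illustrative only):

```python
# +1 voters as listed in the result message; "*" marks a binding vote.
plus_one_votes = [
    "Hyukjin Kwon *",
    "Rui Wang",
    "Wenchen Fan *",
    "Gengliang Wang *",
    "Kent Yao *",
    "Herman van Hovell *",
    "Dongjoon Hyun *",
    "L. C. Hsieh *",
    "Ruifeng Zheng *",
    "Zhou Jiang",
    "Xinrong Meng *",
]

# Count all +1 votes, then only those flagged as binding.
total = len(plus_one_votes)
binding = sum(1 for voter in plus_one_votes if voter.endswith("*"))

print(f"{total} +1s ({binding} binding)")
```

Counting the starred names gives 9 binding votes out of 11, which matches the follow-up correction to the result thread.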

[VOTE] Release Apache Spark 3.5.3 (RC3)

2024-09-09 Thread Haejoon Lee
hat has not been correctly targeted please ping me or a committer to help target the issue. Thanks! Haejoon Lee

[VOTE] Release Apache Spark 3.5.3 (RC2)

2024-09-06 Thread Haejoon Lee
which is a regression that has not been correctly targeted please ping me or a committer to help target the issue. Thanks! Haejoon Lee

[VOTE] Release Apache Spark 3.5.3 (RC1)

2024-09-03 Thread Haejoon Lee
hat has not been correctly targeted please ping me or a committer to help target the issue. Thanks! Haejoon Lee

Re: [DISCUSS] release Spark 3.5.3?

2024-09-01 Thread Haejoon Lee
+1, and I'd like to volunteer as the release manager for Apache Spark 3.5.3 if we don't have one yet. On Sun, Sep 1, 2024 at 11:23 PM Xiao Li wrote: > +1 > > On Fri, Aug 30, 2024 at 02:34, Yuming Wang wrote: > >> +1, Could we include two additional issues: >> https://issues.apache.org/jira/browse/SPARK-49472 >

Re: [External Email] Re: Welcome new Apache Spark committers

2024-08-18 Thread Haejoon Lee
Congrats Allison, Martin, and Haejoon! >> >> >> >> On Tue, Aug 13, 2024 at 9:59 AM Jungtaek Lim < >> kabhwan.opensou...@gmail.com> wrote: >> >> Congrats everyone! >> >> >> >> On Tue, Aug 13, 2024 at 9:21 AM Xiao Li wrote: >

Re: [VOTE] SPIP: Pure Python Package in PyPI (Spark Connect)

2024-03-31 Thread Haejoon Lee
+1 On Mon, Apr 1, 2024 at 10:15 AM Hyukjin Kwon wrote: > Hi all, > > I'd like to start the vote for SPIP: Pure Python Package in PyPI (Spark > Connect) > > JIRA > Prototype > SPIP doc >

Re: [VOTE] SPIP: Structured Logging Framework for Apache Spark

2024-03-11 Thread Haejoon Lee
+1 On Mon, Mar 11, 2024 at 10:36 AM Gengliang Wang wrote: > Hi all, > > I'd like to start the vote for SPIP: Structured Logging Framework for > Apache Spark > > References: > >- JIRA ticket >- SPIP doc > >

Re: First Time contribution.

2023-09-17 Thread Haejoon Lee
Welcome Ram! :-) I would recommend checking out https://issues.apache.org/jira/browse/SPARK-37935 as a starter task. See https://github.com/apache/spark/pull/41504 and https://github.com/apache/spark/pull/41455 for example PRs. Or you can also add a new sub-task if you find any error mess

Re: LLM script for error message improvement

2023-08-03 Thread Haejoon Lee
wrote: > >> I think adding that dev tool script to improve the error message is fine. >> >> On Thu, 3 Aug 2023 at 10:24, Haejoon Lee >> wrote: >> >>> Dear contributors, I hope you are doing well! >>> >>> I see there are contributors who

LLM script for error message improvement

2023-08-02 Thread Haejoon Lee
Dear contributors, I hope you are doing well! I see there are contributors who are interested in working on error message improvements and in contributing on an ongoing basis, so I want to share an LLM-based error message improvement script to help with your contributions. You can find a detail for the script

Re: [Question] Can't start Spark Connect

2023-03-08 Thread Haejoon Lee
Additionally, try deleting the `.idea` directory in the Spark home directory and restarting IntelliJ if it does not work properly after re-building during development. The `.idea` directory stores IntelliJ's project configuration and settings, and is automatically regenerated when IntelliJ is launched. >

Re: Welcome Xinrong Meng as a Spark committer

2022-08-09 Thread Haejoon Lee
Congrats, Xinrong!! On Tue, Aug 9, 2022 at 5:12 PM Hyukjin Kwon wrote: > Hi all, > > The Spark PMC recently added Xinrong Meng as a committer on the project. > Xinrong is a major contributor to PySpark, especially the Pandas API on Spark. > She has guided a lot of new contributors enthusiastically.

Question using multiple partition for Window cumulative functions when partition is not specified.

2021-08-29 Thread Haejoon Lee
Hi all, I noticed that Spark uses only one partition when performing Window cumulative functions without specifying the partition, so the entire dataset is moved into a single partition, which can easily cause OOM errors or serious performance degradation. See the example below: >>> from pyspark.sql import fu
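
To illustrate the question, here is a minimal pure-Python sketch (not Spark code, and not how Spark executes window functions) of how a cumulative sum could in principle be computed across several partitions instead of one. It assumes the rows are already split into partitions that follow the window's ordering; the function names `cumsum_single_partition` and `cumsum_multi_partition` are hypothetical.

```python
from itertools import accumulate


def cumsum_single_partition(rows):
    """Cumulative sum with every row gathered in one place (one partition)."""
    return list(accumulate(rows))


def cumsum_multi_partition(partitions):
    """Cumulative sum over ordered partitions without merging them.

    Step 1: each partition reports only its own total.
    Step 2: a running sum of those totals gives each partition's offset.
    Step 3: each partition adds its offset to its local cumulative sum.
    """
    totals = [sum(part) for part in partitions]
    # Offset of partition i is the sum of all totals before it.
    offsets = [0] + list(accumulate(totals))[:-1]
    return [
        [offset + value for value in accumulate(part)]
        for part, offset in zip(partitions, offsets)
    ]


data = list(range(1, 11))
parts = [data[0:3], data[3:7], data[7:10]]

single = cumsum_single_partition(data)
multi = [value for part in cumsum_multi_partition(parts) for value in part]
print(single == multi)
```

Only the per-partition totals need to be collected in one place, so the full dataset never has to fit in a single partition. Whether something along these lines is feasible inside Spark is exactly what the question asks.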