Re: relative path in DataFrameWriter and DataStreamWriter

2025-01-16 Thread Jungtaek Lim
Your best bet is to resolve the relative path to an absolute path on the driver and pass that resolved path to the executors. Whether we want to do that needs some discussion, but it is at least technically correct. On Fri, Jan 17, 2025 at 1:54 PM Jungtaek Lim wrote: > Examples are
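
A minimal sketch of the idea in the message above, assuming the resolution happens in user code on the driver before the path reaches the writer. The helper name `resolve_on_driver` is hypothetical and is not part of Spark; it uses only the Python standard library.

```python
import os
from urllib.parse import urlparse


def resolve_on_driver(path: str) -> str:
    """Hypothetical helper: make a relative path absolute on the driver.

    Paths that already carry a URI scheme (for example s3a:// or hdfs://)
    are returned unchanged. Anything else is resolved against the current
    working directory of the process calling this function.
    Note: a Windows drive path such as C:\\out would be mistaken for a
    URI scheme by this simple check; this sketch assumes POSIX-style paths.
    """
    if urlparse(path).scheme:
        return path
    return os.path.abspath(path)


resolved = resolve_on_driver("out/events")
print(resolved)
```

The resolved string could then be handed to a writer call such as `df.write.parquet(resolved)`. This only fixes which directory the relative path refers to; an absolute local path is still interpreted separately on each machine unless it points to a shared filesystem.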

Re: relative path in DataFrameWriter and DataStreamWriter

2025-01-16 Thread Jungtaek Lim
The examples assume you are running them on a single-node cluster. If you feel this is causing confusion, it is something we need to fix, e.g. by adding a disclaimer that the example assumes it is running on a single-node cluster. >> > More problematic thing is to use the loca

Re: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-16 Thread Anish Shrigondekar
Hi, We are working on the new arbitrary state API support for streaming, called transformWithState (epic: https://issues.apache.org/jira/browse/SPARK-46815). We have a few PRs left and we would like to get these into Spark 4.0. However, if we can't get them in within the deadline, we will tar

Re: A documentation change is a user-facing change

2025-01-16 Thread Wenchen Fan
+1 to updating the PR template. I think the intent is to ask PR authors to call out all the user-facing changes that need attention from end users, such as new features and behavior changes, but a doc change is clearly not one of them. On Fri, Jan 17, 2025 at 7:10 AM Gengliang Wang wrote: > Than

Re: A documentation change is a user-facing change

2025-01-16 Thread Gengliang Wang
Thanks for pointing it out! Based on the discussion, I’ve created a PR: https://github.com/apache/spark/pull/49534. Let me know what you think! On Thu, Jan 16, 2025 at 2:25 PM Xiao Li wrote: > Thank you for pointing it out! Let’s update the template to exclude > documentation changes from the b

Re: A documentation change is a user-facing change

2025-01-16 Thread Xiao Li
Thank you for pointing it out! Let’s update the template to exclude documentation changes from the behavior change question. At the same time, I strongly believe that all behavior changes should be clearly documented in the PR description. For instance, in the first example PR https://github.com/a

Re: A documentation change is a user-facing change

2025-01-16 Thread Reynold Xin
Seems like we should fix the template if that's not the intent. On Thu, Jan 16, 2025 at 1:52 PM Nicholas Chammas wrote: > The template says "including all aspects such as the documentation fix >

Re: A documentation change is a user-facing change

2025-01-16 Thread Nicholas Chammas
The template says “including all aspects such as the documentation fix.” The original intent may well have been about behavior changes only, but that’s not reflected in the curre

Re: relative path in DataFrameWriter and DataStreamWriter

2025-01-16 Thread Rozov, Vlad
> More problematic thing is to use the local filesystem for the path which is > interpreted by distributed machines. It depends. Nowadays distributed systems mostly use cloud storage (S3, GFS, etc.) or HDFS, but NFS and other locally mounted filesystems can still be in use and should be supported. > this actua

Re: A documentation change is a user-facing change

2025-01-16 Thread Dongjoon Hyun
Technically, the original intent is a user-facing *behavior* change, which is the same as in the Apache Spark migration guide. If so, does it make sense to you? Probably, because the template was kept short to be concise, it could be interpreted in more ways than we thought. Dongjoon. On Thu, Jan 16, 2025 at

Re: A documentation change is a user-facing change

2025-01-16 Thread Nicholas Chammas
I didn’t write the pull request template and I am not sure about the author’s original intent. However, the plain meaning of “any” — with the added emphasis as well — suggests t

Re: A documentation change is a user-facing change

2025-01-16 Thread Dongjoon Hyun
I understand your concern, Nicholas. However, isn't it too strict? For the above example, adding a new HTML page is a user-facing change. https://github.com/apache/spark/pull/48852 (This is a new doc) [SPARK-50309][DOCS] Document SQL Pipe Syntax https://github.com/apache/spark/pull/49098 (This i

A documentation change is a user-facing change

2025-01-16 Thread Nicholas Chammas
This is not a big deal at all, but I figure it’s worth bringing up briefly because the pull request template does emphasize : > ### Does this PR introduce _any_ user-facing change

RE: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-16 Thread Akhil Gudesa
Hi all, I am working on adding an example project that demonstrates the concept of "Spark Server Libraries" in Spark Connect (https://issues.apache.org/jira/browse/SPARK-50848). There will be a few PRs in the upcoming days, and I hope to have it included in Spark 4.0. Thanks! Akhil On 2025/01/15

RE: [DISCUSS] Ongoing projects for Spark 4.0

2025-01-16 Thread Paddy Xu
I am working on a project, https://issues.apache.org/jira/browse/SPARK-48918, which targets Spark 4.0. Some items on the list would need to go into this new branch. Three PRs are currently ongoing: https://github.com/apache/spark/pull/49373 https://github.com/apache/spark/pull/49339 https://github.com/a