[DISCUSS][Spark SQL] Update API

2024-09-23 Thread Szehon Ho
Hi all, In https://github.com/apache/spark/pull/47233, we are looking to add a Spark DataFrame API for functional equivalence to Spark SQL's UPDATE statement. There are open discussions on the PR about location/format of the API, and we wanted to ask on devlist to get more opinions. One consider

Re: Re: [Discuss] SPIP: Support NanoSecond Timestamps

2025-03-14 Thread Szehon Ho
+1 to the idea as well, as Iceberg V3 is coming with time with nanos, and Spark would not be able to read this type without this. Thanks Szehon On Fri, Mar 14, 2025 at 3:34 PM Wenchen Fan wrote: > In general, I think it's good for Spark to support the common data types > in the ecosystem, as it

Re: [VOTE] SPIP: Constraints in DSv2

2025-04-05 Thread Szehon Ho
+1 (non binding) Agree with Anton, data sources like the open table formats define the requirement, and definitely need engines to write to it accordingly. Thanks, Szehon On Fri, Mar 21, 2025 at 1:31 PM Anton Okolnychyi wrote: > -1 (non-binding): Breaks the Chain of Responsibility. Constraints

Re: [VOTE] SPIP: Support NanoSecond Timestamps

2025-04-05 Thread Szehon Ho
Trying to catch up on this, Serge's suggestion in the doc seems the best way forward, https://docs.google.com/document/d/1wjFsBdlV2YK75x7UOk2HhDOqWVA0yC7iEiqOMnNnxlA/edit?disco=AAABe5AUnWU. Spark would support the full ANSI SQL timestamp range, and Iceberg / Parquet / other data sources will throw ru

Re: [VOTE] SPIP: Declarative Pipelines

2025-04-09 Thread Szehon Ho
+1 (non-binding) Thanks Szehon On Wed, Apr 9, 2025 at 3:42 PM Hyukjin Kwon wrote: > I will shepherd. > > On Thu, 10 Apr 2025 at 07:28, Anton Okolnychyi > wrote: > >> +1 (non-binding) >> >> - Anton >> >> On Wed, Apr 9, 2025 at 15:01, Jungtaek Lim >> wrote: >> >>> Btw who is going to shepherd this

Re: [DISCUSS] SPIP: Declarative Pipelines

2025-04-09 Thread Szehon Ho
+1 really excited to see Materialized View finally make its way to Spark, as many other ecosystem projects (Trino, StarRocks, soon Iceberg) already support it. Thanks Szehon On Wed, Apr 9, 2025 at 2:33 AM Martin Grund wrote: > +1 > > On Wed, Apr 9, 2025 at 9:37 AM Mich Talebzadeh >

Re: [DISCUSS] SPIP: Add geospatial types to Spark

2025-03-30 Thread Szehon Ho
tutorial/files/geoparquet-sedona-spark/ >>> >>> [4] Sedona-Iceberg connector (PoC): >>> https://github.com/wherobots/sedona-iceberg-connector >>> >>> [5] Spark-Sedona-Iceberg working example: >>> https://github.com/wherobots/sedona-iceberg-conn

Re: [VOTE] Release Spark 4.0.0 (RC4)

2025-04-23 Thread Szehon Ho
One more small fix (on another topic) for the next RC: https://github.com/apache/spark/pull/50685 Thanks! Szehon On Tue, Apr 22, 2025 at 10:07 AM Rozov, Vlad wrote: > Correct, to me it looks like a Spark bug > https://issues.apache.org/jira/browse/SPARK-51821 that may be hard to > trigger and i

Re: [VOTE] SPIP: Add geospatial types to Spark

2025-05-05 Thread Szehon Ho
+1 (non binding) Thanks Szehon On Mon, May 5, 2025 at 11:17 AM DB Tsai wrote: > +1, geospatial types will be a great feature for Spark. Thanks for working > on it. > > On May 5, 2025, at 11:04 AM, Menelaos Karavelas < > menelaos.karave...@gmail.com> wrote: > > I started the discussion on addin

Re: 4.0.0 RC1 is coming

2025-02-21 Thread Szehon Ho
Hi, sorry for the late reply. We identified another serious issue with the newly added Call Procedure; can we add it to the list? SPARK-51273: Spark Connect Call Procedure runs the procedure twice. I have a PR

Re: [DISCUSS] SPIP: Add geospatial types to Spark

2025-03-28 Thread Szehon Ho
Thanks Menelaos, this is exciting ! Is there a google doc we can comment, or just on the JIRA? Thanks Szehon On Fri, Mar 28, 2025 at 1:41 PM Ángel Álvarez Pascua < angel.alvarez.pas...@gmail.com> wrote: > Sorry, I only had a quick look at the proposal, looked for WKT and didn't > find anything.

Re: [DISCUSS] SPIP: Add geospatial types to Spark

2025-03-29 Thread Szehon Ho
just created a Google doc and also linked it in the JIRA: "SPIP: Add geospatial types in Spark" (docs.google.com). Please feel free to comment on it. Best, Menelaos On Mar 28, 2025, at 2:19 PM, Szehon Ho wrote: Thanks Menelaos, this is exciting! Is there a google doc we can comment, or just on the

Re: [VOTE] Release Spark 4.0.0 (RC5)

2025-05-12 Thread Szehon Ho
+1 (non binding) Checked license, signature, checksum, ran basic test on spark-4.0.0-bin-hadoop3. Thanks Szehon On Mon, May 12, 2025 at 9:02 PM Sakthi wrote: > +1 (non-binding) > > On Mon, May 12, 2025 at 7:38 PM Jungtaek Lim > wrote: > >> +1 (non-binding) >> >> Thanks Wenchen for driving the

Re: [VOTE] Release Spark 4.0.0 (RC6)

2025-05-13 Thread Szehon Ho
+1 (non-binding) Checked signature, checksum, basic test on spark-4.0.0-bin-hadoop3 Thanks Szehon On Tue, May 13, 2025 at 8:44 PM Yang Jie wrote: > +1 > > On 2025/05/14 00:21:11 Ruifeng Zheng wrote: > > +1 > > > > On Wed, May 14, 2025 at 7:01 AM Gengliang Wang wrote: > > > > > +1 > > > > > >

Re: [VOTE] Release Spark 4.0.0 (RC7)

2025-05-19 Thread Szehon Ho
+1 (non-binding) Checked signature, checksum, ran basic tests on spark-4.0.0-bin-hadoop3 Thanks Szehon On Mon, May 19, 2025 at 9:07 PM Denny Lee wrote: > +1 (non-binding) > > On Mon, May 19, 2025 at 9:02 PM Rozov, Vlad > wrote: > >> +1 (non-binding) >> >> Vlad >> >> On May 19, 2025, at 8:56 

Re: [VOTE] Release Spark 4.1.0-preview1 (RC1)

2025-07-11 Thread Szehon Ho
+1 (non-binding) Checked signature, checksum, basic functionality of spark-4.1.0-preview1-bin-hadoop3 Thanks for setting this up! Szehon On Thu, Jul 10, 2025 at 11:18 PM Yang Jie wrote: > +1 > > On 2025/07/11 04:23:27 Ángel Álvarez Pascua wrote: > > +1 (non-binding) > > > > On Thu, Jul 10, 202

Re: [VOTE] SPIP: Monthly preview release

2025-07-03 Thread Szehon Ho
+1 (non-binding) Thanks for the proposal, hope one day to get faster releases in Spark. Thanks Szehon On Thu, Jul 3, 2025 at 6:58 AM Sandy Ryza wrote: > +1 (non-binding) > > On Thu, Jul 3, 2025 at 6:47 AM Jules Damji wrote: > >> +1 (non-binding) >> — >> Sent from my iPhone >> Pardon the dumb