Re: [VOTE] SPIP: Support NanoSecond Timestamps

2025-03-27 Thread Micah Kornfield
> > I think the key issue is the format. The proposed 10-byte format doesn't > seem like a standard and the one in Iceberg/Parquet does not support the > required range by ANSI SQL: year 0001 to year . We should address this > issue first. Note that Parquet has an INT96 timestamp that supports
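The range concern in the message above can be checked with a quick calculation. This is a minimal sketch, assuming the Iceberg/Parquet format referred to stores a timestamp as a signed 64-bit count of nanoseconds since the Unix epoch. That encoding is an assumption here, since the snippet does not spell it out, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the timestamp is a signed 64-bit nanosecond count since 1970-01-01 UTC.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def int64_nanos_bounds():
    """Return the earliest and latest instants such a counter can represent.

    Python's datetime resolves only microseconds, so the nanosecond
    limits are floor-divided by 1000 before conversion.
    """
    earliest = EPOCH + timedelta(microseconds=INT64_MIN // 1000)
    latest = EPOCH + timedelta(microseconds=INT64_MAX // 1000)
    return earliest, latest


def fits_int64_nanos(instant):
    """Check whether a timezone-aware datetime falls inside those bounds."""
    earliest, latest = int64_nanos_bounds()
    return earliest <= instant <= latest


earliest, latest = int64_nanos_bounds()
print(earliest.year, latest.year)  # 1677 2262

# The lower end of the range cited in the thread (year 0001) does not fit.
print(fits_int64_nanos(datetime(1, 1, 1, tzinfo=timezone.utc)))  # False
```

Under that assumption the representable window is roughly 292 years on either side of 1970, which is why a wider or differently structured encoding comes up in the discussion.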

Unsubscribe

2025-03-27 Thread Yujung Dong

Re: [VOTE] SPIP: Support NanoSecond Timestamps

2025-03-27 Thread Wenchen Fan
Maybe we should discuss the key issues on the dev list as it's easy to lose track of Google Doc comments. I think all the proposals for adding new data types need to prove that the new data type is common/standard in the ecosystem. This means 3 things: - it has common/standard semantics. TIMESTAMP

Re: Revert of [SPARK-51229][BUILD][CONNECT] Fix dependency:analyze goal on connect common

2025-03-27 Thread Rozov, Vlad
https://github.com/apache/spark/pull/50437 IMO, it will be better to keep 2 separate commits, one undo revert and one fix, so fix for guava is properly documented. Also, while testing, I see that if I exit the shell and start it again, it fails. Thank you, Vlad On Mar 27, 2025, at 2:33 PM, H

Spark build failed > File line length exceeds 100 characters

2025-03-27 Thread Ángel Álvarez Pascua
Hi, I'm trying to build the project, but I'm encountering multiple errors due to long lines. Is this expected? I built the project a few weeks ago and don’t recall seeing these errors. Is anyone else experiencing the same issue? Thanks in advance.

Re: [VOTE] SPIP: Support NanoSecond Timestamps

2025-03-27 Thread DB Tsai
Thanks!!! DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1 > On Mar 27, 2025, at 3:56 PM, Qi Tan wrote: > > Thanks DB, > > I just noticed a few more comments came in after I initiated the vote. I'm > going to postpone the voting process and address those outstanding comments. > >

Re: [VOTE] SPIP: Support NanoSecond Timestamps

2025-03-27 Thread Qi Tan
Thanks DB, I just noticed a few more comments came in after I initiated the vote. I'm going to postpone the voting process and address those outstanding comments. Qi Tan On Thu, Mar 27, 2025 at 15:12, DB Tsai wrote: > Hello Qi, > > I'm supportive of the NanoSecond Timestamps proposal; however, before we > in

Re: [VOTE] SPIP: Support NanoSecond Timestamps

2025-03-27 Thread DB Tsai
Hello Qi, I'm supportive of the NanoSecond Timestamps proposal; however, before we initiate the vote, there are a few outstanding comments in the SPIP document that haven't been addressed yet. Since the vote is on the document itself, could we resolve these items beforehand? For example: The d

Re: Revert of [SPARK-51229][BUILD][CONNECT] Fix dependency:analyze goal on connect common

2025-03-27 Thread Hyukjin Kwon
Vlad, let's open a PR and discuss it there. We have many other committers to review / help with as well. On Fri, Mar 28, 2025 at 6:28 AM Rozov, Vlad wrote: > Hi Hyukjin, > > I opened https://issues.apache.org/jira/browse/SPARK-51643 and > https://issues.apache.org/jira/browse/SPARK-51644. Please a

Re: Revert of [SPARK-51229][BUILD][CONNECT] Fix dependency:analyze goal on connect common

2025-03-27 Thread Rozov, Vlad
Hi Hyukjin, I opened https://issues.apache.org/jira/browse/SPARK-51643 and https://issues.apache.org/jira/browse/SPARK-51644. Please add more details to the first JIRA. As far as I can see https://github.com/vrozov/spark/tree/spark-shell should fix both JIRAs and if not I’d like to understand wh

Re: [VOTE] SPIP: Support NanoSecond Timestamps

2025-03-27 Thread huaxin gao
+1 On Thu, Mar 27, 2025 at 1:22 PM Qi Tan wrote: > Hi all, > > I would like to start a vote on adding support for nanoseconds timestamps. > > *Discussion thread: * > https://lists.apache.org/thread/y2vzrjl1499j5dvbpg3m81jxdhf4b6of > *SPIP:* > https://docs.google.com/document/d/1wjFsBdlV2YK75x7UO

[VOTE] SPIP: Support NanoSecond Timestamps

2025-03-27 Thread Qi Tan
Hi all, I would like to start a vote on adding support for nanoseconds timestamps. *Discussion thread: * https://lists.apache.org/thread/y2vzrjl1499j5dvbpg3m81jxdhf4b6of *SPIP:* https://docs.google.com/document/d/1wjFsBdlV2YK75x7UOk2HhDOqWVA0yC7iEiqOMnNnxlA/edit?usp=sharing *JIRA:* https://issue

Re: Re: [VOTE] SPIP: Constraints in DSv2

2025-03-27 Thread Anton Okolnychyi
Casting my own +1 (non-binding). Angel, I echo what Wenchen said. Connectors and Spark interact via DSv2, therefore it requires changes in that layer. It is going to be optional but will make a ton of sense for many connectors, especially in modern open table formats that decouple table metadata f

Re: Requesting advice, thought

2025-03-27 Thread Asif Shahid
Thanks for the reply. That helps. On Thu, Mar 27, 2025, 7:29 AM Wenchen Fan wrote: > The file source in Spark has not been migrated to DS v2 yet and uses > dedicated catalyst rules to do runtime filtering, e.g. PartitionPruning > and PlanDynamicPruningFilters > > On Thu, Mar 27, 2025 at 6:53 PM

Re: Requesting advice, thought

2025-03-27 Thread Wenchen Fan
The file source in Spark has not been migrated to DS v2 yet and uses dedicated catalyst rules to do runtime filtering, e.g. PartitionPruning and PlanDynamicPruningFilters On Thu, Mar 27, 2025 at 6:53 PM Asif Shahid wrote: > Hi Experts, > Could you please allow me to pick your brain on the follo

Re: Revert of [SPARK-51229][BUILD][CONNECT] Fix dependency:analyze goal on connect common

2025-03-27 Thread Mark Hamstra
Back in the very early days of Spark (before it was even an Apache Incubator project), Maven was clearly a more mature, capable and stable tool suite for building, testing and publishing JVM code, even Scala code, so some of the earliest commercial adopters of Spark relied upon Maven. It made sense

Requesting advice, thought

2025-03-27 Thread Asif Shahid
Hi Experts, Could you please allow me to pick your brain on the following: For Hive tables (managed), the scan operator is FileSourceScanExec. Is there any particular reason why its underlying HadoopFsRelation's field, FileFormat, does not implement an interface like SupportsRuntimeFiltering? Li

Re: Re: [VOTE] SPIP: Constraints in DSv2

2025-03-27 Thread beliefer
+1 At 2025-03-26 14:45:09, "Chao Sun" wrote: +1 On Tue, Mar 25, 2025 at 10:22 PM Ángel Álvarez Pascua wrote: I meant ... a data validation API would be great, but why in the DSv2? Isn't data validation something more general? Do we have to use DSv2 to have our data validated? On Wed, 26