Re: [DISCUSS] Data Type framework

2025-09-11 Thread Holden Karau
+1 Twitter: https://twitter.com/holdenkarau Fight Health Insurance: https://www.fighthealthinsurance.com/ Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 YouTube Live Streams: https://www.yo

Re: [DISCUSS] Data Type framework

2025-09-11 Thread Dongjoon Hyun
Sounds like a great plan! Thank you. +1 for the refactoring. Dongjoon. On Thu, Sep 11, 2025 at 1:04 PM Max Gekk wrote: > Hello Dongjoon, > > > can we do this migration safely in a step-by-step manner over multiple > Apache Spark versions without blocking any Apache Spark releases? > > Sure, we

Re: [DISCUSS] Data Type framework

2025-09-11 Thread Max Gekk
Hello Dongjoon, > can we do this migration safely in a step-by-step manner over multiple Apache Spark versions without blocking any Apache Spark releases? Sure, we can start from the TIME type and refactor the existing pattern matches. After that I would support new features of TIME using the f

Re: [DISCUSS] Data Type framework

2025-09-11 Thread Dongjoon Hyun
Thank you for sharing the direction, Max. Since this is internal refactoring, can we do this migration safely in a step-by-step manner over multiple Apache Spark versions without blocking any Apache Spark releases? The proposed direction itself looks reasonable and doable to me. Thanks, Dongj

Re: [DISCUSS] Data Type framework

2025-09-10 Thread serge rielau . com
I think this is a great idea. There is a significant backlog of types which should be added (e.g. TIMESTAMP(9), TIMESTAMP WITH TIME ZONE, TIME WITH TIME ZONE, and some sort of big decimal, to name a few). Making these more "plug and play" is goodness. +1 On Sep 10, 2025, at 1:22 PM, Max Gekk wrote: H

[DISCUSS] Data Type framework

2025-09-10 Thread Max Gekk
Hi All, I would like to propose a refactoring of internal operations over Catalyst's data types. In the current implementation, data types are handled in an ad hoc manner, and processing logic is dispersed across the entire code base. There are more than 100 places where every data type is pattern m
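
For readers skimming the thread, the contrast under discussion can be illustrated with a small sketch. This is not Spark or Catalyst code: the classes `IntType`, `TimeType`, `TypeOps`, the `REGISTRY` table, and the sizes and literal formats are all hypothetical, chosen only to show the difference between matching on every type inside each operation and looking up per-type operations in one place.

```python
from dataclasses import dataclass
from datetime import time
from typing import Any, Callable, Dict


# Hypothetical stand-ins for data types; these are NOT Spark classes.
@dataclass(frozen=True)
class IntType:
    pass


@dataclass(frozen=True)
class TimeType:
    precision: int = 6


# Ad hoc style: every operation repeats its own match over all types,
# so adding a type means touching each of these functions.
def default_size_adhoc(dt: Any) -> int:
    if isinstance(dt, IntType):
        return 4
    if isinstance(dt, TimeType):
        return 8  # assumed size, for illustration only
    raise TypeError(f"unsupported type: {dt!r}")


def to_sql_literal_adhoc(dt: Any, value: Any) -> str:
    if isinstance(dt, IntType):
        return str(value)
    if isinstance(dt, TimeType):
        return f"TIME '{value.isoformat()}'"
    raise TypeError(f"unsupported type: {dt!r}")


# Framework style: the per-type behaviour lives in one object per type,
# and callers go through a single lookup instead of matching themselves.
@dataclass(frozen=True)
class TypeOps:
    default_size: int
    to_sql_literal: Callable[[Any], str]


REGISTRY: Dict[type, TypeOps] = {
    IntType: TypeOps(4, lambda v: str(v)),
    TimeType: TypeOps(8, lambda v: f"TIME '{v.isoformat()}'"),
}


def ops_for(dt: Any) -> TypeOps:
    """Return the operations registered for the given type instance."""
    try:
        return REGISTRY[type(dt)]
    except KeyError:
        raise TypeError(f"unsupported type: {dt!r}") from None


if __name__ == "__main__":
    t = TimeType()
    print(ops_for(t).default_size)
    print(ops_for(t).to_sql_literal(time(12, 30)))
```

In the second style, supporting a new type is one new `REGISTRY` entry rather than an edit to every function that matches on types, which is the "plug and play" property mentioned in the replies.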