Re: [DISCUSS] SPIP: Add the TIME data type

2025-02-23 Thread Mich Talebzadeh
Hi Max, I saw the vote on this topic so let us conclude this discussion. Actually Oracle does not support this TIME data type but Google BigQuery does support it. If I were going to argue in favour of having it in Spark SQL, I would consider the following as part of cost/benefit analysis: Perform

Re: [DISCUSS] SPIP: Add the TIME data type

2025-02-17 Thread Subhasis Mukherjee
Implementing the type seems a good proposal to handle the mentioned use cases, mainly the migration of data. Plenty of circuitous code can be written to handle such scenarios, but nothing beats a straightforward type implementation IMO. Thanks, Subhasis Mukherjee On Mon, Feb 17, 2025, 9:37 PM Max Gekk wro

Re: [DISCUSS] SPIP: Add the TIME data type

2025-02-17 Thread Max Gekk
Hello Mich, Thank you for the provided code, but it seems useless in the cases that I described above. No doubt that you can emulate the TIME type via STRING as well as other types. Let me highlight the cases when direct support of the new type by Spark SQL could be useful for users: 1. Load the T

Re: [DISCUSS] SPIP: Add the TIME data type

2025-02-13 Thread Mich Talebzadeh
Hm, I tried the attached code. This code tries to simulate handling TIME data in Spark using Parquet files. Since Spark does not support a direct TIME datatype, it follows these steps: - Stores time as a STRING in a Parquet file using PyArrow. - Reads the Parquet file using PyArrow, Pandas,
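The attached code is not part of this archive. As a rough illustration of the STRING-based emulation described above, the following minimal sketch uses only the Python standard library; it does not touch PyArrow, Pandas, or Spark, and the sample values and the helper name `parse_time_string` are hypothetical.

```python
from datetime import time


def parse_time_string(s: str) -> time:
    """Parse an 'HH:MM:SS' string into a datetime.time (hypothetical helper).

    Raises ValueError for out-of-range values such as '25:00:00'.
    """
    return time.fromisoformat(s)


# Hypothetical sample column: time-of-day values stored as plain strings.
rows = ["09:15:00", "23:59:59", "00:00:01"]

# Every comparison needs an explicit parse step, because the storage layer
# only knows these values as strings.
parsed_sorted = sorted(parse_time_string(s) for s in rows)
at_or_after_nine = [s for s in rows if parse_time_string(s) >= time(9, 0)]
```

The point of the sketch is the extra parsing and validation that the caller has to carry when the type system sees only strings.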

Re: [DISCUSS] SPIP: Add the TIME data type

2025-02-12 Thread Max Gekk
Hello Mich, > However, if you only need to work with time, you can do like below 1. Let's say a Spark SQL user would like to load TIME values stored in files in the parquet format which supports the TIME logical type https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#time. None
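For readers unfamiliar with the Parquet TIME logical type linked above: with the MICROS unit it annotates an int64 holding the number of microseconds since midnight. The sketch below, using only the Python standard library, shows that encoding and its inverse. The function names are hypothetical, and this is an illustration of the value representation, not Spark's or Parquet's implementation.

```python
from datetime import time

MICROS_PER_SECOND = 1_000_000
MICROS_PER_DAY = 24 * 60 * 60 * MICROS_PER_SECOND


def time_to_micros(t: time) -> int:
    """Encode a time of day as microseconds since midnight (hypothetical helper)."""
    seconds = (t.hour * 60 + t.minute) * 60 + t.second
    return seconds * MICROS_PER_SECOND + t.microsecond


def micros_to_time(micros: int) -> time:
    """Decode microseconds since midnight back into a datetime.time."""
    if not 0 <= micros < MICROS_PER_DAY:
        raise ValueError("value is outside a single day")
    seconds, microsecond = divmod(micros, MICROS_PER_SECOND)
    minutes, second = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    return time(hour, minute, second, microsecond)


encoded = time_to_micros(time(12, 34, 56, 789000))
decoded = micros_to_time(encoded)
```

Because the stored value is a plain integer, ordering and range filters work on it directly, without the string parsing that a STRING-based emulation requires.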

Re: [DISCUSS] SPIP: Add the TIME data type

2025-02-12 Thread Mich Talebzadeh
Not entirely convinced we need it! For example, Oracle does not have it. Oracle treats date and time as a single entity, as they are often used together in real-world applications. This approach simplifies many operations, such as sorting, filtering, and calculations involving both date and time. H

Re: [DISCUSS] SPIP: Add the TIME data type

2025-02-12 Thread Sakthi
Thanks for the proposal, Max. This looks very promising. I'd also be happy to contribute if it helps with task completion! Regards, Sakthi On Wed, Feb 12, 2025 at 10:36 AM Max Gekk wrote: > Hi Dongjoon, > > > According to SPIP, is this targeting Apache Spark 4.2.0? > > Some tasks could be done

Re: [DISCUSS] SPIP: Add the TIME data type

2025-02-12 Thread Max Gekk
Hi Dongjoon, > According to SPIP, is this targeting Apache Spark 4.2.0? Some tasks could be done in parallel, but if only one person will work on this sequentially, in the worst case it might be finished close to 4.2.0. Best regards, Max Gekk On Wed, Feb 12, 2025 at 5:48 PM Dongjoon Hyun wrote

Re: [DISCUSS] SPIP: Add the TIME data type

2025-02-12 Thread Dongjoon Hyun
According to SPIP, is this targeting Apache Spark 4.2.0? > Q7. How long will it take? > In total it might take around 9 months. Dongjoon. On 2025/02/12 09:38:56 Max Gekk wrote: > Hi All, > > I would like to propose a new data type TIME which represents only time > values without the date part

[DISCUSS] SPIP: Add the TIME data type

2025-02-12 Thread Max Gekk
Hi All, I would like to propose a new data type TIME which represents only time values without the date part, in contrast to TIMESTAMP_NTZ. The new type should improve: - migrations of SQL code from other DBMS where such type is supported - read/write it from/to data sources such as parquet - conform to