Hi Max,
I saw the vote on this topic, so let us conclude this discussion.
Oracle does not support a TIME data type, but Google BigQuery does.
If I were going to argue in favour of having it in Spark SQL, I would
consider the following as part of a cost/benefit analysis:
Perform
Implementing the type seems like a good proposal for handling the mentioned
use cases, mainly data migration.
Much circuitous code can be written to handle such scenarios, but nothing
beats a straightforward type implementation, IMO.
Thanks,
Subhasis Mukherjee
On Mon, Feb 17, 2025, 9:37 PM Max Gekk wrote:
Hello Mich,
Thank you for the provided code, but it seems useless in the cases that I
described above. No doubt that you can emulate the TIME type via STRING as
well as other types. Let me highlight the cases when direct support of the
new type by Spark SQL could be useful for users:
1. Load the T
Hm, I tried the attached code. It simulates handling TIME
data in Spark using Parquet files. Since Spark does not support a direct
TIME data type, it follows these steps:
- Stores time as a STRING in a Parquet file using PyArrow.
- Reads the Parquet file using PyArrow, Pandas,
Hello Mich,
> However, if you only need to work with time, you can do like below
1. Let's say a Spark SQL user would like to load TIME values stored in
files in the Parquet format, which supports the TIME logical type:
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#time.
Not entirely convinced we need it!
For example, Oracle does not have it. Oracle treats date and time as a
single entity, as they are often used together in real-world applications.
This approach simplifies many operations, such as sorting, filtering, and
calculations involving both date and time. H
Thanks for the proposal, Max. This looks very promising. I'd also be happy
to contribute if it helps with task completion!
Regards,
Sakthi
On Wed, Feb 12, 2025 at 10:36 AM Max Gekk wrote:
> Hi Dongjoon,
>
> > According to SPIP, is this targeting Apache Spark 4.2.0?
>
> Some tasks could be done
Hi Dongjoon,
> According to SPIP, is this targeting Apache Spark 4.2.0?
Some tasks could be done in parallel, but if only one person works on
this sequentially, in the worst case it might be finished close to 4.2.0.
Best regards,
Max Gekk
On Wed, Feb 12, 2025 at 5:48 PM Dongjoon Hyun wrote:
According to SPIP, is this targeting Apache Spark 4.2.0?
> Q7. How long will it take?
> In total it might take around 9 months.
Dongjoon.
On 2025/02/12 09:38:56 Max Gekk wrote:
> Hi All,
>
> I would like to propose a new data type TIME which represents only time
> values without the date part
Hi All,
I would like to propose a new data type, TIME, which represents only time
values without the date part, in contrast to TIMESTAMP_NTZ. The new type
should improve:
- migration of SQL code from other DBMSs where such a type is supported
- reading/writing TIME values from/to data sources such as Parquet
- conform to
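The distinction the proposal draws can be illustrated with a tiny sketch using the Python stdlib `datetime` module as a stand-in for the SQL types:

```python
import datetime

# A TIMESTAMP_NTZ-like value carries both a date and a time, no time zone.
ts_ntz = datetime.datetime(2025, 2, 12, 9, 38, 56)

# The proposed TIME type would carry only the time-of-day part.
time_only = datetime.time(9, 38, 56)

# Today, users emulate TIME by extracting the time part of a timestamp;
# a native type would make the intent explicit in the schema itself.
assert ts_ntz.time() == time_only
print(time_only.isoformat())  # 09:38:56
```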