Hi Rusty,

> I'm thinking we might want to revisit the decision to exclude 24:00:00. If databases have data at 24:00:00 (i.e. an event happened at a leap second), that information would be lost if they serialized it to Arrow and then read the data back.

That's a fair point, but we'll have to make sure that any compute element that handles TIME data can handle those leap-second values, or at worst errors out gracefully (rather than, e.g., crashing or triggering undefined behavior).
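
To illustrate what "error out gracefully" could look like, here is a minimal sketch in plain Python rather than actual Arrow compute code. It assumes TIME values are stored as whole seconds since midnight with a valid range of 0 to 86399; `extract_hour` and `TimeOutOfRangeError` are hypothetical names made up for this example.

```python
SECONDS_PER_DAY = 86_400  # assumption: values are whole seconds since midnight


class TimeOutOfRangeError(ValueError):
    """Hypothetical error type for TIME values outside [0, 86400)."""


def extract_hour(values):
    """Return the hour component of each value, keeping None as null.

    Out-of-range input (such as 86400, i.e. 24:00:00) raises a clear error
    instead of silently producing an hour of 24 or corrupting later steps.
    """
    hours = []
    for index, seconds in enumerate(values):
        if seconds is None:
            hours.append(None)
            continue
        if not 0 <= seconds < SECONDS_PER_DAY:
            raise TimeOutOfRangeError(
                f"value {seconds} at index {index} is outside [0, {SECONDS_PER_DAY})"
            )
        hours.append(seconds // 3600)
    return hours


print(extract_hour([0, 3600, None, 86_399]))  # [0, 1, None, 23]
```

If 24:00:00 were allowed, each kernel of this kind would need an explicit decision for the value 86400 instead of the range check.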

Regards

Antoine.


On 25/02/2026 at 19:54, Rusty Conover wrote:
Hi Arrow Friends,

On the Arrow call today, I brought up an issue of the TIME type not supporting 
24:00:00.

I'm doing compatibility checking against various databases (but mostly DuckDB) 
and this complicates the ability to round-trip data through IPC and back while 
preserving full fidelity at the edges of the time type.  If a database contains 
a value of `24:00:00` and that value cannot be represented in Arrow, 
serializing to IPC and reading it back will necessarily lose information. My 
bias here is simple: users should be able to move their data through Arrow 
without alteration.
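
As a rough illustration of the fidelity loss described above, here is a small plain-Python sketch. It does not use any Arrow API. It assumes a seconds-resolution TIME value with a valid range of 0 to 86399, and it assumes a hypothetical ingest step (`ingest_clamped`) that clamps out-of-range values to the last representable second, which is only one of several possible workarounds.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def parse_time_of_day(text: str) -> int:
    """Convert 'HH:MM:SS' into whole seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_time_of_day(total_seconds: int) -> str:
    """Convert whole seconds since midnight back into 'HH:MM:SS'."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def ingest_clamped(text: str) -> int:
    """Hypothetical ingest step: clamp to the last representable second."""
    return min(parse_time_of_day(text), SECONDS_PER_DAY - 1)


original = "24:00:00"
round_tripped = format_time_of_day(ingest_clamped(original))
print(original, "->", round_tripped)  # 24:00:00 -> 23:59:59
```

After the clamp, a reader of the data cannot tell whether the source held `23:59:59` or `24:00:00`.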


24:00:00 is supported by these databases: PostgreSQL, DB2, Cockroach, Redshift, 
DuckDB, MySQL, MariaDB, TiDB, ClickHouse

It is not supported by: SQL Server, Snowflake, BigQuery, Firebird, H2.

You can find my analysis and links to the docs here: 
https://rusty.today/blog/time-data-type-compatibility-across-databases/

There were previous discussions (and votes) about this around 2021:

"Clarifying interpretation of Time32/Time64 past 24 hours" - 
https://lists.apache.org/thread/ckms85lx5649z84jqhtowl7zvr71kxkr

And there was a vote to exclude 24:00:00.

I don’t have a strong opinion about leap seconds specifically, but I do care 
deeply about data fidelity and avoiding needless precision loss in IPC. Given
the real-world database landscape, it may be time to reconsider whether 
excluding `24:00:00` best serves us all now.

Cheers,

Rusty
https://query.farm
