Hi Rusty,
> I'm thinking we might want to revisit the decision to exclude
24:00:00. Because if databases have data at 24:00:00 (i.e. an event
happened at a leap second) that information would be lost if they
serialized it to Arrow then read the data back.
That's a fair point, but we'll have to make sure that any compute
element that handles TIME data is able to handle those leap-second
values, or at worse error out gracefully (rather than e.g. crash or UB).
Regards
Antoine.
Le 25/02/2026 à 19:54, Rusty Conover a écrit :
Hi Arrow Friends,
On the Arrow call today, I brought up an issue of the TIME type not supporting
24:00:00.
I'm doing compatibility checking against various databases (but mostly DuckDB)
and this complicates the ability to round-trip data through IPC and back while
preserving full fidelity at the edges of the time type. If a database contains
a value of `24:00:00` and that value cannot be represented in Arrow,
serializing to IPC and reading it back will necessarily lose information. My
bias here is simple: users should be able to move their data through Arrow
without alteration.
24:00:00 is supported by these databases: PostgreSQL, DB2, Cockroach, Redshift,
DuckDB, MySQL, MariaDB, TiDB, ClickHouse
It is not supported by: SQL Server, Snowflake, BigQuery, Firebird, H2.
You can find my analysis and links to the docs here:
https://rusty.today/blog/time-data-type-compatibility-across-databases/
There have been previous discussions (and votes) about this ~2021:
"Clarifying interpretation of Time32/Time64 past 24 hours" -
https://lists.apache.org/thread/ckms85lx5649z84jqhtowl7zvr71kxkr
And there was a vote to exclude 24:00:00.
I don’t have a strong opinion about leap seconds specifically, but I do care
deeply about data fidelity and avoiding avoidable precision loss in IPC. Given
the real-world database landscape, it may be time to reconsider whether
excluding `24:00:00` best serves us all now.
Cheers,
Rusty
https://query.farm