Hi team,

I’m following up on my earlier email regarding the midnight-UTC timestamp serialization issue in Kafka Connect. Could you please advise whether this is expected behavior or whether I should open a JIRA ticket?

I have also signed up for JIRA, but I have not received any email about that either. If needed, could you please grant access or advise on the process?

On Sun, Nov 30, 2025 at 12:29 PM Vinayak Gaikwad < [email protected]> wrote:

> Hi,
>
> I am encountering an issue with how timestamp headers are serialized in
> Kafka Connect and would appreciate clarification on whether this is the
> intended behavior or a bug.
>
> Problem: Incorrect Serialization of Midnight UTC Timestamps
>
> When using connectors that produce timestamp headers (e.g., the MongoDB
> source connector), timestamps that fall exactly at midnight UTC are being
> serialized incorrectly, losing their time component. Timestamps at any
> other time of day are serialized correctly.
>
> Example:
> - Input Timestamp: 2025-01-01T00:00:00.000Z (midnight UTC)
> - Expected Serialized Value: "2025-01-01T00:00:00.000Z"
> - Actual Serialized Value: "2025-01-01" (time component lost)
>
> Analysis
>
> I traced this to the org.apache.kafka.connect.data.Values.dateFormatFor()
> method
> <https://github.com/apache/kafka/blob/0e1c6fb6bb503aeda27ce1d73cd827b7a227d769/connect/api/src/main/java/org/apache/kafka/connect/data/Values.java#L769-L777>,
> which determines the format based on the millisecond value:
>
>     // Simplified code for reference
>     if (value.getTime() % MILLIS_PER_DAY == 0) {
>         return DATE_FORMAT;      // "yyyy-MM-dd"
>     } else {
>         return TIMESTAMP_FORMAT; // "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
>     }
>
> For midnight UTC timestamps, the condition value.getTime() %
> MILLIS_PER_DAY == 0 is true, causing the method to return the DATE format,
> even if the value's schema is a Timestamp logical type. The issue is that
> dateFormatFor() is not schema-aware and makes an assumption based solely on
> the millisecond value.
>
> I have created a test case that reproduces and confirms this failure.
>
> Questions:
> 1. Is this behavior intended? Should timestamps at midnight UTC be
>    formatted differently than other timestamps?
> 2. Should the logical type be respected? When a value has
>    Timestamp.SCHEMA, should it always be formatted with the full timestamp
>    format (yyyy-MM-dd'T'HH:mm:ss.SSS'Z') regardless of the millisecond value?
>
> Environment
> - Kafka Version: 4.1.1
> - Component: connect-api module (org.apache.kafka.connect.data.Values)
>
> I would appreciate any guidance on whether this is expected behavior or if
> I should file a JIRA issue. I am happy to help with a patch if needed.
>
> Thank you for your time and for maintaining Kafka.
>
> Best regards,
> Vinayak Gaikwad.
>
> Thanks, Vinayak Gaikwad
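
P.S. For quick reference, the value-based check quoted in the Analysis section above can be reproduced in isolation, without a Connect runtime. The following is only a minimal standalone sketch, not the actual Kafka source: the class name MidnightFormatDemo and the helpers patternFor() and format() are hypothetical, and the two pattern strings are copied from the simplified snippet above.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical demo class; it mimics the simplified check from the quoted
// email and does not call into org.apache.kafka.connect.data.Values.
class MidnightFormatDemo {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;
    static final String DATE_PATTERN = "yyyy-MM-dd";
    static final String TIMESTAMP_PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'";

    // Picks the pattern from the millisecond value alone (no schema involved).
    static String patternFor(Date value) {
        return value.getTime() % MILLIS_PER_DAY == 0 ? DATE_PATTERN : TIMESTAMP_PATTERN;
    }

    // Formats the value in UTC using whichever pattern the check selected.
    static String format(Date value) {
        SimpleDateFormat fmt = new SimpleDateFormat(patternFor(value));
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(value);
    }

    public static void main(String[] args) {
        Date midnight = Date.from(Instant.parse("2025-01-01T00:00:00.000Z"));
        Date oneMsLater = Date.from(Instant.parse("2025-01-01T00:00:00.001Z"));
        System.out.println(format(midnight));   // 2025-01-01
        System.out.println(format(oneMsLater)); // 2025-01-01T00:00:00.001Z
    }
}
```

In this sketch the two inputs differ by a single millisecond, yet only the midnight value loses its time component. A schema-aware variant would presumably choose the pattern from the logical type instead of the value, which is what question 2 above asks about.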
