Hi team,

I’m following up on my earlier email regarding the midnight-UTC timestamp
serialization issue in Kafka Connect. Could you please advise whether this
is expected behavior or if I should open a JIRA ticket?
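
For reference, here is a minimal, self-contained sketch of the behavior
described in my original message below. It does not call Kafka:
formatByValue is a stand-in that mirrors the simplified check from my
analysis, and formatAsTimestamp is a hypothetical schema-aware alternative.
The class name, method names, and the explicit UTC time zone are my own
assumptions for illustration, not the actual Values implementation.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Illustrative sketch only; not the code in org.apache.kafka.connect.data.Values.
final class MidnightFormatSketch {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;
    static final String DATE_PATTERN = "yyyy-MM-dd";
    static final String TIMESTAMP_PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'";

    // Stand-in for the value-based check: the pattern is chosen from the
    // millisecond value alone, with no knowledge of the logical type.
    static String formatByValue(Date value) {
        String pattern = (value.getTime() % MILLIS_PER_DAY == 0)
                ? DATE_PATTERN
                : TIMESTAMP_PATTERN;
        return utcFormat(pattern).format(value);
    }

    // Hypothetical schema-aware alternative: the caller already knows the
    // value is a timestamp, so the full pattern is always used.
    static String formatAsTimestamp(Date value) {
        return utcFormat(TIMESTAMP_PATTERN).format(value);
    }

    // Assumption: format in UTC so the output does not depend on the JVM zone.
    private static SimpleDateFormat utcFormat(String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    public static void main(String[] args) {
        Date midnight = Date.from(Instant.parse("2025-01-01T00:00:00.000Z"));
        Date justAfter = new Date(midnight.getTime() + 1);

        System.out.println(formatByValue(midnight));     // 2025-01-01
        System.out.println(formatByValue(justAfter));    // 2025-01-01T00:00:00.001Z
        System.out.println(formatAsTimestamp(midnight)); // 2025-01-01T00:00:00.000Z
    }
}
```

The first line of output shows the lost time component; the last line is
what I would expect for a value whose schema is a Timestamp.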

I have also signed up for a JIRA account, but I have not received any
email about the registration yet. If needed, could you please grant me
access or advise on the process?



On Sun, Nov 30, 2025 at 12:29 PM Vinayak Gaikwad <
[email protected]> wrote:

> Hi,
> I am encountering an issue with how timestamp headers are serialized in
> Kafka Connect and would appreciate clarification on whether this is the
> intended behavior or a bug.
>
> Problem: Incorrect Serialization of Midnight UTC Timestamps
> When using connectors that produce timestamp headers (e.g., the MongoDB
> source connector), timestamps that fall exactly at midnight UTC are being
> serialized incorrectly, losing their time component. Timestamps at any
> other time of day are serialized correctly.
> Example:
> - Input Timestamp: 2025-01-01T00:00:00.000Z (midnight UTC)
> - Expected Serialized Value: "2025-01-01T00:00:00.000Z"
> - Actual Serialized Value: "2025-01-01" (time component lost)
>
> Analysis
> I traced this to org.apache.kafka.connect.data.Values.dateFormatFor()
> <https://github.com/apache/kafka/blob/0e1c6fb6bb503aeda27ce1d73cd827b7a227d769/connect/api/src/main/java/org/apache/kafka/connect/data/Values.java#L769-L777>,
> which determines the format based on the millisecond value:
>
> // Simplified code for reference
> if (value.getTime() % MILLIS_PER_DAY == 0) {
>     return DATE_FORMAT; // "yyyy-MM-dd"
> } else {
>     return TIMESTAMP_FORMAT; // "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
> }
>
> For midnight UTC timestamps, the condition value.getTime() %
> MILLIS_PER_DAY == 0 is true, causing the method to return the DATE format,
> even if the value's schema is a Timestamp logical type. The issue is that
> dateFormatFor() is not schema-aware and makes an assumption based solely on
> the millisecond value.
> I have created a test case that reproduces and confirms this failure.
>
> Questions:
> 1. Is this behavior intended? Should timestamps at midnight UTC be
> formatted differently than other timestamps?
> 2. Should the logical type be respected? When a value has
> Timestamp.SCHEMA, should it always be formatted with the full timestamp
> format (yyyy-MM-dd'T'HH:mm:ss.SSS'Z') regardless of the millisecond value?
>
> Environment
> - Kafka Version: 4.1.1
> - Component: connect-api module (org.apache.kafka.connect.data.Values)
>
> I would appreciate any guidance on whether this is expected behavior or if
> I should file a JIRA issue. I am happy to help with a patch if needed.
> Thank you for your time and for maintaining Kafka.
>
> Best regards,
> Vinayak Gaikwad.
>


Thanks,
Vinayak Gaikwad
