I hope this question makes sense; I’m kind of a newbie when it comes to reasoning about distributed systems.
Let’s say I have a consumer that needs to detect when a given message was delayed by some period of time (due to a network partition, producer errors, or whatever). By “delayed” I mean that, e.g., a producer learned of some real-world event that happened at time A and ideally would have sent a message to a Kafka topic within, say, 5 seconds of learning about that event, but because of an error ends up actually producing the message to Kafka 10 minutes later. I might have a consumer that needs to detect that delay, so that it knows the real-world event actually happened 10 minutes ago.
Is there a best practice or common pattern that the Kafka community uses for dealing with this sort of thing, something more sophisticated and robust than just comparing timestamps and hoping that the clocks of the producer(s) and consumer(s) are more or less in sync? E.g., vector clocks? (Hopefully something more accessible than atomic clocks and GPS.)
I guess what I’m concerned about is clock drift. Some things I’ve read lately have led me to think that perhaps I can’t really trust timestamps naively attached to messages by producers, because the various producers and consumers in a system could have clocks that are significantly divergent. (I’m working with a client that uses NTP to try to keep all node clocks in sync, but has experienced many problems with this approach.)
It’s entirely possible that I’m thinking about this all wrong, but if that’s the case I’d greatly appreciate being pointed in the right direction.
Thank you!
Avi
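P.S. To make the naive approach concrete, here is a minimal Python sketch of what I mean by "just comparing timestamps". The function names, the millisecond units, and the example numbers are all made up for illustration; nothing here is a Kafka API. It assumes the producer stamps each message with the event time from its own clock and the consumer compares that against its own clock. The check works when the clocks agree, and silently misses a 10-minute delay when the consumer's clock happens to run 10 minutes behind the producer's.

```python
# Hypothetical threshold: the "within 5 seconds" target from the question.
DELAY_THRESHOLD_MS = 5_000


def apparent_delay_ms(event_time_ms: int, consumer_now_ms: int) -> int:
    """Naive delay: the consumer's clock minus the producer-stamped event time."""
    return consumer_now_ms - event_time_ms


def looks_delayed(event_time_ms: int, consumer_now_ms: int,
                  threshold_ms: int = DELAY_THRESHOLD_MS) -> bool:
    """True if the message appears late according to the consumer's own clock."""
    return apparent_delay_ms(event_time_ms, consumer_now_ms) > threshold_ms


# The event happens at time A on the producer's clock, and the message
# reaches the consumer 10 minutes later in real time.
event_time_ms = 1_000_000
true_delay_ms = 10 * 60 * 1000

# Case 1: both clocks agree, so the consumer sees the full 10 minutes.
in_sync_now_ms = event_time_ms + true_delay_ms
print(looks_delayed(event_time_ms, in_sync_now_ms))  # True

# Case 2: the consumer's clock is 10 minutes behind the producer's,
# so the apparent delay collapses to zero and the check misses it.
skewed_now_ms = in_sync_now_ms - 10 * 60 * 1000
print(looks_delayed(event_time_ms, skewed_now_ms))  # False
```

The second case is exactly what worries me: the result depends entirely on how far apart the two clocks are, and the consumer has no way to tell from the message alone.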