I'm considering distributed tracing in the context of Flink. I have the following questions:
1. How to implement tracing internally, within the Flink pipeline itself? That is, how to propagate the tracing context between the different operators, from the sources to the sinks?
2. How to glue things together at the edges? That is, how to extract the context when reading and inject it when writing?

For 2 in particular I only need to support Kafka sources and sinks. I guess the typical approach would be to use Kafka headers for that, as described in [1]. This also has the advantage that it does not require any changes to the payload schemas.

More generally, are there any (Flink-specific) libraries/integrations available that facilitate the task at hand, e.g. by decorating transformations with tracing capabilities as done in [2]? For what it's worth, I'm mostly interested in solutions based on OpenTracing and/or OpenTelemetry. To make the questions more concrete, I've included rough sketches of what I currently have in mind for both at the end of this post.

---

[1] https://docs.immerok.cloud/docs/how-to-guides/development/reading-apache-kafka-headers-with-apache-flink/
[2] https://github.com/opentracing-contrib/java-kafka-client

---

Posted on SO too:
- https://stackoverflow.com/questions/76394899/best-practices-for-distributed-tracing-in-flink
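
---

For 1, the most promising idea I have so far is to carry the serialized W3C trace context (`traceparent`/`tracestate`) inside the records themselves, and to restore it around each transformation. Below is a minimal sketch with OpenTelemetry; it assumes the SDK is already initialized on the task managers (e.g. via the Java agent), and the `TracedEvent` envelope and the operator/span names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Context;
import io.opentelemetry.context.Scope;
import io.opentelemetry.context.propagation.TextMapGetter;
import io.opentelemetry.context.propagation.TextMapSetter;

/** Hypothetical envelope: the payload plus the serialized trace context. */
public class TracedEvent {
    public String payload;
    public Map<String, String> traceContext = new HashMap<>(); // e.g. "traceparent" -> "00-..."
}

/** A map operator that runs its transformation inside a span parented on the
 *  context carried by the incoming record, then re-injects the new context. */
class TracedUppercase extends RichMapFunction<TracedEvent, TracedEvent> {

    private static final TextMapGetter<Map<String, String>> GETTER =
            new TextMapGetter<Map<String, String>>() {
                @Override public Iterable<String> keys(Map<String, String> carrier) { return carrier.keySet(); }
                @Override public String get(Map<String, String> carrier, String key) { return carrier.get(key); }
            };
    private static final TextMapSetter<Map<String, String>> SETTER = Map::put;

    private transient Tracer tracer; // not serializable, so obtained in open()

    @Override
    public void open(Configuration parameters) {
        tracer = GlobalOpenTelemetry.getTracer("flink-pipeline");
    }

    @Override
    public TracedEvent map(TracedEvent event) {
        // Rebuild the upstream context from the carrier travelling with the record.
        Context parent = GlobalOpenTelemetry.getPropagators().getTextMapPropagator()
                .extract(Context.current(), event.traceContext, GETTER);

        Span span = tracer.spanBuilder("uppercase").setParent(parent).startSpan();
        try (Scope ignored = span.makeCurrent()) {
            event.payload = event.payload.toUpperCase(); // the actual transformation
        } finally {
            span.end();
        }

        // Re-inject so the next operator (or the sink) continues the same trace.
        GlobalOpenTelemetry.getPropagators().getTextMapPropagator()
                .inject(Context.current().with(span), event.traceContext, SETTER);
        return event;
    }
}
```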
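For 2, the header-based approach from [1] would then boil down to copying the W3C headers between the Kafka records and the envelope at the edges. Another sketch, this time against the `flink-connector-kafka` source/sink API; the topic name is a placeholder and `TracedEvent` is the envelope from the previous sketch:

```java
import java.nio.charset.StandardCharsets;

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.flink.util.Collector;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;

/** Source side: copy the W3C trace headers from the Kafka record into the envelope. */
public class TracedDeserializer implements KafkaRecordDeserializationSchema<TracedEvent> {

    @Override
    public void deserialize(ConsumerRecord<byte[], byte[]> record, Collector<TracedEvent> out) {
        TracedEvent event = new TracedEvent();
        event.payload = new String(record.value(), StandardCharsets.UTF_8); // assumes non-null values
        for (String key : new String[] {"traceparent", "tracestate"}) {
            Header header = record.headers().lastHeader(key);
            if (header != null) {
                event.traceContext.put(key, new String(header.value(), StandardCharsets.UTF_8));
            }
        }
        out.collect(event);
    }

    @Override
    public TypeInformation<TracedEvent> getProducedType() {
        return TypeInformation.of(TracedEvent.class);
    }
}

/** Sink side: write the envelope's trace context back out as Kafka headers. */
class TracedSerializer implements KafkaRecordSerializationSchema<TracedEvent> {

    @Override
    public ProducerRecord<byte[], byte[]> serialize(
            TracedEvent event, KafkaSinkContext context, Long timestamp) {
        ProducerRecord<byte[], byte[]> record = new ProducerRecord<>(
                "output-topic", event.payload.getBytes(StandardCharsets.UTF_8)); // placeholder topic
        event.traceContext.forEach((k, v) -> record.headers().add(k, v.getBytes(StandardCharsets.UTF_8)));
        return record;
    }
}
```

These would be plugged in via `KafkaSource.builder().setDeserializer(...)` and `KafkaSink.builder().setRecordSerializer(...)`. Note the deserializer copies the headers verbatim rather than going through the propagator, which should work because the carrier uses the same W3C keys. Is this roughly the intended way to do it, or is there an existing integration that handles this?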