Yes, and again, we are not giving up. If and when we have a well-motivated use case for distributed tracing, we can try to figure out how to implement it.
Anyway, I will open the voting thread later today! Thanks for the comments.

Best,
Piotrek

On Wed, 4 Dec 2024 at 23:36, Roman Khachatryan <ro...@apache.org> wrote:

> Thanks for the pointers, I also took a look at the Otel API and also
> couldn't find the way to "compose" a context instead of propagating it
> (which IMO wouldn't be desirable in Flink).
> So your proposal makes sense to me - as it keeps it simple.
>
> Regards,
> Roman
>
>
> On Tue, Dec 3, 2024 at 12:30 PM Piotr Nowojski <pnowoj...@apache.org>
> wrote:
>
> > Hi Roman!
> >
> > > 1. Why do we give up on the idea of reporting child spans independently
> > > from the parent? I couldn't find much details in the Rejected
> > > Alternatives section
> >
> > We are not giving up on it. But the issue is how to connect related spans
> > if they are reported independently, potentially on different machines
> > after a job failover.
> >
> > This has been discussed in the past in the FLIP-384 thread [1]. In
> > particular please check this response from me [2].
> >
> > [1] https://lists.apache.org/thread/7lql5f5q1np68fw1wc9trq3d9l2ox8f4
> > [2] https://lists.apache.org/thread/cznt6rbncx1ydqcn13m52859qrggq1xg
> >
> > > 2. If at some point we come up with a way to address (1), then having a
> > > reference from child to parent would be more flexible? And probably not
> > > in the form of object reference, but just as a (String) identifier?
> >
> > I'm not sure. Maybe it will be a string identifier, maybe something else
> > that is less prone to errors? Exposing a "String" identifier, pushes the
> > problem to users. Also AFAIR it's not clear how to convert a custom
> > string into Otel's `Context` class. But I might be wrong with this last
> > one.
> >
> > Best,
> > Piotrek
> >
> >
> > On Thu, 14 Nov 2024 at 13:21, Roman Khachatryan <ro...@apache.org>
> > wrote:
> >
> > > Hi Piotr, thanks for the proposal,
> > >
> > > I see the need for reporting child spans, however I have a couple of
> > > questions about the proposed design:
> > >
> > > 1. Why do we give up on the idea of reporting child spans independently
> > > from the parent? I couldn't find much details in the Rejected
> > > Alternatives section
> > >
> > > 2. If at some point we come up with a way to address (1), then having a
> > > reference from child to parent would be more flexible? And probably not
> > > in the form of object reference, but just as a (String) identifier?
> > >
> > >
> > > Regards,
> > > Roman
> > >
> > >
> > > On Thu, Nov 7, 2024 at 2:41 PM Piotr Nowojski <pnowoj...@apache.org>
> > > wrote:
> > >
> > > > Hi all!
> > > >
> > > > I would like to open up for discussion a new FLIP-483 [1].
> > > >
> > > > Motivation
> > > > FLIP-384 [2] added trace/span reporting capability to Flink, which has
> > > > been used in a couple of places, like reporting checkpointing and
> > > > recovery processes.
> > > >
> > > > With flat/childless structure of spans it is difficult to accurately
> > > > report checkpointing or recovery. Single top level span for
> > > > checkpointing or recovery is currently aggregating some metrics, like
> > > > maximum and sum of how long did the state download/upload take.
> > > > However this hides some details, like how long each task and/or
> > > > subtask was downloading the state.
> > > >
> > > > In this FLIP we want to introduce a general mechanism for reporting
> > > > children spans.
> > > >
> > > > For more information please look into the FLIP-483 [1].
> > > >
> > > > I'm looking forward to your thoughts on this.
> > > >
> > > > Best,
> > > > Piotrek
> > > >
> > > > [1] https://cwiki.apache.org/confluence/x/4IyMEw
> > > > [2] https://cwiki.apache.org/confluence/x/TguZE