Hi Piotr, thanks for the proposal.

I see the need for reporting child spans. However, I have a couple of
questions about the proposed design:

1. Why do we give up on the idea of reporting child spans independently
from the parent? I couldn't find many details on this in the Rejected
Alternatives section.

2. If at some point we come up with a way to address (1), wouldn't a
reference from child to parent be more flexible? Probably not in the
form of an object reference, but just as a (String) identifier.
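
To make (2) a bit more concrete, here is a rough sketch of what I have in
mind. All names in it (SpanByIdSketch, SpanRecord, parentId,
childrenByParent) are hypothetical; none of them come from FLIP-483 or from
the existing Flink span API. It only assumes that each span carries its own
String id plus an optional parent id, so that spans can be reported in any
order and grouped later by whoever consumes them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: hypothetical types, not the Flink or FLIP-483 API. */
public class SpanByIdSketch {

    /** A flat span that points to its parent by String id instead of by object reference. */
    public static final class SpanRecord {
        public final String id;
        public final String parentId; // null for a top-level span
        public final String name;
        public final long startTsMillis;
        public final long endTsMillis;

        public SpanRecord(String id, String parentId, String name,
                          long startTsMillis, long endTsMillis) {
            this.id = id;
            this.parentId = parentId;
            this.name = name;
            this.startTsMillis = startTsMillis;
            this.endTsMillis = endTsMillis;
        }
    }

    /**
     * Groups independently reported spans by the id of their parent.
     * The order in which the spans were reported does not matter.
     */
    public static Map<String, List<SpanRecord>> childrenByParent(List<SpanRecord> reported) {
        Map<String, List<SpanRecord>> result = new HashMap<>();
        for (SpanRecord span : reported) {
            if (span.parentId != null) {
                result.computeIfAbsent(span.parentId, k -> new ArrayList<>()).add(span);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Children are reported before the parent span has even finished.
        List<SpanRecord> reported = Arrays.asList(
                new SpanRecord("cp-42/subtask-0", "cp-42", "StateUpload", 100L, 250L),
                new SpanRecord("cp-42/subtask-1", "cp-42", "StateUpload", 100L, 400L),
                new SpanRecord("cp-42", null, "Checkpoint", 90L, 420L));

        Map<String, List<SpanRecord>> tree = childrenByParent(reported);
        System.out.println("children of cp-42: " + tree.get("cp-42").size());
    }
}
```

With something like this, a subtask could emit its span as soon as its part
of the work is done, and the parent would not need to collect the child
objects before being reported.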


Regards,
Roman


On Thu, Nov 7, 2024 at 2:41 PM Piotr Nowojski <pnowoj...@apache.org> wrote:

> Hi all!
>
> I would like to open up for discussion a new FLIP-483 [1].
>
> Motivation
> FLIP-384 [2] added trace/span reporting capability to Flink, which has been
> used in a couple of places, like reporting checkpointing and recovery
> processes.
>
> With the flat/childless structure of spans, it is difficult to accurately
> report checkpointing or recovery. A single top-level span for checkpointing
> or recovery currently aggregates some metrics, like the maximum and sum of
> how long the state download/upload took. However, this hides some details,
> like how long each task and/or subtask spent downloading the state.
>
> In this FLIP we want to introduce a general mechanism for reporting
> child spans.
>
> For more information, please see FLIP-483 [1].
>
> I'm looking forward to your thoughts on this.
>
> Best,
> Piotrek
>
> [1] https://cwiki.apache.org/confluence/x/4IyMEw
> [2] https://cwiki.apache.org/confluence/x/TguZE
>
