Thanks for reviewing and providing the feedback.

> 1) Does it make sense to drop the *record *part from the metric name as it
doesn't seem to serve much purpose? I would rather call the metric as
*source-poll-errors-total

Yes, "records" is not needed and misleading.

> Staying on names, I am thinking, does it make more sense to have
*failures* in the name instead of *errors *i.e.*
source-poll-failures-total* and
*source-poll-failures-rate*? What do you think?

Agree, "failures" is a more appropriate term here.

> Regarding the inclusion of retriable exceptions, as of today, source
tasks don't retry even in cases of RetriableException. A PR was created to
modify this behaviour (https://github.com/apache/kafka/pull/13726) but the
reason I bring it up is that in that PR, the failures etc for retry context
would be computed from the RetryWithToleranceOperator. I am not sure when
would that get merged, but does it change the failure counting logic in any
ways?

In my opinion, we should ignore retryable exceptions when SourceTasks
switches to using RetryWithToleranceOperator. I can update the KIP to call
this out. If the PR for this KIP is implemented first, we can include both
retriable and non-retriable exceptions. I can also add a comment on
https://github.com/apache/kafka/pull/13726 to remove them. What do you
think?

Thank you


On Wed, Jun 28, 2023 at 1:09 PM Sagar <sagarmeansoc...@gmail.com> wrote:

> Hey Ravindra,
>
> Thanks for the KIP! It appears to be a useful addition to the metrics to
> understand poll related failures which can go untracked as of now. I just
> have a couple of minor comments:
>
> 1) Does it make sense to drop the *record *part from the metric name as it
> doesn't seem to serve much purpose? I would rather call the metric as
> *source-poll-errors-total
> *and *source-poll-errors-rate*.
> 2) Staying on names, I am thinking, does it make more sense to have
> *failures* in the name instead of *errors *i.e.*
> source-poll-failures-total* and
> *source-poll-failures-rate*? What do you think?
> 3) Regarding the inclusion of retriable exceptions, as of today, source
> tasks don't retry even in cases of RetriableException. A PR was created to
> modify this behaviour (https://github.com/apache/kafka/pull/13726) but the
> reason I bring it up is that in that PR, the failures etc for retry context
> would be computed from the RetryWithToleranceOperator. I am not sure when
> would that get merged, but does it change the failure counting logic in any
> ways?
>
> Thanks!
> Sagar.
>
>
> On Sun, Jun 25, 2023 at 12:40 AM Ravindra Nath Kakarla <
> ravindhran...@gmail.com> wrote:
>
> > Hello,
> >
> > I would like to start a discussion on KIP-933 to add new metrics to Kafka
> > Connect that helps  monitoring polling failures with source connectors.
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-933%3A+Publish+metrics+when+source+connector+fails+to+poll+data
> >
> > Looking forward to feedback on this.
> >
> > Thank you,
> > Ravindranath
> >
>

Reply via email to