Hi everyone,
I started the voting thread [1]. Please cast your vote there or ask
additional questions here.
Best,
Arvid
[1]
https://lists.apache.org/thread.html/r70d321b6aa62ab4e31c8b73552b2de7846c4d31ed6f08d6541a9b36e%40%3Cdev.flink.apache.org%3E
On Fri, Jul 30, 2021 at 10:46 AM Becket Qin w
Hi Arvid,
I think it is OK to leave eventTimeFetchLag out of the scope of this FLIP
given that it may involve additional API changes.
5. RecordMetadata is currently not simplifying any code. By the current
> design RecordMetadata is a read-only data structure that is constant for
> all records in
Hi folks,
To move on with the FLIP, I will cut out eventTimeFetchLag out of scope and
go ahead with the remainder.
I will open a VOTE later to today.
Best,
Arvid
On Wed, Jul 28, 2021 at 8:44 AM Arvid Heise wrote:
> Hi Becket,
>
> I have updated the PR according to your suggestion (note that
Hi Becket,
I have updated the PR according to your suggestion (note that this commit
contains the removal of the previous approach) [1]. Here are my
observations:
1. Adding the type of RecordMetadata to emitRecord would require adding
another type parameter to RecordEmitter and SourceReaderBase. S
Hi Arvid,
Thanks for the patient discussion.
To explain a bit more on the metric being a global variable, I think in
general there are two ways to pass a value from one code block to another.
The first way is direct passing. That means the variable is explicitly
passed from one code block to anot
Hi Becket,
I still can't follow your view on the metric being a global variable or
your concern that it is confusing to users. Nevertheless, I like your
proposal with having an additional collect method.
I was thinking that
> SourceOutput is going to have an additional method of collect(T Record,
Regarding the generic type v.s. class/subclasses of Metadata.
I think generic types usually make sense if the framework/abstract class
itself does not look into the instances, but just pass them from one user
logic to another. Otherwise, interfaces or class/subclasses would be
preferred.
In our c
Hi Arvid,
> I'm not sure if I follow the global variable argument, could you
elaborate? Are you referring specifically to the SettableGauge? How is that
different from a Counter or Meter?
What I meant is that the fetch lag computing logic can either get the
information required from method argumen
The only histogram implementation available to use are those by
dropwizard, and they do some lock-free synchronization stuff that so far
we wanted to keep out of hot paths (this applis to both reading and
writing); we have however never made benchmarks.
But it is reasonable to assume that they a
Hi all,
@Steven Wu
> Regarding "lastFetchTime" latency metric, I found Gauge to be less
> informative as it only captures the last sampling value for each metric
> publish interval (e.g. 60s).
> * Can we make it a histogram? Histograms are more expensive though.
> * Timer [1, 2] is cheaper as it
Hey Chesnay,
I think I got what that method was designed for now. Basically the
motivation is to let the SourceOutput to report the eventTimeFetchLag for
users. At this point, the SourceOutput only has the EventTime, so this
method provides a way for the users to pass the FetchTime to the
SourceOu
Would it be easier to understand if the method would accept a Supplier
instead?
On 20/07/2021 05:36, Becket Qin wrote:
In that case, do we still need the metric here? It seems we are creating a
"global variable" which users may potentially use. I am wondering how much
additional convenience it
Hi Arvid and Chesnay,
Thanks for the explanation.
As outlined in the JavaDoc and in the draft PR [1], it's up to the user to
> implement it in a way that fetch time always corresponds to the latest
> polled record.
In that case, do we still need the metric here? It seems we are creating a
"globa
Thanks, Arvid!
+1 for SinkWriterMetricGroup. Sink is a little more tricky, because it can
have local committer (running on TM) or global committer (running on JM).
In the future, it is possible to add SinkCommitterMetricGroup or
SinkGlobalCommitterMetricGroup.
Regarding "lastFetchTime" latency me
Hi Steven,
I extended the FLIP and its draft PR to have a SourceReaderMetricGroup and
a SplitEnumeratorMetricGroup. I hope that it makes it clearer.
I'd like to address FLINK-21000 as part of the implementation but I'd keep
it out of the FLIP discussion.
Question: should we rename SinkMetricGroup
To avoid confusion, can we either rename "SourceMetricGroup" to "
SplitReaderMetricGroup" or add "Reader" to the setter method names?
Yes, we should add the "unassigned/pending splits" enumerator metric. I
tried to publish those metrics for IcebergSourceEnumerator and ran into an
issue [1]. I don
Hi Steven,
The semantics are unchanged compared to FLIP-33 [1] but I see your point.
In reality, pending records would be mostly for event storage systems
(Kafka, Kinesis, ...). Here, we would report the consumer lag effectively.
If consumer lag is more prominent, we could also rename it.
For pe
I am trying to understand what those two metrics really capture
> G setPendingBytesGauge(G pendingBytesGauge);
- use file source as an example, it captures the remaining bytes for
the current file split that the reader is processing? How would users
interpret or use this metric? enumera
Hi Becket,
I believe 1+2 has been answered by Chesnay already. Just to add to 2: I'm
not the biggest fan of reusing task metrics but that's what FLIP-33 and
different folks suggested. I'd probably keep task I/O metrics only for
internal things and add a new metric for external calls. Then, we coul
Re 1: We don't expose the reuse* methods, because the proposed
OperatorIOMetricGroup is a separate interface from the existing
implementations (which will be renamed and implement the new interface).
Re 2: Currently the plan is to re-use the "new" numByesIn/Out counters
for tasks ("new" becaus
Hi Arvid,
Thanks for the proposal. I like the idea of exposing concrete metric group
class so that users can access the predefined metrics.
A few questions are following:
1. When exposing the OperatorIOMetrics to the users, we are also exposing
the reuseInputMetricsForTask to the users. Should w
Dear devs,
As a continuation and generalization of FLIP-33 (Standardize Connector
Metrics) [1], we'd like to discuss how we actually expose the standardized
operator metrics to users in terms of changes to the API.
Please check out the FLIP [2] and provide feedback.
Best,
Arvid
[1]
https://cwi
22 matches
Mail list logo