Hi Rui,

1) Flink should always fetch records from Kafka if there are any,
independent of the parallelism of the consumer. The only problem that could
appear is that if you set the parallelism higher than the number of
partitions, some of the source subtasks won't get a partition assigned and
therefore won't read any data from Kafka.
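
To illustrate the point about idle sources, here is a minimal sketch in plain Java. It assumes a simple round-robin distribution (partition id modulo parallelism), which is a simplification and not necessarily how the Flink Kafka consumer assigns partitions internally. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified illustration only; this is not Flink's actual assignment code. */
final class PartitionAssignmentSketch {

    /**
     * Returns, for each source subtask, the partition ids it would read,
     * assuming a plain round-robin rule: partition p goes to subtask p % parallelism.
     */
    static List<List<Integer>> assign(int numPartitions, int parallelism) {
        if (numPartitions < 0 || parallelism < 1) {
            throw new IllegalArgumentException("need numPartitions >= 0 and parallelism >= 1");
        }
        List<List<Integer>> bySubtask = new ArrayList<>();
        for (int subtask = 0; subtask < parallelism; subtask++) {
            bySubtask.add(new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            bySubtask.get(partition % parallelism).add(partition);
        }
        return bySubtask;
    }

    /** Counts the subtasks that end up without any partition. */
    static long idleSubtasks(int numPartitions, int parallelism) {
        return assign(numPartitions, parallelism).stream().filter(List::isEmpty).count();
    }

    public static void main(String[] args) {
        System.out.println(idleSubtasks(32, 1));  // 0: the single subtask reads all 32 partitions
        System.out.println(idleSubtasks(32, 32)); // 0: one partition per subtask
        System.out.println(idleSubtasks(32, 40)); // 8: subtasks 32..39 get nothing
    }
}
```

With 32 partitions, a source parallelism of 1 still covers every partition, so a parallelism of 1 on its own should not stop records from arriving.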

2) numRecordsOutPerSecond should give you a good value for the throughput.

3) Could you check in the logs whether the GraphiteReporter was started
properly? Maybe the reporter jar has not been put in the lib folder.
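
To rule out the Graphite side independently of Flink, one option is to send a single test datapoint by hand. The sketch below is plain Java and assumes Graphite's plaintext protocol (one line of the form "path value epoch-seconds") on the host and port from your configuration. The metric name flink.smoketest.value and the class name are made up for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Sends one hand-made datapoint to a Graphite plaintext listener. */
final class GraphiteSmokeTest {

    /** Formats one datapoint as "<path> <value> <epoch seconds>\n". */
    static String formatLine(String path, double value, long epochSeconds) {
        return path + " " + value + " " + epochSeconds + "\n";
    }

    /** Opens a TCP connection, writes the line, and closes the socket. */
    static void send(String host, int port, String line, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            OutputStream out = socket.getOutputStream();
            out.write(line.getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }

    public static void main(String[] args) {
        // Defaults taken from the quoted flink-conf.yaml; override via arguments.
        String host = args.length > 0 ? args[0] : "tracing115";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2003;
        // Hypothetical metric name, used only for this check.
        String line = formatLine("flink.smoketest.value", 1.0, System.currentTimeMillis() / 1000L);
        try {
            send(host, port, line, 3000);
            System.out.println("Sent: " + line.trim());
        } catch (IOException e) {
            System.out.println("Could not reach " + host + ":" + port + ": " + e.getMessage());
        }
    }
}
```

If the datapoint shows up in Graphite, the problem is more likely on the reporter side. In that case it is also worth comparing the configuration keys with the Flink documentation: as far as I know, only the list key is metrics.reporters, while the per-reporter settings use the singular prefix metrics.reporter.<name>. (for example metrics.reporter.varuy.host).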

Cheers,
Till

On Tue, Sep 25, 2018 at 11:30 AM Wang, Rui2 <rui2.w...@intel.com> wrote:

> Hi Till,
>
>
>
> I have integrated Kafka 2.0 with Flink 1.5.2, and it works well. Now I
> need to measure the performance of Flink, such as throughput vs. latency.
>
> My stream pipeline is: Kafka(source) -> flatMap -> map -> print(sink),
> with 32 Kafka partitions. I have run into some issues:
>
>
>
>   1). If I set the parallelism of every operator to 1, the job runs OK,
> but flatMap doesn’t receive any records from Kafka.
>
> When I set the parallelism of flatMap above 1, the pipeline works fine.
> Does Flink have to run in parallel when fetching records from Kafka?
>
>
>
>   2). Can I get the throughput from "numRecordsOutPerSecond"?
>
> Also, I think it may be right to override the open function in a class
> that extends RichFlatMapFunction.
>
>
>
>   3). Regarding the latency of an operator (e.g. flatMap), after setting
> env.getConfig().setLatencyTrackingInterval(1000), I can only get the
> latency through a URL like the one below:
>
> input :
> http://server:8081/jobs/jobid/metrics?get=latency.source_id.X1.source_subtask_index.22.operator_id.Y1.operator_subtask_index.22.latency_stddev
>
> result:
> [{"id":"latency.source_id.X1.source_subtask_index.22.operator_id.Y1.operator_subtask_index.22.latency_stddev","value":"0.3854355088676483"}]
>
>
>
> But this is a poor user experience, so I want a third-party tool such as
> Graphite to gather the values and display them. I have configured the
> metrics in flink-conf.yaml as listed below:
>
>
>
> metrics.reporters: varuy
>
> metrics.reporters.varuy.host: tracing115
>
> metrics.reporters.varuy.port: 2003
>
> metrics.reporters.varuy.class:
> org.apache.flink.metrics.graphite.GraphiteReporter
>
> metrics.reporters.varuy.interval: 1 SECONDS
>
> metrics.reporters.varuy.protocol: TCP
>
>
>
> When I run the command "lsof -i:2003", nothing is returned, so I think
> Graphite is not working.
>
> Could you give me some advice on how to continue when you’re free? I’m
> looking forward to your reply!
>
>
>
> Best Regards & Thanks
>
> Rui, Wang
>
