Please check the logs for warnings. It could be that a metric registered by a job is throwing exceptions.
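
If the log is large, a small filter along these lines can pull out just the warnings. This is only a sketch: it assumes the log line layout quoted further down in this thread (timestamp, level, thread, logger), and the function name find_warnings is made up for illustration.

```python
import re

# Assumed layout, taken from the log lines quoted in this thread:
# 2022-04-20 13:46:39,597 INFO  [main] o.a.f.r.metrics.MetricRegistryImpl:? - ...
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\s+(?P<level>[A-Z]+)\s"
)


def find_warnings(lines, levels=("WARN", "ERROR")):
    """Return the log lines whose level is one of `levels`."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") in levels:
            hits.append(line.rstrip("\n"))
    return hits
```

Passing an open file works, e.g. find_warnings(open("taskmanager.log")). Stack trace lines that follow a WARN entry do not start with a timestamp, so they are skipped; the matching entry still tells you where to look in the file.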

On 20/04/2022 18:45, Peter Schrott wrote:
Hi huweihua,

Just to confirm: did you try with 1.15? None of the RCs are working for me.

This port is definitely free, as it was already used on the same hosts with Flink 1.14.4. And as I said, when no job is running on the taskmanager, it does report metrics on that port. I only get the "empty response" when a job is running on the taskmanager I am querying. Did you also run a job, and could you access metrics like flink_taskmanager_job_*?
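
To make the comparison between the two taskmanagers repeatable, the check I do with curl could be scripted roughly as follows. This is a sketch, not anything Flink ships: fetch_metrics and metric_names are hypothetical helper names, and the parsing only assumes the usual Prometheus text format (comment lines starting with "#", then name{labels} value).

```python
import http.client


def fetch_metrics(host="localhost", port=4444, path="/", timeout=5.0):
    """Return the response body as text, or None on an empty reply.

    None corresponds to what curl reports as "(52) Empty reply from server":
    the server accepted the connection but closed it without answering.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.read().decode("utf-8", errors="replace")
    except (http.client.RemoteDisconnected, ConnectionResetError):
        return None
    finally:
        conn.close()


def metric_names(exposition_text, prefix=""):
    """Return the sorted, distinct metric names that start with `prefix`."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like: name{label="v"} 1.0   or   name 1.0
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)
```

On a healthy taskmanager, metric_names(fetch_metrics(), "flink_taskmanager_job_") should list the job metrics; on the one showing the problem, fetch_metrics() would return None.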

The logs only tell me that everything is working fine:
2022-04-20 13:46:39,597 INFO  [main] o.a.f.r.metrics.MetricRegistryImpl:? - Reporting metrics for reporter prom of type org.apache.flink.metrics.prometheus.PrometheusReporter.
and
2022-04-20 12:12:26,394 INFO  [main] o.a.f.m.p.PrometheusReporter:? - Started PrometheusReporter HTTP server on port 4444

Best & thanks,
Peter


On Wed, Apr 20, 2022 at 6:30 PM huweihua <huweihua....@gmail.com> wrote:

    Hi, Peter
    I have not been able to reproduce this problem.

    From your description, it is possible that the specified port 4444
    is already in use by another process, so PrometheusReporter
    failed to start.
    You can confirm this from taskmanager.log, or check whether port 4444
    on the host is being listened on by the TaskManager process.


    On 20 April 2022, at 10:48 PM, Peter Schrott <pe...@bluerootlabs.io> wrote:

    Hi Flink-Users,

    After upgrading to Flink 1.15 (rc3) (coming from 1.14) I noticed
    that there is a problem with the metrics exposed through the
    PrometheusReporter.

    It is configured as follows in the flink-config.yml:
    metrics.reporters: prom
    metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
    metrics.reporter.prom.port: 4444

    My cluster is running in standalone mode with 2 taskmanagers and
    2 jobmanagers.

    More specifically:

    On the taskmanager that runs a job I get "curl: (52) Empty reply
    from server" when I call "curl localhost:4444". I was looking for
    the metrics in the namespace flink_taskmanager_job_*, which are
    only - and obviously - exposed on the taskmanager running a job.

    On the other taskmanager, which runs no job, I get a response with a
    couple of metrics in the namespace flink_taskmanager_Status, as
    expected.

    When I configure the JMXReporterFactory too, I find the
    desired metrics and all others via VisualVM on the
    taskmanager running the job. Also in the Flink web UI, in the
    "Jobs -> Overview -> Metrics" part, I can select and visualize
    metrics like flink_taskmanager_job_task_busyTimeMsPerSecond.

    Does anyone have an idea what's going on here, or can maybe even
    confirm my findings?

    Best & thanks,
    Peter

