[ 
https://issues.apache.org/jira/browse/FLINK-39059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-39059:
-----------------------------------
    Labels: pull-request-available  (was: )

> Add unified metrics support to AsyncPredictFunction and PredictFunction
> -----------------------------------------------------------------------
>
>                 Key: FLINK-39059
>                 URL: https://issues.apache.org/jira/browse/FLINK-39059
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: featzhang
>            Priority: Major
>              Labels: pull-request-available
>
> h3. Subtask: Add Built-in Metrics for Model Inference Functions
> *Description*
> Introduce unified, built-in metrics support for model inference in Flink by 
> enhancing both {{PredictFunction}} and {{AsyncPredictFunction}}. The goal 
> is to provide consistent observability for inference workloads without 
> requiring changes in individual model implementations.
> *Scope*
>  * Add common metrics instrumentation to the base inference function classes.
>  * Ensure both synchronous and asynchronous inference paths are covered.
>  * Automatically enable metrics for all existing and future model connectors 
> (e.g., OpenAI, Triton).
> *Metrics Included*
>  * {{inference_requests}}: Total number of inference requests.
>  * {{inference_requests_success}}: Number of successful inference 
> requests.
>  * {{inference_requests_failure}}: Number of failed inference requests.
>  * {{inference_latency}}: Histogram of inference latency in milliseconds.
>  * {{inference_rows_output}}: Total number of output rows produced by 
> inference.
> *Extensibility*
>  * Provide a {{createLatencyHistogram()}} hook method.
>  * Allow subclasses to customize latency histogram behavior (e.g., bucket 
> configuration).
> *Acceptance Criteria*
>  * Metrics are registered automatically without modifying existing model 
> implementations.
>  * Metrics are exposed consistently for both {{PredictFunction}} and 
> {{AsyncPredictFunction}}.
>  * No regression in existing inference functionality.
>  * Metric names and semantics are aligned with Flink metrics conventions.
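
The instrumentation described in the issue could be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code from the pull request: {{SketchCounter}}, {{SketchHistogram}}, {{MeteredPredictSketch}}, {{doPredict()}} and {{open()}} are hypothetical stand-ins for Flink's metric types and function lifecycle, and recording latency for failed requests as well as successful ones is an assumption. Only the five metric semantics and the {{createLatencyHistogram()}} hook come from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a counter metric (not Flink's Counter interface).
final class SketchCounter {
    private final AtomicLong count = new AtomicLong();
    void inc() { count.incrementAndGet(); }
    void inc(long n) { count.addAndGet(n); }
    long getCount() { return count.get(); }
}

// Hypothetical stand-in for a latency histogram; it simply stores every sample.
class SketchHistogram {
    private final List<Long> samples = new ArrayList<>();
    synchronized void update(long millis) { samples.add(millis); }
    synchronized int getCount() { return samples.size(); }
}

// Sketch of a base class that instruments predict() once for all subclasses.
abstract class MeteredPredictSketch<IN, OUT> {
    final SketchCounter requests = new SketchCounter();        // inference_requests
    final SketchCounter requestsSuccess = new SketchCounter(); // inference_requests_success
    final SketchCounter requestsFailure = new SketchCounter(); // inference_requests_failure
    final SketchCounter rowsOutput = new SketchCounter();      // inference_rows_output
    private SketchHistogram latency;                           // inference_latency

    // Hook from the issue: subclasses may override to customize the histogram.
    protected SketchHistogram createLatencyHistogram() {
        return new SketchHistogram();
    }

    // Assumed lifecycle method; must be called once before predict().
    void open() {
        latency = createLatencyHistogram();
    }

    SketchHistogram latencyHistogram() { return latency; }

    // Model-specific logic; implementations stay free of metrics code.
    protected abstract Collection<OUT> doPredict(IN input);

    // Wraps the model call with request, outcome, row, and latency metrics.
    final Collection<OUT> predict(IN input) {
        requests.inc();
        long startNanos = System.nanoTime();
        try {
            Collection<OUT> rows = doPredict(input);
            requestsSuccess.inc();
            rowsOutput.inc(rows.size());
            return rows;
        } catch (RuntimeException e) {
            requestsFailure.inc();
            throw e;
        } finally {
            // Assumption: latency is recorded for failures too, in milliseconds.
            latency.update((System.nanoTime() - startNanos) / 1_000_000L);
        }
    }
}

final class InferenceMetricsDemo {
    // Toy model: returns two rows per input and rejects empty input.
    static MeteredPredictSketch<String, String> newEchoModel() {
        MeteredPredictSketch<String, String> model =
                new MeteredPredictSketch<String, String>() {
                    @Override
                    protected Collection<String> doPredict(String input) {
                        if (input.isEmpty()) {
                            throw new IllegalArgumentException("empty input");
                        }
                        return Arrays.asList(input, input.toUpperCase());
                    }
                };
        model.open();
        return model;
    }

    public static void main(String[] args) {
        MeteredPredictSketch<String, String> model = newEchoModel();
        model.predict("hello");
        try {
            model.predict("");
        } catch (IllegalArgumentException expected) {
            // Counted as a failed request by the base class.
        }
        System.out.println("requests=" + model.requests.getCount()
                + " success=" + model.requestsSuccess.getCount()
                + " failure=" + model.requestsFailure.getCount()
                + " rows=" + model.rowsOutput.getCount()
                + " latencySamples=" + model.latencyHistogram().getCount());
    }
}
```

Because the counters live in the base class, the toy model above contains no metrics code at all, which is the property the acceptance criteria ask for. A connector that needs different histogram behavior would override {{createLatencyHistogram()}} and return its own histogram instance.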



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
