Hi Hong,

Thanks for starting the discussion.

> The GET /checkpoints API seems to be using the cached version of the entire
> Execution graph (stale data), when it could just use the
> CheckpointStatsCache directly


CheckpointStatsCache is also populated using the "cached execution graph,"
so there is nothing to gain from the "staleness" pov; see
AbstractCheckpointHandler for more details.

> Anyone aware of a reason we don’t do this already?
>

The CheckpointStatsCache is populated lazily, on a request for a particular
checkpoint, so it might not have a full view. The data structure it uses is
also slightly different. Finally, the CheckpointStatsCache is meant for a
different purpose: keeping a particular checkpoint around while it's being
investigated, since it might otherwise expire. Using it for the "overview"
would break this.

> Configuration for web.refresh-interval controls both dashboard refresh rate
> and ExecutionGraph cache
>

This sounds reasonable as long as it falls back to "web.refresh-interval"
when not defined. For consistency reasons, it should also be named
"rest.cache-timeout".
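
To make the fallback concrete, here is a minimal sketch in plain Java. It is
not Flink code: the class name, the map-based configuration, the millisecond
unit, and the 3000 ms default are all assumptions made for illustration.

```java
import java.time.Duration;
import java.util.Map;

// Hypothetical helper, not part of Flink: resolves the REST cache timeout.
final class RestCacheTimeout {
    static final String NEW_KEY = "rest.cache-timeout";        // proposed option
    static final String FALLBACK_KEY = "web.refresh-interval"; // existing option
    // Assumed default (in milliseconds) when neither key is set.
    static final Duration DEFAULT = Duration.ofMillis(3000);

    // Prefer the new key, fall back to the old one, then to the default.
    static Duration resolve(Map<String, String> conf) {
        String raw = conf.get(NEW_KEY);
        if (raw == null) {
            raw = conf.get(FALLBACK_KEY);
        }
        return raw == null ? DEFAULT : Duration.ofMillis(Long.parseLong(raw));
    }
}
```

With this shape, existing deployments that only set "web.refresh-interval"
keep their current cache behavior until they opt into the new key.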


> Cache-Control on the HTTP headers.
>

In general, I'd be in favor of this ("rest.cache-timeout" would then need
to become "rest.default-cache-timeout"), but I need to see a detailed FLIP
because in my mind this could get quite complicated.
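
As a rough illustration of what such a FLIP would have to pin down, the
sketch below derives a Cache-Control value from a cache timeout. Everything
here is hypothetical (class name, method names, the per-endpoint override);
only the "max-age" and "no-cache" directives are standard HTTP.

```java
import java.time.Duration;

// Hypothetical helper, not part of Flink.
final class CacheControlHeader {
    // Use the per-endpoint timeout if given, else the default
    // (the value "rest.default-cache-timeout" would carry).
    static Duration effectiveTimeout(Duration perEndpoint, Duration defaultTimeout) {
        return perEndpoint != null ? perEndpoint : defaultTimeout;
    }

    // max-age is expressed in whole seconds, so sub-second (or non-positive)
    // timeouts are mapped to "no-cache" instead of "max-age=0".
    static String forTimeout(Duration timeout) {
        long seconds = timeout.getSeconds();
        return seconds <= 0 ? "no-cache" : "max-age=" + seconds;
    }
}
```

Even this small sketch raises questions (rounding, per-endpoint overrides,
what a client may assume about freshness), which is why I'd like to see the
details written down first.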

Best,
D.

On Fri, Jun 23, 2023 at 6:26 PM Teoh, Hong <lian...@amazon.co.uk.invalid>
wrote:

> Hi all,
>
> I have been looking at the Flink REST API implementation, and have some
> questions on potential improvements. Looking to gather some thoughts:
>
> 1. Only use what is necessary. The GET /checkpoints API seems to be using
> the cached version of the entire Execution graph (stale data), when it
> could just use the CheckpointStatsCache directly. I am thinking of doing
> this refactoring. Anyone aware of a reason we don’t do this already?
> 2. Configuration for web.refresh-interval controls both dashboard refresh
> rate and ExecutionGraph cache. I am thinking of introducing a new
> configuration, rest.cache.timeout.
> 3. Cache-Control on the HTTP headers. Seems like we are using caches in
> our REST endpoint. It would be a step in the right direction to introduce
> cache-control in our REST API headers, so that we can improve the
> programmatic access of the Flink REST API.
>
>
> Looking forward to hearing people’s thoughts.
>
> Regards,
> Hong
>
>
