Re: Colocating Compute

Dawid Wysakowicz Thu, 30 Jul 2020 06:04:18 -0700

Hi Satyam,

I think you can use the InputSplitAssigner also for streaming pipelines
through an InputFormat. You can use
StreamExecutionEnvironment#createInput or for SQL you can write your
source according to the documentation here:
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/sourceSinks.html#dynamic-table-source


If you do not want to use an InputFormat I think there is no easy way to
do it now.

Best,

Dawid

On 29/07/2020 13:53, Satyam Shekhar wrote:
> Hello,
>
> I am using Flink v1.10 in a distributed environment to run SQL queries
> on batch and streaming data.
>
> In my setup, data is sharded and distributed across the cluster. Each
> shard receives streaming updates from some external source. I wish to
> minimize data movement during query evaluation for performance
> reasons. For that, I need some construct to advise Flink planner to
> bind splits (shard) to the host where it is located. 
>
> I have come across InputSplitAssigner which gives me levers to
> influence compute colocation for batch queries. Is there a way to do
> the same for streaming queries as well? 
>
> Regards,
> Satyam

signature.asc
Description: OpenPGP digital signature

Re: Colocating Compute

Reply via email to