Hi Avi,

you could use Flink's broadcast state pattern [1]. You would need to use
the DataStream API, but it lets you have two streams (an input and a
control stream) where the control stream is broadcast to all subtasks. By
ingesting messages into the control stream you can send model updates to
all subtasks.

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
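
To illustrate, a rough sketch of what this looks like with the DataStream
API (the `Event`, `ModelUpdate`, and `Result` types and the `apply` helper
are placeholders for your own code):

```java
// Descriptor for the broadcast state that holds the current model.
MapStateDescriptor<String, ModelUpdate> modelStateDescriptor =
    new MapStateDescriptor<>(
        "model-updates",
        BasicTypeInfo.STRING_TYPE_INFO,
        TypeInformation.of(new TypeHint<ModelUpdate>() {}));

// Broadcast the control stream to all parallel subtasks.
BroadcastStream<ModelUpdate> controlBroadcast =
    controlStream.broadcast(modelStateDescriptor);

DataStream<Result> output = inputStream
    .connect(controlBroadcast)
    .process(new BroadcastProcessFunction<Event, ModelUpdate, Result>() {

        @Override
        public void processElement(Event event, ReadOnlyContext ctx,
                                   Collector<Result> out) throws Exception {
            // Data-side elements get read-only access to the broadcast state.
            ModelUpdate model =
                ctx.getBroadcastState(modelStateDescriptor).get("current");
            out.collect(apply(model, event));
        }

        @Override
        public void processBroadcastElement(ModelUpdate update, Context ctx,
                                            Collector<Result> out) throws Exception {
            // Every subtask sees each control message and updates its state copy.
            ctx.getBroadcastState(modelStateDescriptor).put("current", update);
        }
    });
```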

Cheers,
Till

On Tue, Jan 1, 2019 at 6:49 PM miki haiat <miko5...@gmail.com> wrote:

> I'm trying to understand your use case.
> What is the source of the data? A file system, Kafka, or something else?
>
>
> On Tue, Jan 1, 2019 at 6:29 PM Avi Levi <avi.l...@bluevoyant.com> wrote:
>
>> Hi,
>> I have a list (a couple of thousand text lines) that I need to use in my
>> map function. I read this article about broadcast variables
>> <https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/#broadcast-variables>
>> and about
>> using the distributed cache
>> <https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/#distributed-cache>
>> however, I need to update this list from time to time, and if I understood
>> correctly that is not possible with broadcast variables or the cache without
>> restarting the job. Is there an idiomatic way to achieve this? A DB seems to
>> be overkill for that, and I want to keep io/network calls to a minimum.
>>
>> Cheers
>> Avi
>>
>>