Sounds like a good idea. because in the control stream the time doesn't
really matters. Thanks !!!

On Fri, Jan 4, 2019 at 11:13 AM David Anderson <da...@da-platform.com>
wrote:

> Another solution to the watermarking issue is to write an
> AssignerWithPeriodicWatermarks for the control stream that always returns
> Watermark.MAX_WATERMARK as the current watermark. This produces watermarks
> for the control stream that will effectively be ignored.
>
> On Thu, Jan 3, 2019 at 9:18 PM Avi Levi <avi.l...@bluevoyant.com> wrote:
>
>> Thanks for the tip Elias!
>>
>> On Wed, Jan 2, 2019 at 9:44 PM Elias Levy <fearsome.lucid...@gmail.com>
>> wrote:
>>
>>> One thing you must be careful of, is that if you are using event time
>>> processing, assuming that the control stream will only receive messages
>>> sporadically, is that event time will stop moving forward in the operator
>>> joining the streams while the control stream is idle.  You can get around
>>> this by using a periodic watermark extractor one the control stream that
>>> bounds the event time delay to processing time or by defining your own low
>>> level operator that ignores watermarks from the control stream.
>>>
>>> On Wed, Jan 2, 2019 at 8:42 AM Avi Levi <avi.l...@bluevoyant.com> wrote:
>>>
>>>> Thanks Till I will defiantly going to check it. just to make sure that
>>>> I got you correctly. you are suggesting the the list that I want to
>>>> broadcast will be broadcasted via control stream and it will be than be
>>>> kept in the relevant operator state correct ? and updates (CRUD) on that
>>>> list will be preformed via the control stream. correct ?
>>>> BR
>>>> Avi
>>>>
>>>> On Wed, Jan 2, 2019 at 4:28 PM Till Rohrmann <trohrm...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Avi,
>>>>>
>>>>> you could use Flink's broadcast state pattern [1]. You would need to
>>>>> use the DataStream API but it allows you to have two streams (input and
>>>>> control stream) where the control stream is broadcasted to all sub tasks.
>>>>> So by ingesting messages into the control stream you can send model 
>>>>> updates
>>>>> to all sub tasks.
>>>>>
>>>>> [1]
>>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.apache.org_projects_flink_flink-2Ddocs-2Dstable_dev_stream_state_broadcast-5Fstate.html&d=DwMFaQ&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=dpWtkT5FJRWFqDA3MAnB4-dRYGDQjgfQTYAocqGkRKo&m=u5UQh821Gau2wZ7S3M8IRmVpL5JxGADJaq_k7iq6sYo&s=uITdFlQPKLbqxkTux4nR21JhUpLIkS5Pdfi9D_ZSUwE&e=>
>>>>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html
>>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.apache.org_projects_flink_flink-2Ddocs-2Dstable_dev_stream_state_broadcast-5Fstate.html&d=DwQFaQ&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=dpWtkT5FJRWFqDA3MAnB4-dRYGDQjgfQTYAocqGkRKo&m=u5UQh821Gau2wZ7S3M8IRmVpL5JxGADJaq_k7iq6sYo&s=uITdFlQPKLbqxkTux4nR21JhUpLIkS5Pdfi9D_ZSUwE&e=>
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Tue, Jan 1, 2019 at 6:49 PM miki haiat <miko5...@gmail.com> wrote:
>>>>>
>>>>>> Im trying to understand  your  use case.
>>>>>> What is the source  of the data ? FS ,KAFKA else ?
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 1, 2019 at 6:29 PM Avi Levi <avi.l...@bluevoyant.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> I have a list (couple of thousands text lines) that I need to use in
>>>>>>> my map function. I read this article about broadcasting variables
>>>>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.apache.org_projects_flink_flink-2Ddocs-2Dstable_dev_batch_-23broadcast-2Dvariables&d=DwMFaQ&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=dpWtkT5FJRWFqDA3MAnB4-dRYGDQjgfQTYAocqGkRKo&m=u5UQh821Gau2wZ7S3M8IRmVpL5JxGADJaq_k7iq6sYo&s=U3vGeHdL9fGDfP0GNZUkGpSlcVLz9CNLg2MXNwHP0_M&e=>
>>>>>>>  or
>>>>>>> using distributed cache
>>>>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.apache.org_projects_flink_flink-2Ddocs-2Dstable_dev_batch_-23distributed-2Dcache&d=DwMFaQ&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=dpWtkT5FJRWFqDA3MAnB4-dRYGDQjgfQTYAocqGkRKo&m=u5UQh821Gau2wZ7S3M8IRmVpL5JxGADJaq_k7iq6sYo&s=m5IHbX1Dbz7AYERvVgyxKXmrUQQ06IkA4VCDllkR0HM&e=>
>>>>>>> however I need to update this list from time to time, and if I 
>>>>>>> understood
>>>>>>> correctly it is not possible on broadcast or cache without restarting 
>>>>>>> the
>>>>>>> job. Is there idiomatic way to achieve this? A db seems to be an 
>>>>>>> overkill
>>>>>>> for that and I do want to be cheap on io/network calls as much as 
>>>>>>> possible.
>>>>>>>
>>>>>>> Cheers
>>>>>>> Avi
>>>>>>>
>>>>>>>

Reply via email to