Hi Si-li,

Couldn't you also add the timestamp as state to the source? The state
would then store the timestamp of the last emitted record.
It's nearly identical to your solution but would fit Flink's recovery model
much better.
If you want to go further back to account for the records that have
actually been processed in the join, you could also replay the data from <last
timestamp> - <some offset>.
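
A minimal sketch of the idea, in plain Java with no Flink dependency so that
it runs on its own. The class name TimestampTrackingSource, its method names,
and the offset value are made up for illustration. In a real job the field
would presumably be kept in operator state (for example through
CheckpointedFunction's snapshotState / initializeState), and how the restored
value is handed to the consumer depends on the connector.

```java
import java.util.List;

/**
 * Plain-Java model of a source whose only checkpointed state is the
 * timestamp of the last emitted record. All names here are hypothetical;
 * no Flink classes are involved.
 */
final class TimestampTrackingSource {
    // Hypothetical safety margin that is subtracted on restore.
    private final long replayOffsetMillis;
    // Timestamp of the most recently emitted record; -1 means "none yet".
    private long lastEmittedTimestamp = -1L;

    TimestampTrackingSource(long replayOffsetMillis) {
        if (replayOffsetMillis < 0) {
            throw new IllegalArgumentException("offset must be >= 0");
        }
        this.replayOffsetMillis = replayOffsetMillis;
    }

    /** Emits one record (the list stands in for the downstream collector). */
    void emit(long recordTimestamp, List<Long> out) {
        out.add(recordTimestamp);
        lastEmittedTimestamp = recordTimestamp;
    }

    /** The value that would be written into a checkpoint. */
    long snapshotState() {
        return lastEmittedTimestamp;
    }

    /** Start position after a restore, using this source's offset. */
    long startTimestampAfterRestore(long restoredTimestamp) {
        return replayStart(restoredTimestamp, replayOffsetMillis);
    }

    /** Computes <last timestamp> - <some offset>, clamped at zero. */
    static long replayStart(long lastTimestamp, long offsetMillis) {
        return Math.max(0L, lastTimestamp - offsetMillis);
    }
}
```

For example, with a last emitted timestamp of 1500 and an offset of 300,
reading would resume from 1200.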

On Mon, Nov 16, 2020 at 8:39 AM Si-li Liu <unix...@gmail.com> wrote:

> Thanks, I'll try it.
>
> Matthias Pohl <matth...@ververica.com> wrote on Sat, Nov 14, 2020 at 12:53 AM:
>
>> Hi Si-li,
>> To answer your initial question: theoretically, you could try using
>> co-location constraints to achieve this. But keep in mind that this might
>> lead to multiple join operators running in the same JVM, reducing the
>> amount of memory each operator can use.
>>
>> Best,
>> Matthias
>>
>> On Mon, Nov 9, 2020 at 4:23 AM Si-li Liu <unix...@gmail.com> wrote:
>>
>>> Thanks for your reply.
>>>
>>> It's a streaming job. The join operator does the join work. The join
>>> state is too large, so I don't want to keep it using the state mechanism
>>> that Flink provides, and I also don't need a very precise join. So I'd
>>> prefer to let the join operator calculate a backward timestamp as state;
>>> if the cluster restarts, the consumer can use setStartFromTimestamp to
>>> start from that timestamp.
>>>
>>> Now my problem is that the consumer can't read the state that the join
>>> operator has written, so I need a way to send a small message (a 64-bit
>>> long) from downstream to upstream. Redis may be a solution, but adding an
>>> external dependency is a secondary option if I can pass this message
>>> through memory.
>>>
>>>
>>> Chesnay Schepler <ches...@apache.org> wrote on Fri, Nov 6, 2020 at 7:06 AM:
>>>
>>>> It would be good if you could elaborate a bit more on your use-case.
>>>> Are you using batch or streaming? What kind of "message" are we talking
>>>> about? Why are you thinking of using a static variable, instead of just
>>>> treating this message as part of the data(set/stream)?
>>>>
>>>> On 11/5/2020 12:55 PM, Si-li Liu wrote:
>>>>
>>>> Currently I use Flink 1.9.1. What I actually want to do is send some
>>>> messages from downstream operators to upstream operators, for which I am
>>>> considering using a static variable.
>>>>
>>>> But that means I have to make sure that a single TaskManager process
>>>> always contains both operators. Can I use CoLocationGroup to solve this
>>>> problem? Or can anyone give me an example to demonstrate the usage of
>>>> CoLocationGroup?
>>>>
>>>> Thanks!
>>>> --
>>>> Best regards
>>>>
>>>> Sili Liu
>>>>
>>>>
>>>>
>>>
>>> --
>>> Best regards
>>>
>>> Sili Liu
>>>
>>
>
> --
> Best regards
>
> Sili Liu
>


-- 

Arvid Heise | Senior Java Developer
