Hi Gyula,

Thanks a lot. That really helps a lot!
Have a great vacation.

Cheers

On Fri, Aug 21, 2015 at 7:47 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:

Hi,

You are right, if all operators need continuous updates then the most straightforward way is to push all the updates to the operators like you described.

If the cached data is the same for all operators and is small enough, you can centralize the updates in a dedicated operator and push the updated data to the operators every once in a while.

Cheers
Gyula

On Thu, Aug 20, 2015 at 4:31 PM Welly Tambunan <if05...@gmail.com> wrote:

Hi Gyula,

I have a couple of operators in the pipeline: filter, mapper, and flatMap. Each of these operators contains some cached data.

So I think that means that for every other operator in the pipeline, I will need to add a new stream to update each operator's cached data.

Cheers

On Thu, Aug 20, 2015 at 5:33 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:

Hi,

I don't think I fully understand your question, could you please try to be a little more specific?

I assume by caching you mean that you keep the current model as operator state. Why would you need to add new streams in this case?

I might be slow to answer as I am currently on vacation without a stable internet connection.

Cheers,
Gyula

On Thu, Aug 20, 2015 at 5:36 AM Welly Tambunan <if05...@gmail.com> wrote:

Hi Gyula,

I have another question. If I cache something in an operator, will I always need to add and connect another stream of changes to that operator to keep the cache up to date?

Is this right for every case?

Cheers

On Wed, Aug 19, 2015 at 3:21 PM, Welly Tambunan <if05...@gmail.com> wrote:

Hi Gyula,

That's really helpful. The docs have improved so much since the last time (0.9).

Thanks a lot!

Cheers

On Wed, Aug 19, 2015 at 3:07 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:

Hey,

If it is always better to check the events against a more up-to-date model (even if the events we are checking arrived before the update), then it is fine to keep the model outside of the system.

In this case we need to make sure that we can push the updates to the external system consistently. If you are using the PersistentKafkaSource, for instance, it can happen that some messages are replayed in case of failure. In this case you need to make sure that you remove duplicate updates or have idempotent updates.

You can read about the checkpoint mechanism on the Flink website: https://ci.apache.org/projects/flink/flink-docs-master/internals/stream_checkpointing.html

Cheers,
Gyula
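To make the idempotent-update point above concrete, here is a minimal sketch. The ModelStoreClient interface, the ModelUpdate type, and the version field are hypothetical placeholders (none of them come from the thread or from Flink); the point is that a replayed update carries a version the store has already seen, so re-applying it has no effect.

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    // Minimal placeholder for a model update event; a real job would use its own type.
    class ModelUpdate {
        public String modelId;
        public long version;
        public String payload;
    }

    // Hypothetical client for the external cache; stands in for Redis, Memcached, etc.
    interface ModelStoreClient {
        long getVersion(String modelId);
        void put(String modelId, long version, String payload);
    }

    public class IdempotentModelWriter extends RichMapFunction<ModelUpdate, ModelUpdate> {

        private transient ModelStoreClient store;

        @Override
        public void open(Configuration parameters) throws Exception {
            // Wire up the real cache client here; left abstract in this sketch.
            store = createClient();
        }

        @Override
        public ModelUpdate map(ModelUpdate update) {
            // Only write updates that are newer than what the store already holds,
            // so messages replayed after a failure are skipped instead of reapplied.
            if (update.version > store.getVersion(update.modelId)) {
                store.put(update.modelId, update.version, update.payload);
            }
            return update;
        }

        private ModelStoreClient createClient() {
            throw new UnsupportedOperationException("connect your external cache client here");
        }
    }

A MapFunction is used only so the update can also be passed downstream; a plain sink would work just as well if nothing else consumes the updates.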
On Wed, Aug 19, 2015 at 9:56 AM Welly Tambunan <if05...@gmail.com> wrote:

Thanks Gyula,

Another question I have:

> ... while external model updates would be *tricky* to keep consistent.

Is that still the case if the operator treats the external model as read-only? We would create another stream that updates the external model separately.

Could you please elaborate more on this one?

Cheers

On Wed, Aug 19, 2015 at 2:52 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:

In that case I would apply a map to wrap them in some common type, like an Either<T1, T2>, before the union.

And then in the coflatmap you can unwrap it.

On Wed, Aug 19, 2015 at 9:50 AM Welly Tambunan <if05...@gmail.com> wrote:

Hi Gyula,

Thanks.

However, updates1 and updates2 have different types. Based on my understanding, I don't think we can use union. How can we handle this one?

We like to keep our events strongly typed so the domain language is captured.

Cheers

On Wed, Aug 19, 2015 at 2:37 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:

Hey,

One input of your co-flatmap would be model updates and the other input would be events to check against the model, if I understand correctly.

This means that if your model updates come from more than one stream, you need to union them into a single stream before connecting them with the event stream and applying the co-flatmap.

DataStream updates1 = ....
DataStream updates2 = ....
DataStream events = ....

events.connect(updates1.union(updates2).broadcast()).flatMap(...)

Does this answer your question?

Gyula
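Putting the pieces of the thread together, here is a minimal, self-contained sketch of this pattern: wrap the two differently typed update streams into a common envelope (playing the role of the Either<T1, T2> mentioned above), union and broadcast them, and keep the current model as state inside the co-flatmap. The type names (ThresholdUpdate, RuleUpdate, ModelUpdate) and the sample values are illustrative placeholders, not anything from the thread.

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
    import org.apache.flink.util.Collector;

    public class ModelValidationSketch {

        // Two differently typed update events (placeholders).
        public static class ThresholdUpdate {
            public double value;
            public ThresholdUpdate() {}
            public ThresholdUpdate(double value) { this.value = value; }
        }

        public static class RuleUpdate {
            public String rule;
            public RuleUpdate() {}
            public RuleUpdate(String rule) { this.rule = rule; }
        }

        // Common envelope so both update streams can be unioned
        // (an Either-style wrapper; only one of the fields is set).
        public static class ModelUpdate {
            public Double threshold;
            public String rule;
        }

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<String> events = env.fromElements("event-1", "event-2");
            DataStream<ThresholdUpdate> thresholds = env.fromElements(new ThresholdUpdate(0.8));
            DataStream<RuleUpdate> rules = env.fromElements(new RuleUpdate("allow-all"));

            // Wrap each update type into the common envelope before the union.
            DataStream<ModelUpdate> u1 = thresholds.map(new MapFunction<ThresholdUpdate, ModelUpdate>() {
                @Override
                public ModelUpdate map(ThresholdUpdate t) {
                    ModelUpdate u = new ModelUpdate();
                    u.threshold = t.value;
                    return u;
                }
            });
            DataStream<ModelUpdate> u2 = rules.map(new MapFunction<RuleUpdate, ModelUpdate>() {
                @Override
                public ModelUpdate map(RuleUpdate r) {
                    ModelUpdate u = new ModelUpdate();
                    u.rule = r.rule;
                    return u;
                }
            });

            // Broadcast the unioned updates so every parallel instance of the
            // co-flatmap sees every model change; the events stay partitioned.
            events.connect(u1.union(u2).broadcast())
                    .flatMap(new CoFlatMapFunction<String, ModelUpdate, String>() {

                        // The "model", kept as plain operator state inside the function.
                        private double threshold = 0.0;
                        private String rule = "";

                        @Override
                        public void flatMap1(String event, Collector<String> out) {
                            // Check the incoming event against the current model.
                            out.collect(event + " checked against threshold=" + threshold + ", rule=" + rule);
                        }

                        @Override
                        public void flatMap2(ModelUpdate update, Collector<String> out) {
                            // Apply the model update; nothing is emitted downstream.
                            if (update.threshold != null) { threshold = update.threshold; }
                            if (update.rule != null) { rule = update.rule; }
                        }
                    })
                    .print();

            env.execute("Model validation sketch");
        }
    }

Note that plain instance fields like these are not checkpointed automatically; if the model has to survive failures, it needs to be registered as state (for example via the Checkpointed interface in the Flink versions discussed in this thread).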
On Wednesday, August 19, 2015, Welly Tambunan <if05...@gmail.com> wrote:

Hi Gyula,

Thanks for your response.

However, the model can receive multiple events for updates. How can we do that with a co-flatmap, as I can see the connect API only receives a single datastream?

> ... while external model updates would be tricky to keep consistent.

Is that still the case if the operator treats the external model as read-only? We would create another stream that updates the external model separately.

Cheers

On Wed, Aug 19, 2015 at 2:05 PM, Gyula Fóra <gyf...@apache.org> wrote:

Hey!

I think it is safe to say that the best approach in this case is creating a co-flatmap that will receive updates on one input. The updates should probably be broadcast in this case so you can check the events in parallel.

This approach can be used effectively with Flink's checkpoint mechanism, while external model updates would be tricky to keep consistent.

Cheers,
Gyula

On Wed, Aug 19, 2015 at 8:44 AM Welly Tambunan <if05...@gmail.com> wrote:

Hi All,

We have a streaming computation that is required to validate the data stream against a model provided by the user.

Right now what I have done is to load the model into the Flink operator and then validate against it. However, the model can be updated and changed frequently. Fortunately, we always publish this event to RabbitMQ.

I think we can:

1. Create a RabbitMQ listener for the model-changed event inside the operator, then update the model when an event arrives. But I think this will create a race condition if not handled correctly, and it seems odd to keep this.

2. Move the model into an external in-memory cache storage and keep the model up to date using Flink, so the operator will retrieve it from the memory cache.

3. Create two streams and use a co-operator to manage the shared state.

What is your suggestion for keeping the state up to date from external events? Is there some kind of best practice for keeping a model up to date in a streaming operator?

Thanks a lot

Cheers

--
Welly Tambunan
Triplelands

http://weltam.wordpress.com
http://www.triplelands.com <http://www.triplelands.com/blog/>
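For completeness, a sketch of option 3 from the original question, using the model-changed events that are already published to RabbitMQ as the update stream. The host, the queue name ("model-updates"), and the string payload are assumptions; the RMQSource constructor and import paths shown match the 0.9.x-era flink-streaming-connectors API and differ in later releases, so check the connector docs for your version.

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
    import org.apache.flink.streaming.connectors.rabbitmq.RMQSource;
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema;
    import org.apache.flink.util.Collector;

    public class RabbitMqModelUpdateSketch {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // The data stream to validate (placeholder source).
            DataStream<String> events = env.fromElements("event-1", "event-2");

            // Model-changed events consumed from RabbitMQ as plain strings.
            DataStream<String> modelUpdates = env.addSource(
                    new RMQSource<String>("localhost", "model-updates", new SimpleStringSchema()));

            // Same shape as the co-flatmap discussed above: updates are broadcast,
            // the latest model snapshot is kept as state, events are checked against it.
            events.connect(modelUpdates.broadcast())
                    .flatMap(new CoFlatMapFunction<String, String, String>() {

                        private String currentModel = "";   // latest model pushed from RabbitMQ

                        @Override
                        public void flatMap1(String event, Collector<String> out) {
                            out.collect(event + " validated against model [" + currentModel + "]");
                        }

                        @Override
                        public void flatMap2(String update, Collector<String> out) {
                            currentModel = update;   // replace the cached model, emit nothing
                        }
                    })
                    .print();

            env.execute("RabbitMQ model update sketch");
        }
    }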