Restructuring with your tip now, Michael, thank you! --- Oytun Tez
*M O T A W O R D* The World's Fastest Human Translation Platform. oy...@motaword.com — www.motaword.com On Fri, Feb 22, 2019 at 11:23 AM Michael Latta <mla...@technomage.com> wrote: > You may want to union the 3 streams prior to the process function if they > are independently processed. > > > Michael > > On Feb 22, 2019, at 9:15 AM, Oytun Tez <oy...@motaword.com> wrote: > > Hi everyone! > > I've been struggling with an implementation problem in the last days, > which I am almost sure caused by my misunderstanding of Flink. > > The purpose: consume multiple streams, update a score list (with meta data > e.g. user_id) for each update coming from any of the streams. The new > output list will also need to be used by another pattern. > > 1. We created 3 SourceFunctions, that periodically go to our MySQL > database and stream new results back. This one returns POJOs. > 2. Then we flatMap these streams to unify their Type. They are now all > Tuple3s with matching types. > 3. And we process each stream with the same ProcessFunction. > 4. I am stuck with the output list. > > Business case (human translation workflow): > > 1. Input: Stream "translation quality" score updates of each > translator [translator_id, score] > 2. Input: Stream "responsivity score" updates of each translator > (email open rates/speeds etc) [translator_id, score] > 3. Input: Stream "number of projects" updates each translator worked > on [translator_id, score] > 4. Calculation: for each translator, use 3 scores to come up with a > unified score and its percentile over all translators. This step definitely > feels like a Batch job, but I am pushing to go with a streaming mindset. > 5. So now supposedly, in this way or another, I have a list of > translators with their unified score and percentile over this list. > 6. Another independent stream should send me updates on "need for > proofreaders" – I couldn't even come to this point yet. Once a need info is > streamed, application would fetch the previously calculated list and let's > say picks the top X determined by the message from need algorithm. > > > <image.png> > > Overall, my desire is to make everything a stream and let the data and > decisions constantly react to stream updates. I am very confused at this > point. Tried using keyed and operator states, but they seem to be keeping > their state only for their own items. Considering to do Batch instead after > all the struggle. > > Any ideas? I can even get on a call. > > > > > > > > > > > > > > > --- > Oytun Tez > > *M O T A W O R D* > The World's Fastest Human Translation Platform. > oy...@motaword.com — www.motaword.com > >