Hi David, interesting use case, I think, this can be nicely done with a comap. Let me know if you run into problems, unfortunately I am not aware of any open source examples.
Cheers, Konstnatin On 22.03.2016 21:07, David Brelloch wrote: > Konstantin, > > For now the jobs will largely just involve incrementing or decrementing > based on the json message coming in. We will probably look at adding > windowing later but for now that isn't a huge priority. > > As an example of what we are looking to do lets say the following > 3 message were read from the kafka topic: > {"customerid": 1, "event": "addAdmin"} > {"customerid": 1, "event": "addAdmin"} > {"customerid": 1, "event": "deleteAdmin"} > > If the customer with id of 1 had said they care about that type of > message we would expect to be tracking the number of admins and notify > them that they currently have 2. The events are obviously much more > complicated than that and they are not uniform but that is the general > overview. > > I will take a look at using the comap operator. Do you know of any > examples where it is doing something similar? Quickly looking I am not > seeing it used anywhere outside of tests where it is largely just > unifying the data coming in. > > I think accumulators will at least be a reasonable starting place for us > so thank your for pointing me in that direction. > > Thanks for your help! > > David > > On Tue, Mar 22, 2016 at 3:27 PM, Konstantin Knauf > <konstantin.kn...@tngtech.com <mailto:konstantin.kn...@tngtech.com>> wrote: > > Hi David, > > I have no idea how many parallel jobs are possible in Flink, but > generally speaking I do not think this approach will scale, because you > will always only have one job manager for coordination. But there is > definitely someone on the list, who can tell you more about this. > > Regarding your 2nd question. Could you go into some more details, what > the jobs will do? Without knowing any details, I think a control kafka > topic which contains the "job creation/cancellation requests" of the > users in combination with a comap-operator is the better solution here. > You could keep the currently active "jobs" as state in the comap and and > emit one record of the original stream per active user-job together with > some indicator on how to process it based on the request. What are your > concerns with respect to insight in the process? I think with some nice > accumulators you could get a good idea of what is going on, on the other > hand if I think about monitoring 1000s of jobs I am actually not so > sure ;) > > Cheers, > > Konstantin > > On 22.03.2016 19 <tel:22.03.2016%2019>:16, David Brelloch wrote: > > Hi all, > > > > We are currently evaluating flink for processing kafka messages and are > > running into some issues. The basic problem we are trying to solve is > > allowing our end users to dynamically create jobs to alert based off the > > messages coming from kafka. At launch we figure we need to support at > > least 15,000 jobs (3000 customers with 5 jobs each). I have the example > > kafka job running and it is working great. The questions I have are: > > > > 1. On my local machine (admittedly much less powerful than we > would be > > using in production) things fall apart once I get to around 75 jobs. > > Can flink handle a situation like this where we are looking at > > thousands of jobs? > > 2. Is this approach even the right way to go? Is there a different > > approach that would make more sense? Everything will be listening to > > the same kafka topic so the other thought we had was to have 1 job > > that processed everything and was configured by a separate control > > kafka topic. The concern we had there was we would almost completely > > lose insight into what was going on if there was a slow down. > > 3. The current approach we are using for creating dynamic jobs is > > building a common jar and then starting it with the configuration > > data for the individual job. Does this sound reasonable? > > > > > > If any of these questions are answered elsewhere I apologize. I > couldn't > > find any of this being discussed elsewhere. > > > > Thanks for your help. > > > > David > > -- > Konstantin Knauf * konstantin.kn...@tngtech.com > <mailto:konstantin.kn...@tngtech.com> * +49-174-3413182 > <tel:%2B49-174-3413182> > TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring > Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke > Sitz: Unterföhring * Amtsgericht München * HRB 135082 > > -- Konstantin Knauf * konstantin.kn...@tngtech.com * +49-174-3413182 TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke Sitz: Unterföhring * Amtsgericht München * HRB 135082