Are you able to advise any further, Timo? Thanks!

On 2020/02/04 16:10:04, Stephen Young <wintersg...@googlemail.com> wrote:
> Hi Timo,
>
> Thanks for replying to me so quickly!
>
> We could do it with insert-only rows. When you say flags in the data do you
> mean a field with a name like 'retracts' and then the value of that field is
> the id of the event/row we want to retract? How would that be possible with
> Flink?
>
> Thanks!
>
> On 2020/02/04 15:27:20, Timo Walther <twal...@apache.org> wrote:
> > Hi Stephan,
> >
> > the use cases you are describing sound like a perfect fit to Flink.
> > Internally, Flink deals with insertions and deletions that are flowing
> > through the system and can update chained aggregations and complex queries.
> >
> > The only bigger limitation at the moment is that we only support sources
> > that emit insert-only rows. The community is currently working on
> > designing how we expose the internal changelog processing capabilities
> > through our APIs.
> >
> > However, your use case might also work with insert-only rows and a query
> > based on the flags in the data, correct?
> >
> > Regards,
> > Timo
> >
> >
> > On 04.02.20 16:14, Stephen Young wrote:
> > > I am currently looking into how Flink can support a live data collection
> > > platform. We want to collect certain data in real-time. This data will be
> > > sent to Kafka and we want to use Flink to calculate statistics and
> > > derived events from it.
> > >
> > > An important thing we need to be able to handle is amendment or deletion
> > > events. For example, we may get an event that someone has performed an
> > > action and from this we'd calculate how many of these actions they had
> > > taken in total. We'd also build calculations on top of that, for example
> > > top 10 rankings by these counts, or arbitrarily many layers of
> > > calculations beyond that. But sometime later (this could be a few seconds
> > > or a week) we receive an amendment event to that action. This indicates
> > > that the action was taken by a different person or from a different
> > > location. We then need Flink to recalculate all of our downstream stats
> > > i.e. the counts need to be changed and rankings need to be adjusted.
> > >
> > > From my research into Flink I can see there is a page about Dynamic
> > > Tables and also there was some stuff about retraction support for the
> > > Table/SQL API. But it seems like this is simply how Flink models changes
> > > to aggregated data. I would like to be able to do something like
> > > calculate a count from a set of events each with their own id, then
> > > retract one of those events by its id and have the count automatically
> > > change.
> > >
> > > Is anything like this achievable with Flink? Thanks!
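
For illustration only, here is a minimal sketch of the "insert-only rows plus a flag" idea discussed in this thread. It is plain Python, not Flink code, and it does not claim to show how Flink implements or exposes this. The field names `id`, `user` and `retracts`, and the helper names `latest_versions`, `counts_per_user` and `top_n`, are assumptions made up for the example. The assumed convention is that a row whose `retracts` field is set cancels the earlier row with that id; if the row also carries a `user`, it acts as the amended version, otherwise it is a pure deletion.

```python
from collections import Counter

def latest_versions(events):
    """Collapse an insert-only log into the rows that are still valid.

    Assumes events arrive in order and that 'retracts' (hypothetical field)
    holds the id of an earlier row to cancel.
    """
    live = {}
    for e in events:
        target = e.get("retracts")
        if target is not None:
            live.pop(target, None)      # cancel the earlier row, if present
        if e.get("user") is not None:
            live[e["id"]] = e           # keep new rows and amended rows
    return live

def counts_per_user(events):
    """Count still-valid actions per user."""
    return Counter(e["user"] for e in latest_versions(events).values())

def top_n(counts, n):
    """Rank users by count (descending), breaking ties by name."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

events = [
    {"id": 1, "user": "alice"},
    {"id": 2, "user": "alice"},
    {"id": 3, "user": "bob"},
    {"id": 4, "user": "bob", "retracts": 2},   # amendment: action 2 was bob's
    {"id": 5, "user": None, "retracts": 3},    # deletion of action 3
]

before = counts_per_user(events[:3])   # counts before any amendment arrives
after = counts_per_user(events)        # counts after amendment and deletion
ranking = top_n(after, 10)
```

Recomputing from the full log on every change is only for clarity here; the point of the thread is that a streaming engine would update the count and the ranking incrementally as the amendment row arrives. How this convention maps onto a concrete Flink query is not covered in the thread.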