Are you able to advise any further Timo? Thanks!

On 2020/02/04 16:10:04, Stephen Young <wintersg...@googlemail.com> wrote: 
> Hi Timo,
> 
> Thanks for replying to me so quickly!
> 
> We could do it with insert-only rows. When you say flags in the data do you 
> mean a field with a name like 'retracts' and then the value of that field is 
> the id of the event/row we want to retract? How would that be possible with 
> Flink?
> 
> Thanks!
> 
> On 2020/02/04 15:27:20, Timo Walther <twal...@apache.org> wrote: 
> > Hi Stephan,
> > 
> > the use cases you are describing sound like a perfect fit to Flink. 
> > Internally, Flink deals with insertions and deletions that are flowing 
> > through the system and can update chained aggregations and complex queries.
> > 
> > The only bigger limitation at the moment is that we only support sources 
> > that emit insert-only rows. The community is currently working on 
> > designing how we expose the internal changelog processing capabilities 
> > through our APIs.
> > 
> > However, your use case might also work with insert-only rows and a query 
> > based on the flags in the data, correct?
> > 
> > Regards,
> > Timo
> > 
> > 
> > On 04.02.20 16:14, Stephen Young wrote:
> > > I am currently looking into how Flink can support a live data collection 
> > > platform. We want to collect certain data in real-time. This data will be 
> > > sent to Kafka and we want to use Flink to calculate statistics and 
> > > derived events from it.
> > > 
> > > An important thing we need to be able to handle is amendment or deletion 
> > > events. For example, we may get an event that someone has performed an 
> > > action and from this we'd calculate how many of these actions they had 
> > > taken in total. We'd also build calculations on top of that, for example 
> > > top 10 rankings by these counts, or arbitrarily many layers of 
> > > calculations beyond that. But sometime later (this could be a few seconds 
> > > or a week) we receive an amendment event to that action. This indicates 
> > > that the action was taken by a different person or from a different 
> > > location. We then need Flink to recalculate all of our downstream stats 
> > > i.e. the counts need to be changed and rankings need to be adjusted.
> > > 
> > >>From my research into Flink I can see there is a page about Dynamic 
> > >>Tables and also there was some stuff about retraction support for the 
> > >>Table/SQL API. But it seems like this is simply how Flink models changes 
> > >>to aggregated data. I would like to be able to do something like 
> > >>calculate a count from a set of events each with their own id, then 
> > >>retract one of those events by its id and have the count automatically 
> > >>change.
> > > 
> > > Is anything like this achievable with Flink? Thanks!
> > > 
> > 
> > 
> 

Reply via email to