Combining streams with static data and using REST API as a sink

Josh Mon, 23 May 2016 06:22:38 -0700

Hi all,

I am new to Flink and have a couple of questions which I've had trouble
finding answers to online. Any advice would be much appreciated!


   1. What's a typical way of handling the scenario where you want to join
   streaming data with a (relatively) static data source? For example, I
   have a stream 'orders' where each order has an 'item_id', and I want to
   join this stream with my database of 'items'. The database of items is
   mostly static (with perhaps a few new items added every day). The items
   can be retrieved either directly from a standard SQL database (Postgres) or
   via a REST call. I guess one way to handle this would be to distribute the
   database of items with the Flink tasks, and to redeploy the entire job if
   the items database changes. But I think there's probably a better way to do
   it?
   2. I'd like my Flink job to output state to a REST API (i.e. use the
   REST API as a sink). Updates would be incremental, e.g. the job would
   output tumbling window counts which need to be added to some property on a
   REST resource, so I'd probably implement this as a PATCH. I haven't found
   much evidence that anyone else has used a REST API as a Flink sink - is
   there a reason why this might be a bad idea?
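
To make both questions a bit more concrete, here is a rough sketch of the
logic I have in mind, written in plain Python rather than against the Flink
API. Everything named here is hypothetical and only for illustration:
ItemLookup, enrich, build_patch, the load_items callable, the one-hour reload
interval, and the {"increment": n} payload shape are all assumptions, not
anything Flink or my REST service prescribes.

```python
import json
import time
import urllib.request


class ItemLookup:
    """Caches the mostly static 'items' table and reloads it after ttl_seconds."""

    def __init__(self, load_items, ttl_seconds=3600.0, clock=time.monotonic):
        # load_items is a hypothetical callable returning {item_id: item},
        # e.g. backed by a Postgres query or a REST call.
        self._load_items = load_items
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}
        self._loaded_at = None

    def get(self, item_id):
        now = self._clock()
        # Reload on first use, or once the cached copy is older than the TTL.
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._items = dict(self._load_items())
            self._loaded_at = now
        return self._items.get(item_id)


def enrich(order, lookup):
    """Join one order with its item; 'item' is None if the id is unknown."""
    return {**order, "item": lookup.get(order["item_id"])}


def build_patch(base_url, resource_id, window_count):
    """Build (but do not send) a PATCH request carrying one window's count."""
    body = json.dumps({"increment": window_count}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{resource_id}",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )
```

The idea would be to call enrich for every incoming order, and to send the
request returned by build_patch (for example with urllib.request.urlopen)
once per tumbling window.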

Thanks for any advice on these,

Josh
