Advice or best practices on adding metadata to stream events

Hauke Hans Mon, 29 Jan 2018 02:05:46 -0800

Hi everyone,

I am fairly new to the world of stream processing and I was wondering about 
best practices for adding metadata to a stream in Flink (or in stream 
processing in general). Searching for examples/discussions of this topic did 
not yield the results I was hoping for, so I figured I should try asking here.


Imagine the following (fictional) use case: 

We have a stream of events, let's say some kind of transaction events (as in 
buyer/seller). Let's also say that we have different "types" of sellers, each 
with a specific pricing model that account managers change manually, up to 
several times a day. These pricing models are stored in a SQL database. I 
now want to build a streaming application with Flink that shows the 
current turnover rate per seller in an N-minute window. For this purpose I need 
to know which pricing model to apply to a given event in the stream 
when processing it. 

My naive first idea would be to simply fire a SQL query on each invocation of 
my WindowFunction's apply method (probably with some caching). Would this be a 
reasonable thing to do? It kind of feels wrong to me. Or is there a more 
'streamy' way? I could imagine somehow turning the metadata database 
into a stream source and then joining both streams, but this seems a lot more 
involved than the previous idea. Or am I maybe approaching the problem in 
completely the wrong way? As I said, I'm very new to the whole stream processing 
thing. 
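
To make the first idea (a lookup with caching) a bit more concrete, here is a 
minimal sketch in plain Java with no Flink dependency. Everything in it is an 
assumption made for illustration: the class name PricingModelCache, the 
reduction of a "pricing model" to a single double, the loader function that 
stands in for the SQL query, and the time-to-live value are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical helper: caches one pricing value per seller type for a limited time. */
class PricingModelCache {

    /** A cached value together with the time it was loaded. */
    private static final class Entry {
        final double price;
        final long loadedAtMillis;

        Entry(double price, long loadedAtMillis) {
            this.price = price;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    // Stands in for the SQL query; assumed to return a non-null value.
    private final Function<String, Double> loader;
    private final long ttlMillis;
    // Injected clock so that expiry can be tested without sleeping.
    private final LongSupplier clock;
    private final Map<String, Entry> entries = new HashMap<>();

    PricingModelCache(Function<String, Double> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it if it is missing or older than the TTL. */
    double get(String sellerType) {
        long now = clock.getAsLong();
        Entry entry = entries.get(sellerType);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            entry = new Entry(loader.apply(sellerType), now);
            entries.put(sellerType, entry);
        }
        return entry.price;
    }
}
```

In a Flink job, an instance like this would presumably be created once per 
parallel task (for example in the open() method of a rich function) and 
queried from apply(). The obvious trade-off is staleness: a pricing change 
made by an account manager only becomes visible after the cached entry 
expires, so events processed in between are priced with the old model.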

I would be super glad if anyone could point me to any resources discussing a 
use case like this, or share your experience/opinion on this topic! 

Best regards,
Hauke
