If the external web service call does not modify the state of that external 
system, all the approaches you list are probably OK.  If it does modify 
external state, then you want to ensure that on restart the Flink job does not 
resend requests to that service, or that the service can handle duplicate 
requests.  In that sense a sink that sends the request is the cleanest, as it 
represents an export of data to an external system.  The response is there 
just to allow the sink to avoid repeating messages.  If you are sending data 
that affects the external system's state, and the response contains values 
that need to be recorded as results rather than just control values, then that 
could be a separate flow, or it could use the map function, since from Flink's 
point of view it is a transform.  This latter case, however, smacks to me of 
an undesirable side effect, and side effects make error recovery cases harder.
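
To make the "handle duplicate requests" point concrete, here is a minimal 
sketch in plain Java (no Flink dependency). All class and method names are 
hypothetical. It assumes the web service accepts a caller-supplied request id 
and deduplicates on it, and that the id can be derived from the Kafka topic, 
partition and offset; neither assumption comes from your description. In a real 
job the invoke() logic would live inside the custom sink.

```java
import java.util.HashMap;
import java.util.Map;

// Replays one record, as would happen after a restart from a checkpoint.
class ReplayDemo {
    public static void main(String[] args) {
        FakeWebService service = new FakeWebService();
        FakeKeyTable table = new FakeKeyTable();
        IdempotentExportSink sink = new IdempotentExportSink(service, table);

        sink.invoke("orders", 0, 42L, "payload-a");
        // Simulated restart: the same Kafka record is delivered again.
        sink.invoke("orders", 0, 42L, "payload-a");

        System.out.println("state changes: " + service.stateChanges());
        System.out.println("rows stored: " + table.size());
    }
}

// Hypothetical stand-in for the external web service. It is ASSUMED to
// deduplicate on a caller-supplied request id.
class FakeWebService {
    private final Map<String, String> keysByRequestId = new HashMap<>();
    private int stateChanges = 0;

    // Applies the request once per request id; a replay returns the same key.
    String submit(String requestId, String payload) {
        return keysByRequestId.computeIfAbsent(requestId, id -> {
            stateChanges++;
            return "ext-" + stateChanges;
        });
    }

    int stateChanges() {
        return stateChanges;
    }
}

// Hypothetical stand-in for the DB table. An upsert keyed by request id
// means a replayed write overwrites the row instead of duplicating it.
class FakeKeyTable {
    private final Map<String, String> rows = new HashMap<>();

    void upsert(String requestId, String externalKey) {
        rows.put(requestId, externalKey);
    }

    String get(String requestId) {
        return rows.get(requestId);
    }

    int size() {
        return rows.size();
    }
}

// Sketch of the sink logic: send the request, then record the returned key.
class IdempotentExportSink {
    private final FakeWebService service;
    private final FakeKeyTable table;

    IdempotentExportSink(FakeWebService service, FakeKeyTable table) {
        this.service = service;
        this.table = table;
    }

    void invoke(String topic, int partition, long offset, String payload) {
        // Deterministic id (an assumption): a replay yields the same id.
        String requestId = topic + "-" + partition + "-" + offset;
        String externalKey = service.submit(requestId, payload);
        table.upsert(requestId, externalKey);
    }
}
```

Running ReplayDemo, the replayed record causes no second state change in the 
service and no second row in the table; both counts stay at 1. If the real 
service cannot deduplicate, the sink would have to track which requests were 
already sent, which is harder to get right across failures.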

Michael

> On Apr 28, 2018, at 8:21 PM, wazza <wazzawe...@gmail.com> wrote:
> 
> Hi all,
> 
> I need to send a request to an external web service and then store the 
> response in a DB table, and I am wondering how people have approached this or 
> similar problems in the past.
> 
> The flow is: Kafka source (msgs only every few seconds)  => filter/map 
> operators => result sent to web service (which updates state in that system) 
> => response stored in DB.
> 
> Initially I was thinking of just creating a custom sink which basically: 
> Sends request to webservice  => Get response containing external key => Save 
> key into DB
> This feels to me like basically smashing together 2 separate sinks into 1, 
> and I am not sure if that is a good design or not.
> 
> Another option would be to create a RichMapFunction (possibly async function) 
> which does the web service call. My map function can then just return the 
> response which I can then feed into a standard DB sink. 
> However, with this approach it feels strange to update an external system in 
> a map() function, but maybe that's ok? Also, I presume to make my map 
> function idempotent I would need to store some state (I can key the messages 
> and use a ValueState) so I don't do duplicate web service calls if there is a 
> failure?
> 
> Thoughts?
> 
> Thanks in advance.
> 
