No, I can’t. I have no previous experience with Python. According to the 
examples, Stateful Functions in Python should already be async, so an async 
API/DB client should do the job if used properly.

Regarding your scenarios:

API/DB service down
From my experience with the Java SDK, detecting unavailability is no different 
from doing it in code that runs outside of the Statefun runtime. You can 
handle an exception, check the status code, and so on. Then either replace the 
result with a fallback answer, schedule retries, or fail.
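
As a rough illustration of the "handle an exception or check the status code, 
then fall back" idea, here is a minimal asyncio sketch. It is not Statefun 
code: ServiceUnavailable, FALLBACK, fetch_record, and the two stand-in clients 
are hypothetical names, and the client is assumed to return a (status, body) 
pair.

```python
import asyncio

# Hypothetical fallback answer used when the external service is unusable.
FALLBACK = {"status": "unavailable"}


class ServiceUnavailable(Exception):
    """Raised by the (hypothetical) client when the API/DB cannot be reached."""


async def fetch_record(record_id, client):
    """Call the external service and fall back on errors or bad status codes."""
    try:
        status, body = await client(record_id)
    except (ServiceUnavailable, ConnectionError, asyncio.TimeoutError):
        # The service is down or did not answer in time.
        return FALLBACK
    if status != 200:
        # The service answered, but with an error status.
        return FALLBACK
    return body


async def healthy_client(record_id):
    # Stand-in for a real async API/DB client.
    return 200, {"id": record_id, "name": "example"}


async def broken_client(record_id):
    # Stand-in for a client whose backend is down.
    raise ServiceUnavailable("connection refused")


if __name__ == "__main__":
    print(asyncio.run(fetch_record("42", healthy_client)))
    print(asyncio.run(fetch_record("42", broken_client)))
```

Instead of returning FALLBACK, the same branches could schedule a retry or 
re-raise to fail the invocation, depending on what the use case needs.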

Timeouts/Retry
The process making the API call usually detects timeouts itself. You only need 
to ensure that call timeouts are not longer than the timeouts of your module. 
Retries are more complicated. If you use exponential backoff, make a few 
attempts and, if they all fail, schedule a delayed message for the next cycle. 
When the message arrives, restart your attempts with slightly increased 
backoff parameters. This way, your service won’t waste resources such as 
threads, connections, and hardware while waiting for the next attempt. You 
will also be safe from exceeding the function’s invocation timeout. Don’t use 
regular-interval retries spread over a long time: all attempts must finish 
before the Statefun runtime fails the function invocation due to an exceeded 
timeout.
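
The retry cycle described above could look roughly like this in plain asyncio. 
It is only a sketch under assumptions: call_with_backoff and its parameters 
are hypothetical, the doubling factors are arbitrary, and actually sending the 
delayed message to the function itself is left to the caller.

```python
import asyncio


async def call_with_backoff(call, attempts=3, base_delay=0.1, call_timeout=1.0):
    """Run one short retry cycle with exponential backoff.

    Returns ("ok", result) on success, or ("retry_later", next_base_delay)
    when every attempt in this cycle failed. In the second case the caller is
    expected to schedule a delayed message and start a new cycle on arrival.
    """
    delay = base_delay
    for attempt in range(attempts):
        try:
            # Per-call timeout; keep the whole cycle well below the
            # function's invocation timeout.
            result = await asyncio.wait_for(call(), timeout=call_timeout)
            return "ok", result
        except (asyncio.TimeoutError, ConnectionError):
            if attempt < attempts - 1:
                await asyncio.sleep(delay)
                delay *= 2  # exponential backoff within the cycle
    # Give up for now; suggest increased backoff for the next cycle
    # (the factor of 2 is an arbitrary choice for this sketch).
    return "retry_later", base_delay * 2


if __name__ == "__main__":
    async def always_down():
        raise ConnectionError("service down")

    print(asyncio.run(call_with_backoff(always_down, base_delay=0.01)))
```

With the defaults, the worst case for one cycle is three calls of up to 1 s 
each plus 0.1 s and 0.2 s of sleeping, so about 3.3 s, which would need to 
stay under the invocation timeout.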

Best,
Tymur Yarosh
On 16 Aug 2022, 16:20 +0300, Himanshu Sareen <himanshusar...@outlook.com>, 
wrote:
> Thanks, Tymur. I think the DataStream API job will also work for the 
> use-case. I'll try a quick POC.
>
> Further, can you suggest how we should fetch data from an external API/DB 
> from a Remote Stateful Function ( Python SDK )?
> We have a Flink-Statefun application which listens to Kafka Streams and 
> populates/hydrates them with data ( hosted on external DB/Servers ).
>
> Presently we are making API/DB calls from within the Stateful Function. But 
> how should we handle the following scenarios:
>
> 1. API/DB service down
> 2. Timeouts/Re-try
>
>
>
> Regards,
> Himanshu
>
>
>
> From: Tymur Yarosh <ti.yar...@gmail.com>
> Sent: Tuesday, August 16, 2022 5:22:23 PM
> To: user@flink.apache.org <user@flink.apache.org>; Himanshu Sareen 
> <himanshusar...@outlook.com>
> Subject: Re: Flink-Statefun : Design Solution advice ( ASYNC IO)
>
> Ok, I don't see it as a good case for Stateful Functions. Statefun is a 
> fantastic tool when you treat each function as an object with its data and 
> behavior, and these objects communicate with each other in arbitrary 
> directions. Also, Stateful Functions remote API is asynchronous, so you can 
> use it to enrich some data.
>
> While I have a limited understanding of your problem, it looks like a classic 
> DAG that can be implemented using pure Flink DataStream API.
>
> For instance:
> 1. Kafka source ->
> 2. keyBy your id ->
> 3. FlatMap to split Kafka messages into one message per id and also attach 
> the total count of messages to each downstream message ->
> 4. RichAsyncFunction to enrich id and count with the data from Cassandra ->
> 5. process with a function that:
> 5.1. Sends incoming messages downstream and increases the count of seen 
> messages
> 5.2. Compares the count of seen messages to the total count. If they are 
> equal, sends an ack to the side output and clears the state
> 5.3. If more messages are expected and nothing has arrived within some 
> threshold duration, sends an error to another side output
> 6. Three sinks — one for enriched outgoing messages, the other one for acks 
> from side output, and the last one for errors.
>
> Or you can even combine acks and errors to the same sink.
>
> Works for you?
>
> Best,
> Tymur Yarosh
> On 16 Aug 2022, 13:40 +0300, Himanshu Sareen <himanshusar...@outlook.com>, 
> wrote:
> > Hi Tymur,
> >
> > 1. Why do you want a remote function to call an embedded function in this 
> > case?
> >
> >     To Implement an Async IO call. 
> > https://nightlies.apache.org/flink/flink-statefun-docs-master/docs/sdk/flink-datastream/#completing-async-requests
> >
> > 2. Do you have separate outgoing Kafka messages or join them per record?
> >
> >     One DB query will fetch millions of records, which will be aggregated 
> > into multiple separate events/messages to be published to Kafka.
> >
> > 3. What is explicit acknowledgment at the end for?
> >
> >      One Ingress event may have more than 500 IDs, so once the entire event 
> > has been processed ( i.e. DB query is completed for all 500 IDs ) we have 
> > to send a Trigger Event for Down-Stream consumers.
> >
> >
> > Also, if there is a better way to code this problem, do share some pointers.
> >
> > Regards,
> > Himanshu
> > From: Tymur Yarosh <ti.yar...@gmail.com>
> > Sent: Tuesday, August 16, 2022 2:37 PM
> > To: user@flink.apache.org <user@flink.apache.org>; Himanshu Sareen 
> > <himanshusar...@outlook.com>
> > Subject: Re: Flink-Statefun : Design Solution advice ( ASYNC IO)
> >
> > Hi,
> > Just a few questions:
> > 1. Why do you want a remote function to call an embedded function in this 
> > case?
> > 2. Do you have separate outgoing Kafka messages or join them per record?
> > 3. What is explicit acknowledgment at the end for?
> >
> > Best,
> > Tymur Yarosh
> > On 16 Aug 2022, 11:48 +0300, Himanshu Sareen 
> > <himanshusar...@outlook.com>, wrote:
> > > Team,
> > >
> > > I'm solving a use-case and need advice/suggestions on whether Flink is 
> > > the right choice for the solution.
> > >
> > > 1. Ingress - Kafka events/messages consist of multiple IDs.
> > > 2. Task - For each ID in a Kafka message query Cassandra DB ( 
> > > asynchronously) to fetch records. Prepare multiple small messages out of 
> > > the records received.
> > > 3. Egress - Emit all the small messages/events to Kafka.
> > >
> > > Ingress Event may contain more than 500 IDs
> > > Each ID may result in millions of records from Cassandra DB.
> > >
> > >
> > > My Design Approach
> > >
> > > 1. Remote Stateful Function listens to Ingress.
> > > 2. Invokes Embedded function ( which implements ASYNC IO ) for each ID.
> > > 3. Once records are available from Embedded Function, processes them and 
> > > emits multiple events to Kafka.
> > > 4. Sends back an acknowledgement to the calling remote Function.
> > >
> > > Please share suggestions or advice.
> > >
> > > Note - I'm unable to find a good example for embedded and remote Function 
> > > interaction.
> > >
> > > Regards
