Thank you so much Didier for your detailed response! I will need some time
to digest it but a lot of what you write sounds very reasonable.
Thanks!
Brjánn
On 15 May 2018 at 02:57, Didier wrote:
> Oh, I forgot something important.
>
> If you're hoping to have multiple hosts and run this application in a
> distributed way, you really shouldn't do it this way, because things get a
> lot more complicated. The problem is that your request queue is local to a
> host. If the client creates the Future on S1 on host A, then calls
> get-s1-result and is routed to host B, that Future will be missing.
>
> So what you need is to turn that atom map of Futures into a distributed
> one. You could still have the Future atom map, but as the last step of each
> Future, you need to update the distributed map with the result or error.
> And if you want statuses, in your loop, you should also update it for
> status. So on get-s1-result, you just check the value of that distributed
> map. Each host still processes their own share of requests, but the
> distributed map exposes their result and processing status to all other
> hosts.
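>
> A minimal Clojure sketch of that idea. A plain atom stands in for the
> distributed map (in production that would be Redis, DynamoDB, or similar),
> and `process` is a hypothetical request handler:

```clojure
;; Each host keeps its local map of futures, but every future's final
;; step publishes status and result to a shared store. A plain atom
;; stands in here for the distributed map (Redis, DynamoDB, ...).
(defonce local-futures  (atom {}))
(defonce shared-results (atom {}))  ; stand-in for the distributed map

(defn publish! [guid status result]
  ;; In a real system this writes to the distributed store.
  (swap! shared-results assoc guid {:status status :result result}))

(defn process [request]
  ;; Hypothetical request handler -- replace with real work.
  (assoc request :handled true))

(defn handle-s1! [guid request]
  (swap! local-futures assoc guid
         (future
           (publish! guid :processing nil)
           (let [result (process request)]
             (publish! guid :done result)
             result)))
  guid)

(defn get-s1-result [guid]
  ;; Any host can answer this: it reads the shared map, not local state.
  (get @shared-results guid {:status :unknown}))
```

> Note that get-s1-result no longer cares which host ran the future, which
> is the whole point.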
>
> There are many other ways to handle this issue. For example, I believe you
> can route the client to a direct connection to the particular host that
> handled S1, so that calls to get-s1-result go to that specific host. The
> downside is that it gets harder to evenly distribute the polls, and it
> takes more complex infrastructure: all hosts must have their IPs exposed
> to the clients, for example. Alternatively, the VIP might be able to
> support smarter routing based on some indicator, or you could use a master
> host that holds that routing logic itself and delegates requests back.
>
> Another alternative is to let go of polling and use a push model instead:
> your server calls the client to tell it the request has been handled. This
> also has its own complexities and trade-offs.
>
> Anyway, in a distributed environment, async and non-blocking become
> quite a bit more complex.
>
>
>
> On Monday, 14 May 2018 17:35:39 UTC-7, Didier wrote:
>>
>> It's hard to answer without additional detail.
>>
>> I'll make some assumptions, and answer assuming those are true:
>>
>> 1) I assume your S1 API is blocking, that each request to it is
>> handled on its own thread, and that those threads come from a fixed-size
>> thread pool of 30.
>>
>> 2) I assume that S2 is also blocking, that it returns a promise when you
>> call it, and that you then need to keep polling another API, which I'll
>> call get-s2-result, which takes the promise, is also blocking, and returns
>> the result, an error, or an indication that it's still not available.
>>
>> 3) I assume you want to give your blocking S1 API pseudo non-blocking
>> behavior.
>>
>> 4) Thus, you would have S1 return a promise. When called, you do not
>> process the request; instead you put it in a "to be processed" queue and
>> return a promise that the request will eventually be processed and have
>> a value or an error.
>>
>> 5) Similarly, you need a way for the client to check the promise, thus
>> you will also expose a blocking API that I will call get-s1-result which
>> takes the promise and returns either the result, an error, or that it's not
>> available yet.
>>
>> 6) Your promise will take the form of a GUID that uniquely identifies the
>> queued request.
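>>
>> In Clojure the GUID from step 6 can just be a random UUID string,
>> returned to the client right away. The response shape below is only
>> illustrative:

```clojure
(defn new-request-id
  "A GUID that uniquely identifies a queued request and acts as the
  promise handed back to the client."
  []
  (str (java.util.UUID/randomUUID)))

(defn s1-response [request-id]
  ;; Illustrative response body for the S1 call.
  {:request-id request-id
   :status     :queued})
```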
>>
>> 7) This is your API design. Your clients can now start work and
>> integration against your APIs while you implement the functionality.
>>
>> 8) Now you need to implement the queuing up of requests. This is where
>> you have options, and core.async is one of them. I agree with the advice
>> not to use core.async unless simpler tools don't work, so I will start
>> with a simpler tool: Future, and a global atom holding a map from promise
>> GUID to request.
>>
>> 9) So you create a global atom, which contains a map from GUID -> Future.
>>
>> 10) On every request to S1, you create a new GUID and Future, and you
>> swap! assoc the GUID with the Future.
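>>
>> Steps 9 and 10 as a sketch, with `handler` standing in for whatever
>> actually processes the request:

```clojure
(defonce request-futures (atom {}))  ; GUID -> Future

(defn enqueue-s1!
  "Registers a new Future under a fresh GUID and returns the GUID.
  handler is a stand-in for the real request-processing function."
  [handler request]
  (let [guid (str (java.util.UUID/randomUUID))]
    (swap! request-futures assoc guid (future (handler request)))
    guid))
```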
>>
>> 11) The Future is your request handler, so inside it you synchronously
>> handle the request, whatever that means for you. Maybe you do some
>> processing, then call S2, and then you loop: every 100 ms you call
>> get-s2-result until it returns an error or a result. On each iteration
>> you also check that the time elapsed since you started looping is not
>> more than some timeout X, so that you don't loop forever. If you
>> eventually get a result or an error, you handle them however you need
>> to, and eventually your future itself returns a result or an error. It's
>> important to design the future task to time out eventually, so that you
>> don't leak futures stuck in infinite loops; you must be able to
>> deterministically know that the future will finish.
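>>
>> A sketch of that polling loop with the timeout. get-s2-result is passed
>> in as a function here, and its return value is assumed to be a map with
>> a :status key (both assumptions, since your real S2 API shape is unknown):

```clojure
(defn poll-until-done
  "Calls get-s2-result on s2-promise every 100 ms until it reports
  :done or :error, or until timeout-ms has elapsed, so the loop is
  guaranteed to finish."
  [get-s2-result s2-promise timeout-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (let [{:keys [status] :as response} (get-s2-result s2-promise)]
        (cond
          (contains? #{:done :error} status)        response
          (>= (System/currentTimeMillis) deadline)  {:status :timeout}
          :else (do (Thread/sleep 100) (recur)))))))
```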
>>
>> 12) Now you implement get-s1-result. Whenever it is called, you get the
>> future from the global atom map of fu