Re: Asynchronous http poll

2018-05-15 Thread Brjánn Ljótsson
Thank you so much Didier for your detailed response! I will need some time
to digest it but a lot of what you write sounds very reasonable.

Thanks!

Brjánn

On 15 May 2018 at 02:57, Didier  wrote:

> Oh, I forgot something important.
>
> If you're hoping to have multiple hosts, and run this application in a
> distributed way, you really should not do it this way. Things get a lot
> more complicated. The problem is that your request queue is local to a host.
> So if the client creates the Future on S1 on host A, then calls
> get-s1-result and is routed to host B, that Future will be missing.
>
> So what you need is to turn that atom map of Futures into a distributed
> one. You could still have the Future atom map, but as the last step of each
> Future, you need to update the distributed map with the result or error.
> And if you want statuses, in your loop, you should also update it for
> status. So on get-s1-result, you just check the value of that distributed
> map. Each host still processes their own share of requests, but the
> distributed map exposes their result and processing status to all other
> hosts.
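The flow above (each host processes its own futures but publishes status and results to a shared store) can be sketched roughly as follows. A plain atom stands in here for the distributed map (in practice, a database table keyed by GUID); all names are illustrative:

```clojure
;; In-memory stand-in for the distributed map: guid -> {:status ... :result ...}
(defonce results-store (atom {}))

(defn publish! [guid status & [result]]
  (swap! results-store assoc guid {:status status :result result}))

(defn run-request! [guid handle-fn]
  (future
    (publish! guid :processing)          ; expose processing status to all hosts
    (try
      (publish! guid :done (handle-fn))  ; last step: publish the result
      (catch Exception e
        (publish! guid :error (.getMessage e))))))

;; get-s1-result on ANY host just reads the shared store:
(defn get-s1-result [guid]
  (get @results-store guid {:status :unknown}))
```

With a real distributed store, only `publish!` and the lookup change; the per-host future logic stays the same.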
>
> There are many other ways to handle this issue. For example, I believe you
> can route the client to a direct connection to the particular host that
> handled S1, so that calls to get-s1-result go to that specific host. The
> downside is that it gets harder to evenly distribute the polls. It also takes
> more complex infrastructure: all hosts must have their IPs exposed to the
> clients, for example. Alternatively, the VIP might be able to support
> smarter routing based on some indicator, or you need a Master host, which
> delegates back and has that routing logic itself.
>
> An alternative is to let go of polling and instead use a push model: your
> server calls the client to tell it the request has been handled. This also
> has its own complexities and trade-offs.
>
> Anyways, in a distributed environment, async and non-blocking become
> quite a bit more complex.
>
>
>
> On Monday, 14 May 2018 17:35:39 UTC-7, Didier wrote:
>>
>> It's hard to answer without additional detail.
>>
>> I'll make some assumptions, and answer assuming those are true:
>>
>> 1) I assume your S1 API is blocking, that each request to it is handled
>> on its own thread, and that those threads come from a fixed-size thread
>> pool of 30.
>>
>> 2) I assume that S2 is also blocking, and that it returns a promise when
>> you call it. To get the outcome, you keep polling another API, which I'll
>> call get-s2-result; it takes the promise, is also blocking, and returns
>> the result, an error, or an indication that it's still not available.
>>
>> 3) I assume you want to turn your blocking S1 API into one with pseudo
>> non-blocking behavior.
>>
>> 4) Thus, you would have S1 return a promise. When called, you do not
>> process the request; you add it to a "to be processed" queue and return
>> a promise that the request will eventually be processed and will have a
>> value or an error.
>>
>> 5) Similarly, you need a way for the client to check the promise, thus
>> you will also expose a blocking API that I will call get-s1-result which
>> takes the promise and returns either the result, an error, or that it's not
>> available yet.
>>
>> 6) Your promise will take the form of a GUID that uniquely identifies the
>> queued request.
>>
>> 7) This is your APIs design. Your clients can now start work and
>> integration with your APIs, while you implement its functionality.
>>
>> 8) Now you need to implement the queuing up of requests. This is where
>> you have options, and core.async is one of them. I do agree with the advice
>> of not using core.async unless simpler tools don't work. So I will start
>> with a simpler tool: Future, and a global atom map from promise GUID to
>> request map.
>>
>> 9) So you create a global atom, which contains a map from GUID -> FUTURE.
>>
>> 10) On every request to S1, you create a new GUID and Future, and you
>> swap! assoc the GUID with the Future.
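Steps 9 and 10 can be sketched as below; `handle-request` is a placeholder for whatever S1's actual processing is:

```clojure
;; Step 9: a global atom mapping GUID -> Future.
(defonce requests (atom {}))

(defn new-guid []
  (str (java.util.UUID/randomUUID)))

;; Step 10: on every request, create a GUID and a Future, store the
;; pair, and return the GUID as the "promise" token.
(defn s1 [request handle-request]
  (let [guid (new-guid)
        fut  (future (handle-request request))]
    (swap! requests assoc guid fut)
    guid))
```

The client holds only the GUID string; the Future stays server-side in the atom.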
>>
>> 11) The Future is your request handler. In it, you synchronously handle
>> the request, whatever that means for you. Maybe you do some processing,
>> then call S2, and then loop: every 100ms in the loop, you call
>> get-s2-result until it returns an error or a result. On every iteration,
>> you also check that the elapsed time since you started looping does not
>> exceed some timeout X, so that you don't loop forever. If you eventually
>> get a result or an error, you handle them however you need to, and
>> eventually your future itself returns a result or an error. It's
>> important that you design the future task to time out eventually, so that
>> you don't leak futures stuck in infinite loops. You must be able to
>> deterministically know that the future will finish.
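The polling loop in step 11 might look like this; `get-s2-result` here is a placeholder that returns nil while S2's result is not yet available:

```clojure
;; Poll get-s2-result every 100ms until it yields an outcome (a result
;; or an error), or until a deadline passes, so the future never loops
;; forever.
(defn poll-until-done [get-s2-result s2-promise timeout-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (if-let [outcome (get-s2-result s2-promise)]
        outcome                                   ; result or error map
        (if (< (System/currentTimeMillis) deadline)
          (do (Thread/sleep 100) (recur))
          {:error :timeout})))))                  ; deterministic finish
```

Because the deadline is checked on every iteration, the enclosing future is guaranteed to finish within roughly `timeout-ms` plus one sleep interval.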
>>
>> 12) Now you implement get-s1-result. Whenever it is called, you get the
>> future from the global atom map of futures and check whether it has
>> finished: if so, you return its result or error; if not, you return that
>> the result is not available yet.
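Step 12 can be sketched as a lookup plus a `realized?` check (the map of futures is passed in explicitly here; the status keywords are illustrative):

```clojure
;; Look up the future by GUID and report its state: done (with the
;; result or error), still pending, or an unknown GUID.
(defn get-s1-result [requests guid]
  (if-let [fut (get @requests guid)]
    (if (realized? fut)
      (try
        {:status :done :result @fut}
        (catch Exception e                  ; deref rethrows the future's error
          {:status :error :error (.getMessage e)}))
      {:status :pending})
    {:status :unknown-guid}))
```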

Re: Asynchronous http poll

2018-05-15 Thread Didier
I think I want to simplify some things.

Normally, client/server async is implemented by the client/server framework. 
What happens is that the exchange of messages between the client and server 
over the HTTP connection on the socket is made non-blocking, but the entire 
request/response still happens within the context of a single HTTP connection.

This often allows the server to take in many more requests, and the client to 
overlap other processing while waiting. That's because open IO operations are 
cheaper than open threads, so modern hardware supports many more concurrent 
open connections on a socket than it does open threads.

If that's what you want, you need to move to a different client/server 
framework that supports non-blocking exchanges, such as Netty in the Java world.

If you want to avoid changing frameworks, or your operations are going to be 
really long, or you want to survive network dropouts, then you can go for 
something more like what you were trying to do.

In that case, you need to choose between push and pull.

If pull, you want a distributed map as I said, which is often known as a 
database. A SQL table can do; a NoSQL key/value store also works. I'm a fan of 
DynamoDB for this. Ideally, you want your distributed map to have the same 
scaling capability as your server APIs, otherwise it will become a bottleneck.

You can also go with a distributed queue, like RabbitMQ or AWS SQS. This allows 
the client to use reactor-style evented response handling: instead of having 
the client poll your get API to find out whether a response is available, you 
put a message on the queue saying GUID-X is now done, and your client works 
through the queue, polling you for the result of every message in it.
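The queue hand-off can be sketched in-process like this; `done-queue` stands in for a real distributed queue such as SQS, and `fetch-result` is a placeholder for the client's call to your get API:

```clojure
;; In-process stand-in for a distributed queue of "GUID done" messages.
(def done-queue (java.util.concurrent.LinkedBlockingQueue.))

;; Server side: when a request finishes, announce its GUID.
(defn notify-done! [guid]
  (.put done-queue guid))

;; Client side: take the next message and fetch that GUID's result.
(defn consume-one! [fetch-result]
  (let [guid (.take done-queue)]   ; blocks until a message arrives
    (fetch-result guid)))
```

With a real queue, only the put/take calls change; the shape of the exchange (server announces GUIDs, client fetches results per message) stays the same.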

If you want push, you need the client to expose an endpoint to be contacted 
when the work is done. You can do this easily with AWS SNS, for example. This 
could mean the client exposes a call-when-done(guid, result) API: the client 
tells your server about it, and when you are done, you send a request to that 
API to notify the client. This lets the client know right away that it's done, 
and saves it the CPU work of polling. But it gets complicated: what happens if 
you fail to reach the client's endpoint? With push, you can miss a response. 
To avoid that, people often offer both pull and push.
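The push side might be sketched like this. To stay dependency-free, `http-post!` is passed in as a stand-in for a real HTTP client call (e.g. clj-http's post); the failure handling shows why pairing push with pull is common:

```clojure
;; Clients register a callback URL per GUID.
(defonce callbacks (atom {}))

(defn register-callback! [guid url]
  (swap! callbacks assoc guid url))

;; When a request finishes, notify the client's endpoint. If the push
;; fails, report it; the client can still fall back to pulling.
(defn notify-client! [http-post! guid result]
  (when-let [url (get @callbacks guid)]
    (try
      (http-post! url {:guid guid :result result})
      (catch Exception _
        :push-failed))))
```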

So my practical recommendation to you would be to first look into a 
non-blocking client/server framework like Netty. Maybe that's all you need.

If not, then look into using DynamoDB or SQS or SNS or similar products.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en