Compojure-api, spec-tools, swagger

2018-05-14 Thread Dave Tenny
I'm using compojure-api 2.0.0-alpha19, spec-tools 0.6.1, and trying to 
generate swagger pages with :spec coercion in compojure-api.

I know this is a work in progress, and I'm not an expert in any of these 
tools, so I thought I'd ask here before filing pesky and likely incorrect 
issues in the Metosin repos.  I've done a fair bit of prismatic-schema-based 
compojure-api work and was trying to translate some of that into a 
spec-ized implementation.  However, I've either made a ton of mistakes 
(quite likely) or hit a lot of bugs; I'm not sure which.

The following code fragments and annotated image show some of the problems 
I'm seeing.  Advice welcome. 

Basic specs, later aliased via 'db-jobs':

(s/def ::job-id nat-int?)
(s/def ::job-type #{:this-job :that-job :the-other-job})

Specs built on the above, later aliased via 'specs' in compojure-api 
declarations:
(s/def ::job-id
  (st/spec ::db-jobs/job-id
           {:description "Specifies a Job ID, must be accompanied by a Firm ID for any valid use."}))

(s/def ::job-type
  (st/spec ::db-jobs/job-type
           {:description "Specifies the type of job to be dispatched to a suitable worker service node."}))

(s/def ::firm-id
  (st/spec nat-int?
           {:description "Specifies a Firm ID in the service database, or zero if there is no specific firm."}))
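As a sanity check, these specs can be exercised directly at the REPL. A minimal sketch, assuming the namespaces above are required under the db-jobs and specs aliases used here:

```clojure
;; Sketch only: assumes the basic specs are registered in a namespace
;; aliased as db-jobs and the st/spec-wrapped ones as specs.
(require '[clojure.spec.alpha :as s])

(s/valid? ::specs/firm-id 42)          ;; => true  (nat-int? accepts 42)
(s/valid? ::specs/firm-id -1)          ;; => false (negative)
(s/valid? ::specs/job-type :this-job)  ;; => true  (member of the enum set)
(s/explain-str ::specs/job-type :bogus) ;; explains why :bogus fails
```

Since st/spec wraps the underlying predicate, validation behaves the same as the bare spec; only the extra data (like :description) differs.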

Compojure-api code using the above three specs:

...
  (:require
   [clojure.spec.alpha :as s]
   [compojure.api.core :as api-core]
   [compojure.api.sweet :as sweet]
   [compojure.route :as route]
   [my.specs :as specs]
   [spec-tools.core :as st]))
...
(sweet/defroutes routes
  (sweet/POST "/job/:firm-id" []
    {:summary     "Enqueue a job"
     :description "Enqueue a job."
     :path-params [firm-id :- ::specs/firm-id]
     :body-params [job-type :- ::specs/job-type]
     :responses   {201 {:description "Job enqueued." :schema ::specs/job-id}
                   400 {:description "Invalid job type."
                        :schema (st/spec string? {:description "Value of the unsupported job-type argument."})}}}
    {:status 201 :body (impl/create-job! firm-id job-type)}))

...
  (sweet/api
   {:coercion :spec
    :swagger {:data {;;:basePath "/"
                     :info {:title "Job Scheduler"
                            :description "Jobs"
                            :version 1
                            :contact {:name "Pizza Eaters Inc." ...}}}
              :ui "/docs/swagger"
              :spec "/docs/swagger.json"
              :options {:ui {:docExpansion "list"

The resulting swagger page (fragment) follows, annotated for discussion:

[annotated Swagger UI screenshot not included in this archive]
   1. For the Response Class presentation, it would be nice to get the 
   description shown with the ::job-id type.
   2. The 'job-type' parameter name is not shown.
   3. The description of the 'job-type' parameter is not shown.
   4. The 201 response code data is missing from Response messages.
   5. (not highlighted) :firm-id works as expected; whether that's because 
   it's a :path-param or because its spec definition lacks the same degree 
   of indirection, I don't know.


Also note that

:path-params [firm-id :- (sweet/describe ::specs/firm-id "Firm identifier, or zero if there is no firm.")]

does not produce a description in Swagger, although the spec-tools code in 
the initial compojure-api code above does work.  Bug or feature?  (It seems 
sweet/describe might be deprecated as we move to clojure.spec with 
spec-tools.)


Anyway, tips are appreciated, as is patience with obvious "user" mistakes.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Asynchronous http poll

2018-05-14 Thread Didier
It's hard to answer without additional detail.

I'll make some assumptions, and answer assuming those are true:

1) I assume your S1 API is blocking, that each request to it is handled 
on its own thread, and that those threads come from a fixed-size thread 
pool of 30.

2) I assume that S2 is also blocking, and that it returns a promise when 
you call it.  And that you need to keep polling another API, which I'll call 
get-s2-result, that takes the promise, is also blocking, and returns 
the result, an error, or that it's still not available.

3) I assume you want to give your blocking S1 API pseudo non-blocking 
behavior.

4) Thus, you would have S1 return a promise. When called, you do not 
process the request, but you queue the request in a "to be processed" 
queue, and you return a promise that eventually, the request will be 
processed and will have a value, or an error.

5) Similarly, you need a way for the client to check the promise, so you 
also expose a blocking API, which I'll call get-s1-result, that takes the 
promise and returns either the result, an error, or that it's not 
available yet.

6) Your promise will take the form of a GUID that uniquely identifies the 
queued request.

7) This is your API design.  Your clients can now start work and 
integration against your APIs while you implement the functionality.

8) Now you need to implement the queuing up of requests. This is where you 
have options, and core.async is one of them. I do agree with the advice of 
not using core.async unless simpler tools don't work. So I will start with 
a simpler tool: Future, and a global atom map from promise GUID to request 
map.

9) So you create a global atom, which contains a map from GUID -> Future.

10) On every request to S1, you create a new GUID and Future, and you swap! 
assoc the GUID with the Future.

11) The Future is your request handler.  In it, you synchronously handle 
the request, whatever that means for you.  Maybe you do some processing, 
then you call S2, then you loop, and every 100ms in the loop you call 
get-s2-result until it returns an error or a result.  Every time you loop, 
you check that the time elapsed since you started looping is not more than 
some timeout X, so that you don't loop forever.  If you eventually get a 
result or an error, you handle them however you need to, and eventually 
your Future itself returns a result or an error.  It's important to design 
the Future task to time out eventually, so that you don't leak Futures 
stuck in infinite loops.  You must be able to deterministically know that 
the Future will finish.

12) Now you implement get-s1-result.  Whenever it is called, you get the 
Future from the global atom map of Futures and call future-done? on it.  If 
false, you return that the result is not available yet.  If it is done, you 
deref the Future, swap! dissoc its map entry from your global atom, and 
return the result or error.
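Steps 9-12 can be sketched roughly as follows. do-processing! and get-s2-result are placeholders for your actual S2 calls, and the 5-second timeout and 100ms poll interval are arbitrary examples:

```clojure
;; Sketch of steps 9-12: a global atom mapping GUID -> Future.
;; do-processing! and get-s2-result are hypothetical placeholders.
(defonce pending (atom {}))

(defn submit-s1-request
  "Step 10: queue the request, returning a GUID that acts as the promise."
  [request]
  (let [guid (str (java.util.UUID/randomUUID))
        fut  (future
               ;; Step 11: handle the request synchronously, polling S2
               ;; every 100ms until a result, an error, or the deadline.
               (let [s2-promise (do-processing! request)
                     deadline   (+ (System/currentTimeMillis) 5000)]
                 (loop []
                   (let [r (get-s2-result s2-promise)]
                     (cond
                       (some? r) r
                       (> (System/currentTimeMillis) deadline) {:error :timeout}
                       :else (do (Thread/sleep 100) (recur)))))))]
    (swap! pending assoc guid fut)
    guid))

(defn get-s1-result
  "Step 12: return the result/error if the Future is done, else pending."
  [guid]
  (if-let [fut (get @pending guid)]
    (if (future-done? fut)
      (let [result @fut]
        (swap! pending dissoc guid)
        result)
      {:status :pending})
    {:error :unknown-guid}))
```

Note the deadline check inside the loop is what guarantees every Future eventually finishes, which the unbounded-queue analysis below relies on.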

The only danger of this approach is that the Future queue is unbounded.  
What happens in practice is that clients can call S1 and get-s1-result with 
at most 30 concurrent requests, because I assumed your APIs are blocking 
and bounded on a shared fixed thread pool of size 30.

Now say it takes 1 second on average to process an S1 request, so your 
Future finishes on average in 1 second, and you time them out at 5 seconds. 
Now take the worst-case scenario: S2 is down, so all requests take the full 
5 seconds to be handled.  Say your clients are also maxing out your 
concurrency for S1, so you get around 30 concurrent requests constantly, 
and S1 takes 100ms to return the promise.  What you get is this:

* Every second, you are creating 300 Futures, because every 100ms you 
process 30 new S1 requests.

Say we start at zero Futures.  One second later you have 300; 5 seconds 
later you have 1500, but your first 300 time out, so you end up with 1200. 
At the 6th second you have 1200 again, since 300 more were queued but 300 
more timed out.  From that point on, every second you have 1200 open 
Futures, with a max of 1500.

Thus you need to make sure you can handle 1500 open threads on your host.
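The arithmetic above can be double-checked in a couple of lines (the rates and timeout are the assumed worst-case figures):

```clojure
;; Worst case: 30 new requests every 100ms, each Future held for the
;; full 5-second timeout before it completes.
(def requests-per-second (* 30 10))              ;; => 300
(def max-open-futures (* requests-per-second 5)) ;; => 1500
```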

Indirectly, this stabilizes because you made sure your Future tasks time 
out at 5 seconds, and because your S1 API is itself bounded to 30 
concurrent requests max.

If you'd prefer not to rely on the bound of the S1 requests, and you have 
a hard time knowing the timing of your S1, you can keep track of the count 
of queued Futures, and on a request to S1 where the count is above your 
bound, return an error instead of a promise, asking the client to wait a 
bit and retry the call later, when you have more resources available.
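That bound could be sketched as a guard in the submit path; pending is the global atom map of Futures from step 9, while submit-fn and the 1000 limit are arbitrary placeholders:

```clojure
;; Reject new S1 requests when too many Futures are already queued.
;; submit-fn is whatever function queues the request and returns a GUID.
(defn submit-with-backpressure [pending submit-fn request]
  (if (>= (count @pending) 1000)
    {:error :too-busy :retry-after-ms 500}
    (submit-fn request)))
```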

I hope this helps.

On Tuesday, 8 May 2018 13:45:00 UTC-7, Brjánn Ljótsson wrote:
>
> Hi!
>
> I'm writing a server-side (S1) function that initiates an action on 
> another server (S2) 

Re: Asynchronous http poll

2018-05-14 Thread Didier
Oh, I forgot something important.

If you're hoping to have multiple hosts and run this application in a 
distributed way, you really should not do it this way.  Things get a lot 
more complicated.  The problem is that your request queue is local to a 
host.  If the client creates the Future via S1 on host A, then calls 
get-s1-result and is routed to host B, that Future will be missing.

So what you need is to turn that atom map of Futures into a distributed 
one.  You could still have the local Future atom map, but as the last step 
of each Future, you update the distributed map with the result or error. 
If you want statuses, your loop should also update it with status.  Then 
get-s1-result just checks the value in that distributed map.  Each host 
still processes its own share of requests, but the distributed map exposes 
their results and processing status to all other hosts.

There are many other ways to handle this issue.  For example, I believe you 
can route the client to a direct connection to the particular host that 
handled S1, so that calls to get-s1-result go to that specific host.  The 
downside is that it gets harder to evenly distribute the polls, and it 
takes more complex infrastructure: all hosts must have their IPs exposed 
to the clients, for example.  Alternatively, the VIP might support smarter 
routing based on some indicator, or you could use a master host that 
delegates back and holds that logic itself.

An alternate way is to let go of polling and instead use a push model: 
your server calls the client to tell it the request has been handled.  This 
also has its own complexities and trade-offs.

Anyway, in a distributed environment, async and non-blocking become quite 
a bit more complex.


On Monday, 14 May 2018 17:35:39 UTC-7, Didier wrote:
>
> It's hard to answer without additional detail. [...]