On 18 February 2010 23:12, sim <simon.cus...@gmail.com> wrote:
> I have been using http-agents to grab a bunch of pages and then
> process them, my initial solutions involved partitioning a sequence of
> urls and then awaiting for that group before moving on.

Consider this alternative:

    (use 'clojure.contrib.duck-streams)

    (doall (apply pcalls
                  (map #(fn [] (println (slurp* %)))
                       ["http://slashdot.org" "http://www.reddit.com/"])))

where println can be replaced with whatever processing you need.

> (I'm using clojure-1.1.0 and clojure-contrib-1.1.0)
>
> However the problem with that approach is that when one of the
> responses is slow they all have to wait.  To get around that I can
> loop and query each running agent with done? etc, but that is a
> hassle.
>
> I figured I'd use the :handler function instead, when the handler is
> called I can deal with the response body and then fire off another
> agent.  However the agent's state hasn't changed internally to ::done
> yet so not all the accessors work (like status, for example).

I'm confused here: do you want the requests to happen in parallel,
sequentially, or in some sort of mixture?

> No problem I thought just queue another action behind the initial
> http-agent request.  As I am sending the actions from the same thread
> they are guaranteed to run sequentially so my second action will have
> access to the final agent state and then all the accessors will work,
> etc.
>
> However there is another wrinkle; both string and result call await
> and you can't await in an action.
>
> I want to do the following;
>
> (let [agnt (http-agent "http://slashdot.org/")]
>   (send agnt
>         (fn [state]
>           ;; printing for demonstration but could write to file,
>           ;; post to another queue, etc.
>           (println "Done:" (done? *agent*) "Status:" (status *agent*))
>           ;; the following causes an error as string calls await
>           ;; which is bad in an action
>           (println (string *agent*))
>           state)))

"If you don't provide a handler function, the default handler will
buffer the entire response body in memory, which you can retrieve with
the 'bytes', 'string', or 'stream' functions.  Like 'result', these
functions will block until the HTTP request is completed.
A single GET request could be as simple as:

    (string (http-agent "http://www.stuartsierra.com/"))"

so why do you have all the other stuff?

If you wanted to use multiple http-agents in parallel you might want
to specify a handler function that operates when the response is
received:

    (ns foo
      (:require [clojure.contrib.http.agent :as a]
                [clojure.contrib.duck-streams :as d]))

    (a/http-agent "http://slashdot.org"
                  :handler #(println (d/slurp* (a/stream %))))
    (a/http-agent "http://www.reddit.com"
                  :handler #(println (d/slurp* (a/stream %))))

This will do two URL retrievals in parallel and print the results when
they arrive.

> This seems fairly idiomatic to me but I end up with agnt having the
> "can't await in action" error.  After I redefine result and string
> (the only two that await in http-agent) to be the following;
>
> (in-ns clojure.contrib.http.agent)
>
> (defn string
>   "Returns the HTTP response body as a string, using the given
>   encoding.
>
>   If no encoding is given, uses the encoding specified in the server
>   headers, or clojure.contrib.duck-streams/*default-encoding* if it is
>   not specified."
>   ([http-agnt]
>      ;; may have to wait for Content-Encoding
>      (if (not (done? http-agnt))
>        (await http-agnt))
>
>      (string http-agnt (or (.getContentEncoding
>                             #^HttpURLConnection (::connection @http-agnt))
>                            duck/*default-encoding*)))
>
>   ([http-agnt #^String encoding]
>      (.toString (get-byte-buffer http-agnt) encoding)))
>
> (defn result
>   "Returns the value returned by the :handler function of the HTTP
>   agent; blocks until the HTTP request is completed.  The default
>   handler function returns a ByteArrayOutputStream."
>   [http-agnt]
>   (if (not (done? http-agnt))
>     (await http-agnt))
>   (::result @http-agnt))
>
> Then everything is fine and the code above works.  Interestingly this
> doesn't change the documented behaviour of result as result still
> blocks until complete but now only if it needs to which means I am
> free to call result in a later action.
>
> Finally my question, is my usage of actions idiomatic and I have just
> stumbled onto an unusual usage pattern for http-agent or is there some
> reason that await is always called in string and result that I am
> unaware of?

Please let me know if I've misunderstood your aim here, but I think you
can achieve your goals with a much simpler approach as described above.

Regards,
Tim.
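
P.S. To make the pcalls suggestion concrete without touching the
network, here is a minimal sketch of the same pattern. It assumes
nothing beyond clojure.core: fake-fetch and its delays are hypothetical
stand-ins for slurp* on real URLs, and the atom exists only to record
the order in which responses were processed.

```clojure
;; Hypothetical stand-in for fetching a page: sleeps, then returns a body.
(defn fake-fetch [url delay-ms]
  (Thread/sleep delay-ms)
  (str "body of " url))

;; Records the order in which responses were processed.
(def processed (atom []))

;; Builds a no-arg fn that fetches one url and processes the response
;; straight away, on whichever thread pcalls runs it.
(defn fetch-and-process [url delay-ms]
  (fn []
    (let [body (fake-fetch url delay-ms)]
      (swap! processed conj url)
      body)))

;; doall forces the lazy seq, so this returns once every call has finished.
(def bodies
  (doall (pcalls (fetch-and-process "slow" 300)
                 (fetch-and-process "fast" 20))))
```

bodies comes back in argument order, while @processed should normally
show "fast" ahead of "slow", because each function handles its own
response as soon as it has it rather than waiting for the whole group.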