Hi Sam, have you tried putting the incoming (hashtag, tweet) tuples into a queue and having another thread pull them out and upload them to CouchDB?
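A minimal sketch of that queue idea, assuming a plain `java.util.concurrent.LinkedBlockingQueue`. The names `tweet-queue`, `enqueue!`, `start-uploader!` and `upload!` are hypothetical; `upload!` stands in for your existing fetch/process/store code, and nothing here is HBC's or Clutch's actual API:

```clojure
(import '(java.util.concurrent LinkedBlockingQueue TimeUnit))

;; Bounded, so a slow uploader shows up as rejected offers rather than
;; unbounded memory growth. The capacity is an arbitrary assumption.
(def tweet-queue (LinkedBlockingQueue. 10000))

(defn enqueue!
  "Call this from each stream callback. Returns false if the queue is full."
  [hashtag tweet]
  (.offer tweet-queue [hashtag tweet]))

(defn start-uploader!
  "Starts ONE thread that takes tuples off the queue in order and calls
  (upload! hashtag tweet) for each. Returns a zero-arg function that asks
  the thread to stop once the queue is drained, then waits for it."
  [upload!]
  (let [running (atom true)
        worker  (Thread.
                  (fn []
                    (while (or @running (not (.isEmpty tweet-queue)))
                      ;; poll with a timeout so the loop can notice a stop request
                      (when-let [[hashtag tweet]
                                 (.poll tweet-queue 100 TimeUnit/MILLISECONDS)]
                        (try
                          (upload! hashtag tweet)
                          (catch Exception e
                            (println "upload failed for" hashtag
                                     (.getMessage e))))))))]
    (.start worker)
    (fn stop! []
      (reset! running false)
      (.join worker))))
```

Since only one thread ever does the get/process/put cycle, two updates to the same hashtag document cannot interleave, so the rev you read should still be current when you write (assuming nothing else writes to that database).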
I'm unfamiliar with HBC, but I assume it has a callback-based API, so you should be able to have multiple callbacks/connections/streams feed the same queue and have a single thread do the upload (and maybe batch if necessary). I don't see refs being a particularly good fit for this problem, but I could be wrong.

2014-12-11 16:18 GMT+00:00 Sam Raker <sam.ra...@gmail.com>:

> I've got some code that's using Twitter's HoseBirdClient to pull tweets
> from the public stream, which I then preprocess and store with CouchDB.
> Right now, my HBC client is being forced to reconnect more than I'd like,
> which occasionally causes my app to hang, for reasons I'm not entirely
> clear on. Regardless, some preliminary research on HBC suggests that the
> reconnections are being caused by my code failing to keep up with the
> endpoint, which in turn suggests that my processing+uploading is taking too
> long. I tried wrapping the processing+uploading part in futures, which
> definitely sped things up, but caused 409 errors when uploading to
> CouchDB--briefly, Couch requires any update operation to include a
> git-style "rev" string, and if the rev you provide isn't the most recent
> one, it throws a 409 at you. I'm organizing things by hashtag, so tweets
> with multiple copies of the same hashtag, or series of tweets with the same
> hashtag are the culprit--future A gets the current doc from Couch,
> processes it, and uses the rev it got from the currently-existing doc,
> while future B does the same thing, but finishes first, so now future A has
> an outdated rev, and that causes the 409.
>
> The vague solution I've come up with involves using a map to store the rev
> values, with the last step of the processing/uploading function being to
> store the rev number Clutch helpfully returns to you after a successful
> update. From what I can tell, refs are the way to go, since each future is
> effectively a separate thread.
> My questions are as follows:
> 1) Would I have to store the map-of-refs in a ref?
> 2) Is this even feasible? Would the timing work out?
> 3) With the addition of all this dereferencing and `dosync`+`alter`-ing,
> would this actually end up speeding things up all that much?
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

--
László Török
--
Checkout http://www.lollyrewards.com/