Hm, I came across a similar invariant ("No code called by a go block should ever call the blocking variants of core.async functions") while wrapping an imperative API, and I thought it might be useful to enforce it with dynamic vars and binding. Has this, or any other approach, been considered for core.async? I could see a *fixed-thread-pool* var being bound to true while a go block runs, and >!! checking that it is false before blocking.
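To make the idea concrete, here is a minimal sketch of what I have in mind. None of this is part of core.async: guarded-go, guarded->!! and imperative-put! are hypothetical names, and *fixed-thread-pool* is only the var suggested above. The sketch assumes that go captures the thread's binding frame when the block is created, so the flag stays visible to whatever the body calls.

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical flag: bound to true while code runs on behalf of a go block.
(def ^:dynamic *fixed-thread-pool* false)

(defmacro guarded-go
  "Hypothetical wrapper around async/go. The binding is established before
  the go block is created so that the captured binding frame carries it."
  [& body]
  `(binding [*fixed-thread-pool* true]
     (async/go ~@body)))

(defn guarded->!!
  "Hypothetical wrapper around async/>!! that refuses to block when the
  flag says we are running inside a go block."
  [ch v]
  (when *fixed-thread-pool*
    (throw (IllegalStateException.
            ">!! called from code running inside a go block")))
  (async/>!! ch v))

(defn imperative-put!
  "Stand-in for the wrapped imperative API: an ordinary function that
  happens to do a blocking put somewhere down the call stack."
  [ch v]
  (guarded->!! ch v))

;; Usage: the first put runs on an ordinary thread; the second is attempted
;; from inside a guarded go block, where the guard is meant to reject it.
(def c (async/chan 1))
(guarded->!! c :from-repl)
(def result
  (async/<!! (guarded-go
               (try
                 (imperative-put! c :from-go)
                 (catch IllegalStateException _ :rejected)))))
```

One caveat: as far as I can tell, async/thread conveys bindings as well, so a real thread started from inside a guarded go block would also be rejected unless it rebinds the var to false.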
An analogy in existing clojure.core would be the STM commute's 'must be running in a transaction' check that uses a threadlocal.
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/LockingTransaction.java#L205

On Tue, Aug 29, 2017 at 1:30 PM Timothy Baldridge <tbaldri...@gmail.com> wrote:

> To add to what Alex said, look at this trace:
> https://gist.github.com/anonymous/65049ffdd37d43df8f23630928e8fed0#file-thread-dump-out-L1337-L1372
>
> Here we see a go block calling mapcat, and inside the inner map something
> is calling >!!. As Alex mentioned this can be a source of deadlocks. No
> code called by a go block should ever call the blocking variants of
> core.async functions (<!!, >!!, alts!!, etc.). So I'd start at the code
> redacted in those lines and go from there.
>
> On Tue, Aug 29, 2017 at 11:09 AM, Alex Miller <a...@puredanger.com> wrote:
>
>> go blocks are multiplexed over a thread pool which has (by default) 8
>> threads. You should never perform any kind of blocking activity inside a go
>> block, because if every go block in work happens to end up blocked, you
>> will prevent all go blocks from making any further progress. It sounds to
>> me like that's what has happened here. The go block threads are named
>> "async-dispatch-<n>" and it looks like there are 8 blocked ones in your
>> thread dump.
>>
>> It also looks like they are all blocking on a >!!, which is a blocking
>> call. So I would look for a go block that contains a >!! and convert that
>> to a >! or do something else to avoid blocking there.
>>
>>
>> On Tuesday, August 29, 2017 at 11:48:25 AM UTC-5, Aaron Iba wrote:
>>>
>>> My company has a production system that uses core.async extensively.
>>> We've been running it 24/7 for over a year with occasional restarts to
>>> update things and add features, and so far core.async has been working
>>> great.
>>>
>>> The other day, during a particularly high workload, the whole system got
>>> locked up.
>>> All the channels seemed blocked at once. I was able to connect
>>> with a REPL and poke around, and noticed strange behavior of core.async.
>>> Specifically, the following code, when evaluated in the REPL, blocked on
>>> the put (third expression):
>>>
>>> (def c (async/chan))
>>> (go-loop []
>>>   (when-some [x (<! c)]
>>>     (println x)
>>>     (recur)))
>>> (>!! c true)
>>>
>>> Whereas on any fresh system, the above expressions obviously succeed.
>>>
>>> Puts succeeded if they went onto the channel's buffer, but not when they
>>> should go through to a consumer. For example with the following
>>> expressions, evaluated in the REPL, the first put succeeded (presumably
>>> because it went on the buffer), but subsequent puts blocked:
>>>
>>> (def c (async/chan 1))
>>> (def m (async/mult c))
>>> (def out (async/chan (async/sliding-buffer 3)))
>>> (async/tap m out)
>>> (>!! c true) ;; succeeds
>>> (>!! c true) ;; blocks forever
>>>
>>> This leads me to wonder if core.async itself somehow got into a bad
>>> state. It's entirely possible I caused this by misusing the API somewhere
>>> in the codebase, but we use core.async so extensively that I wouldn't know
>>> where to begin looking.
>>>
>>> I'm wondering if someone more familiar with core.async internals has an
>>> idea about what could cause the above situation. Or if we notice it
>>> happening again, what could I do to gather more helpful information.
>>>
>>> I also have a redacted thread dump, in case it's useful:
>>>
>>> https://gist.github.com/anonymous/65049ffdd37d43df8f23630928e8fed0
>>>
>>> Any help would be much appreciated,
>>>
>>> Aaron
>>>
>>> P.S. core.async has been a godsend in terms of helping us structure and
>>> modularize our large system. Thank you to all those who contributed to
>>> this wonderful library!
>>>
>>> --
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clojure@googlegroups.com
>> Note that posts from new members are moderated - please be patient with
>> your first post.
>> To unsubscribe from this group, send email to
>> clojure+unsubscr...@googlegroups.com
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "Clojure" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to clojure+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
> --
> “One of the main causes of the fall of the Roman Empire was that–lacking
> zero–they had no way to indicate successful termination of their C
> programs.”
> (Robert Firth)