Thanks, I wasn't aware of those tools in the Python world. At first glance, things like pyutilib.workflow look interesting: I like that they're explicitly separating the definitions of functions from the wiring of inputs and outputs. I wonder whether, in the Clojure world (and the wider JVM world), people just tend to approach that problem domain in a different way, or whether most people doing that kind of processing have datasets large enough that Cascalog is a natural fit.
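To make that separation concrete, here is a rough sketch of the idea in Python. It is only an illustration under my own assumptions, not how pyutilib.workflow or any Clojure library actually works: `resolve`, `producers`, and the example functions `raw`, `foo`, and `total` are all made-up names. Each function declares what it needs through its parameter names, and a small resolver does the wiring.

```python
import inspect


def resolve(target, producers, cache=None, _stack=()):
    """Return the value named `target`, computing its inputs on demand.

    `producers` maps a value name to the function that produces it.
    `cache` holds values already known, including any supplied by the caller.
    """
    cache = {} if cache is None else cache
    if target in cache:
        return cache[target]
    if target in _stack:
        raise ValueError("cycle: " + " -> ".join(_stack + (target,)))
    if target not in producers:
        raise KeyError(f"nothing produces {target!r}")
    fn = producers[target]
    # Assumption of this sketch: a function's parameter names are its inputs.
    needs = inspect.signature(fn).parameters
    args = {name: resolve(name, producers, cache, _stack + (target,))
            for name in needs}
    cache[target] = fn(**args)
    return cache[target]


# Pure functions: no loading, persistence, or scheduling concerns inside them.
def raw():
    return [1, 2, 3, 4]


def foo(raw):
    return [x * 2 for x in raw]


def total(foo):
    return sum(foo)


# The wiring lives in one place, apart from the function definitions.
producers = {"raw": raw, "foo": foo, "total": total}

print(resolve("total", producers))  # 20
```

Passing a pre-filled cache, e.g. `resolve("total", producers, {"foo": [10]})`, skips the producer for "foo" and uses the supplied value instead, which is roughly the "call it for me only if I didn't provide it" behaviour described in the original question.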
Anyway, thanks for the helpful pointer -- I'll do some more searching down the "workflow management system" avenue as well.

On Monday, August 20, 2012 8:14:23 PM UTC-4, Leif wrote:
>
> +1. I know of a couple tools in python for this purpose that are called
> "workflow management systems." It would be good to know if there is a
> robust one in clojure.
>
> On Monday, August 20, 2012 12:18:54 AM UTC-4, matt hoffman wrote:
>>
>> I have a problem that I'm trying to figure out how to tackle. I'm new to
>> Clojure, but I'm interested, and perhaps this will be my excuse to give it
>> a try. Any of the following answers would help:
>> "What you're describing really sounds like X"
>> "You could think of that problem like this, instead"
>> "You may want to search for term 'Y'...it sounds related" (I imagine I'm
>> probably describing some well-established domain...I just don't know the
>> right terms to search for)
>>
>> So, the problem:
>> I have an app that is in production doing some fairly complex
>> calculations on large-ish (terabyte-range) amounts of data. The
>> calculations are expressed as chains of dependent tasks, where each task
>> can have a number of inputs and outputs. But the code has become hard to
>> maintain, full of accidental complexity and very difficult for newer
>> developers to understand. So, I'm trying to find the right abstractions to
>> put in place to keep things simple.
>> One of the sources of complexity is the intermingling of code involving
>> loading data, dividing up data to be executed in parallel, processing data,
>> persisting data, and handling the execution flow on an individual datum
>> (configuring pipelines of components, etc.). I'd like to keep the functions
>> pure and push the other concerns off to a framework -- and, ideally, not
>> have to write that framework.
>>
>> So I think my problem statement is this:
>> I'd like to be able to define functions that specify, somehow, what input
>> they want, and perhaps what output they produce. Then I'd like to push the
>> concern of how those inputs are calculated -- loaded from a db, calculated
>> from source data -- off on some other party.
>>
>> For example, if I define a function that requires "foo", and I call that
>> function without providing "foo", I'd like for _something_ to step in and
>> say, "Ok, you require foo. I have this function over here that produces
>> foo. Let me call that for you, then hand you the output." Perhaps instead
>> of a framework that transparently looks up and executes that function and
>> provides a Future for the result, perhaps I can explicitly build a
>> dependency graph up-front containing all the functions required to produce
>> the end result, and then execute them all in order... I think the effect is
>> the same.
>>
>> From a bit of searching I've done today, dataflow programming like
>> clojure.contrib.dataflow sounds like it might be close to what I'm looking
>> for, but I'd love to hear ideas. Am I describing something that already
>> exists? Would this actually be simpler than it seems using some clever
>> macros? Are there some keywords I should search for to get started? Or
>> perhaps I'm coming at this problem wrong, and I should think about it a
>> different way...