Thanks, I wasn't aware of those tools in the Python world.  At first 
glance, things like pyutilib.workflow look interesting: I like that 
they're explicitly separating the definitions of functions from the wiring 
of inputs and outputs. I wonder if, in the Clojure world (and wider JVM 
world), people just tend to approach that problem domain in a different 
way. Or if most people doing that kind of processing have large enough 
datasets that Cascalog is a natural fit. 
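
To make the "definitions separate from wiring" idea concrete, here is a 
rough sketch in plain Python. It is not pyutilib.workflow's API; every name 
in it (resolve, PRODUCERS, load_raw, and so on) is made up for illustration. 
It assumes each function's parameter names identify the inputs it needs, 
and that a separate table says which function produces each named value.

```python
import inspect


# Plain functions: each one only declares, via its parameter names,
# which inputs it needs. None of them knows where the inputs come from.
def load_raw():
    return [1, 2, 3, 4]


def total(raw):
    return sum(raw)


def mean(raw, total):
    return total / len(raw)


# The wiring lives apart from the definitions: value name -> producer.
# (Hypothetical table, purely for illustration.)
PRODUCERS = {"raw": load_raw, "total": total, "mean": mean}


def resolve(name, producers, cache=None, _stack=()):
    """Compute the value called `name`, first computing whatever it needs."""
    cache = {} if cache is None else cache
    if name in cache:
        return cache[name]  # each value is computed at most once
    if name in _stack:
        cycle = " -> ".join(_stack + (name,))
        raise ValueError("dependency cycle: " + cycle)
    if name not in producers:
        raise KeyError("nothing produces " + repr(name))
    fn = producers[name]
    # Read the required inputs off the function signature and resolve
    # each of them recursively before calling the function itself.
    args = {
        param: resolve(param, producers, cache, _stack + (name,))
        for param in inspect.signature(fn).parameters
    }
    cache[name] = fn(**args)
    return cache[name]


if __name__ == "__main__":
    print(resolve("mean", PRODUCERS))  # 2.5
```

Asking for "mean" pulls in "raw" and "total" on demand, and the cache keeps 
load_raw from running twice. A real system would also have to handle 
loading, partitioning, and persistence, which this sketch ignores entirely.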

Anyway, thanks for the helpful pointer -- I'll do some more searching down 
the "workflow management system" avenue as well. 


On Monday, August 20, 2012 8:14:23 PM UTC-4, Leif wrote:
>
> +1.  I know of a couple of tools in Python for this purpose that are called 
> "workflow management systems."   It would be good to know if there is a 
> robust one in Clojure.
>
> On Monday, August 20, 2012 12:18:54 AM UTC-4, matt hoffman wrote:
>>
>> I have a problem that I'm trying to figure out how to tackle. I'm new to 
>> Clojure, but I'm interested, and perhaps this will be my excuse to give it 
>> a try. Any of the following answers would help:
>> "What you're describing really sounds like X"
>> "You could think of that problem like this, instead"
>> "You may want to search for term 'Y'...it sounds related" (I imagine I'm 
>> probably describing some well-established domain...I just don't know the 
>> right terms to search for)
>>
>> So, the problem:
>> I have an app that is in production doing some fairly complex 
>> calculations on large-ish (terabyte-range) amounts of data.  The 
>> calculations are expressed as chains of dependent tasks, where each task 
>> can have a number of inputs and outputs. But the code has become hard to 
>> maintain, full of accidental complexity and very difficult for newer 
>> developers to understand. So, I'm trying to find the right abstractions to 
>> put in place to keep things simple. 
>> One of the sources of complexity is the intermingling of code involving 
>> loading data, dividing up data to be executed in parallel, processing data, 
>> persisting data, and handling the execution flow on an individual datum 
>> (configuring pipelines of components, etc.). I'd like to keep the functions 
>> pure and push the other concerns off to a framework -- and, ideally, not 
>> have to write that framework. 
>>
>> So I think my problem statement is this: 
>> I'd like to be able to define functions that specify, somehow, what input 
>> they want, and perhaps what output they produce. Then I'd like to push the 
>> concern of how those inputs are calculated -- loaded from a db, calculated 
>> from source data -- off on some other party. 
>>
>> For example, if I define a function that requires "foo", and I call that 
>> function without providing "foo", I'd like for _something_ to step in and 
>> say, "Ok, you require foo. I have this function over here that produces 
>> foo. Let me call that for you, then hand you the output."  Or, instead 
>> of a framework that transparently looks up and executes that function and 
>> provides a Future for the result, perhaps I can explicitly build a 
>> dependency graph up-front containing all the functions required to produce 
>> the end result, and then execute them all in order... I think the effect is 
>> the same. 
>>
>> From a bit of searching I've done today, dataflow programming like 
>> clojure.contrib.dataflow sounds like it might be close to what I'm looking 
>> for, but I'd love to hear ideas.   Am I describing something that already 
>> exists?  Would this actually be simpler than it seems using some clever 
>> macros? Are there some keywords I should search for to get started?  Or 
>> perhaps I'm coming at this problem wrong, and I should think about it a 
>> different way...
>>
>>

