Very good points, Timothy!

On Wed, Feb 3, 2016 at 7:45 AM, Timothy Baldridge <tbaldri...@gmail.com> wrote:
> I find this subject interesting as I was just discussing this with a
> co-worker recently. There are a few points I'd like to make:
>
> Firstly, data is often a form of a DSL (domain specific language).
> Libraries like Onyx often contain (as Lucas mentioned) a parser that walks
> the data and performs some actions based on that. That's also known as an
> evaluator. These libraries also often optimize the data by composing
> functions or emitting Clojure code, aka...a compiler.
>
> So when we say that something is "fully data driven", we have to realize
> that we are in essence writing a language. A language with a familiar
> syntax, but a language with different semantics. Documenting those
> semantics is critical.
>
> So why not just write in code to begin with? Well, often we wish to
> programmatically manipulate the inputs to these libraries before
> execution. So we want our language to be in a format that is easy to
> manipulate. "Why not Lisp code?" you may ask. Well, that's often a
> question about ease of processing.
>
> This code is easy to read:
>
> (if x :foo :bar)
>
> But this code is easier to process programmatically:
>
> {:op :if
>  :children [:test :then :else]
>  :test {:op :local :name 'x}
>  :then {:op :const :val :foo}
>  :else {:op :const :val :bar}}
>
>
> I never really want to hand-write the latter, but I don't want to write a
> program to analyze the former.
>
> So, all that is a round-about way of saying my preferred pattern is the
> following:
>
> 1) Write my library using functions and immutable data for all inputs,
> preferably also without positional arguments: each function takes one or
> more maps. Positional arguments are hard to emit programmatically.
>
> 2) Write helper functions to allow users to construct data for my system
> using the APIs from #1. These will basically generate data from positional
> arguments.
>
> 3) If needed, write macros and DSLs to parse/emit data from user-friendly
> data inputs into my data DSL format.
>
> 4) If needed, optimize performance by writing DSL "compilers" or emitting
> records/protocols.
>
>
> In short, configure your code with data, make your data palatable with
> code.
>
> Timothy
>
>
> On Wed, Feb 3, 2016 at 2:04 AM, <lucas.bradstr...@onyxplatform.org> wrote:
>
>> Hi Josh,
>>
>> I am one of the core Onyx developers, so I am biased in some respects.
>> I'm going to only speak to specific advantages that data > code gives
>> Onyx.
>>
>> An advantage with Onyx is the ability to build up your jobs dynamically
>> using data that is easily transformable by code, using all of the
>> functions that you use in Clojure, e.g. conj, assoc, update, get, etc.
>> Data in Clojure is far more easily manipulated by core functions than
>> XML, so you can do things like build up a job from a base system and add
>> arbitrary numbers and types of tasks, parameters, lifecycles, and options
>> to your job for different purposes.
>>
>> This ensures that Onyx is very flexible: complex jobs do not have to be
>> simply stored in lengthy static EDN files, they can be built by code from
>> job to job, depending on your needs. To give an example, imagine a case
>> where you wanted to load data from an arbitrary number of queue
>> datasources, and an input plugin only allows a single queue name to be
>> read from in a single task. You can easily transform your job's workflow
>> and catalog to expand out an arbitrary number of tasks to read from these
>> queues, annotating the input data with the queue name, all directed at
>> another task that you define. If you wish to sometimes add some debugging
>> metrics, you can do so by transforming the job, etc. If tasks within a job
>> are not the correct level of granularity, you could instead dynamically
>> build multiple jobs and submit them all to the cluster.
>>
>> Mike brings up a good point about performance concerns with data > code.
>> With respect to Onyx, the "dataness" of Onyx jobs is very often
>> compiled down to records and more performant representations. This ensures
>> that the dataness at the user level isn't lost, while ensuring performance
>> for the common case. In some ways you can think of the data in the Onyx job
>> as the AST for the Onyx job, which is validated, and then compiled for
>> performance. It would be quite easy to build code over this data which
>> ensured you never had to touch the data that defined the job, especially
>> since the core code functionality, a task's onyx/fn, is a plain Clojure
>> function, operating using whatever Clojure/Java objects you want. We don't
>> think this is generally a good idea, but you have the ability should you
>> need it.
>>
>> When these jobs are submitted to the cluster, this data is serialized and
>> stored in ZooKeeper, to be read back by the cluster for scheduling
>> purposes. This data is human readable when viewed in a dashboard, usable in
>> ClojureScript (even allowing jobs to be built and dispatched by web clients,
>> at which point you may need a data representation anyway), or
>> transformable, e.g. if you inspect a previous job's end state and data in
>> order to migrate between jobs.
>>
>> By defining an information model and documentation around the core data
>> representation, we can easily present specific documentation to users when
>> their jobs fail schema validation for any reason. See
>> https://github.com/onyx-platform/onyx/blob/0.8.x/src/onyx/information_model.cljc
>> for the model and documentation map that we use for error messages. We
>> have additionally leveraged this information model to build a
>> ClojureScript page that is a handy reference guide for users:
>> http://www.onyxplatform.org/docs/cheat-sheet/latest/#/trigger-entry
>>
>> Hopefully this answers some of your questions around why I like this
>> technique for Onyx, even if I didn't answer your overarching question.
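
To make Lucas's queue example concrete, here is a minimal sketch in plain Clojure of expanding a job's workflow and catalog with one input task per queue. It is not Onyx's API: only :workflow, :catalog and the onyx/ naming come from the message above, while queue-task, add-queue-inputs and the :queue/name key are hypothetical names, and the task maps are not checked against Onyx's real schema.

```clojure
;; A base job as plain data. The task maps are deliberately minimal and
;; are NOT validated against Onyx's information model.
(def base-job
  {:workflow [[:process :write-output]]
   :catalog  [{:onyx/name :process}
              {:onyx/name :write-output}]})

(defn queue-task
  "Hypothetical helper: builds one input-task entry for a queue name."
  [queue-name]
  {:onyx/name  (keyword (str "read-" queue-name))
   :queue/name queue-name})   ; assumed key, for illustration only

(defn add-queue-inputs
  "Returns a new job with one input task per queue, each feeding `target`."
  [job queue-names target]
  (let [tasks (mapv queue-task queue-names)
        edges (mapv (fn [task] [(:onyx/name task) target]) tasks)]
    (-> job
        (update :catalog into tasks)      ; append the new task entries
        (update :workflow into edges))))  ; wire each new task to the target

(def job (add-queue-inputs base-job ["orders" "payments"] :process))
```

Because the job is an ordinary map, the same approach (update, into, assoc) would cover adding debugging tasks or splitting one job into several.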
>>
>> Cheers,
>>
>> Lucas
>>
>>
>> On Tuesday, February 2, 2016 at 6:02:23 AM UTC+8, Josh Tilles wrote:
>>>
>>> As I’m watching Michael Drogalis’s Clojure/Conj 2015 presentation
>>> “Onyx: Distributed Computing for Clojure”
>>> <https://youtube.com/watch?v=YlfA8hFs2HY&t=734>, I'm distracted by a
>>> nagging worry that we, as a community, are somehow falling into the same
>>> trap as those advocating XML in the early 2000s. That said, it's a very
>>> *vague* unease, because I don’t know much about why the industry seems
>>> to have rejected XML as “bad”; by the time I started programming
>>> professionally there was already a consensus that XML sucked, and that
>>> libraries/frameworks that relied heavily on XML configuration files were
>>> to be regarded with suspicion and/or distaste.
>>>
>>> So, am I incorrect in seeing a similarity between the “data > code”
>>> mentality and the rise of XML? Or, assuming there is a legitimate
>>> parallel, is it perhaps unnecessary to be alarmed? Does the tendency to
>>> use edn instead of XML sidestep everything that went wrong in the 2000s?
>>> Or is it the case that the widespread backlash against XML threw a baby
>>> out with the bathwater, forgetting the advantages of data over code?
>>>
>>> Cheers,
>>> Josh
>>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clojure@googlegroups.com
>> Note that posts from new members are moderated - please be patient with
>> your first post.
>> To unsubscribe from this group, send email to
>> clojure+unsubscr...@googlegroups.com
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "Clojure" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to clojure+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> “One of the main causes of the fall of the Roman Empire was that–lacking
> zero–they had no way to indicate successful termination of their C
> programs.”
> (Robert Firth)
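
As a rough illustration of Timothy's pattern, here is a minimal sketch that pairs his map-based AST with helper constructors (his step 2) and a tiny evaluator. The node shape is copied from his message; the names local, const, if-node and evaluate are my own, and passing the environment as a map of symbols to values is an assumption made for the example.

```clojure
;; Helpers that build the data form from positional arguments,
;; so nobody has to hand-write the maps.
(defn local [sym] {:op :local :name sym})
(defn const [v]   {:op :const :val v})
(defn if-node [test then else]
  {:op       :if
   :children [:test :then :else]
   :test     test
   :then     then
   :else     else})

;; A tiny evaluator: dispatches on :op and walks the data.
;; `env` is an assumed map from symbols to values.
(defmulti evaluate (fn [node env] (:op node)))

(defmethod evaluate :const [node _env] (:val node))
(defmethod evaluate :local [node env]  (get env (:name node)))
(defmethod evaluate :if [node env]
  (if (evaluate (:test node) env)
    (evaluate (:then node) env)
    (evaluate (:else node) env)))

;; The data equivalent of (if x :foo :bar).
(def ast (if-node (local 'x) (const :foo) (const :bar)))

;; Since the program is a map, core functions can rewrite it:
;; here the two branches are swapped with a single assoc.
(def swapped (assoc ast :then (:else ast) :else (:then ast)))

(evaluate ast {'x true})
(evaluate swapped {'x true})
```

Doing the same branch swap on the list form (if x :foo :bar) would mean destructuring by position, which is the ease-of-processing point being made.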