On Tuesday, June 12, 2012 2:18:03 AM UTC-6, Christophe Grand wrote:
> Hi,
>
> To contrast our experiences of the language and the different approaches to deal with some problems:
>
> On Sun, Jun 10, 2012 at 4:47 AM, Kurt Harriger <kurtharri...@gmail.com> wrote:
>> Many will say that side-effecting functions are more difficult to test than pure functions... However, after writing about 4000 lines of Clojure code, I realized that things in practice are never quite as simple as they seem. As functions are composed, the data structures they work with grow larger and more complex, and this leads to maps containing maps containing lists containing maps, where a minor change downstream can ripple through the program. Tests become significantly more complex and fragile as the input and output structures grow in complexity.
>
> Do you test only the functions, or have you also introduced "lint" functions which check the shape of your data? To me, these are pretty useful: you can use them in tests, pre/postconds, middlewares to guard against untrusted sources, etc.

I do use preconditions where the test condition is simple, i.e. is this a string? does the map have a :field-type? However, I get a lot of my input data from HTTP requests as JSON, and these payloads often have similar structures but different semantics, so I often do not have preconditions where the type is not explicit. For example, a string could be a list id, a contact id, or an encoded JSON document. While it is possible to parse a string as JSON to verify its type, this seems very computationally expensive, so the type is usually inferred from the context.
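Christophe's "lint function" idea can be sketched as a plain predicate over the expected shape of the data, reusable in both tests and :pre conditions. This is only an illustration: `valid-field?`, `valid-contact?`, and `primary-email` are hypothetical names, and the contact shape follows the `{:fields [{:field-type :email :value ...}]}` model mentioned later in this post.

```clojure
;; Hypothetical "lint" predicates describing the shape of a contact map.
(defn valid-field? [f]
  (and (map? f)
       (keyword? (:field-type f))
       (string? (:value f))))

(defn valid-contact? [c]
  (and (map? c)
       (sequential? (:fields c))
       (every? valid-field? (:fields c))))

;; The same predicate guards a domain function via a precondition
;; and can be reused directly in tests or request middleware.
(defn primary-email [contact]
  {:pre [(valid-contact? contact)]}
  (->> (:fields contact)
       (filter #(= :email (:field-type %)))
       first
       :value))

(primary-email {:fields [{:field-type :email :value "u@example.com"}]})
;=> "u@example.com"
```

Because the predicate is an ordinary function, it stays out of the way on trusted code paths and can be applied only at the boundaries where untrusted data enters.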
I also felt that explicit type checking went against the spirit of duck typing. There were a couple of times I added type checks only to realize that the checking was too strict, e.g. nil was an acceptable value and now the code threw an assertion error. In many cases a function simply takes an argument and passes it to another function, so I don't really care what type the argument is as long as there is an implementation of the other function which supports that data type; maybe one doesn't exist yet, but in the future maybe it will, so should my code unnecessarily constrain the type based on an implementation detail?

I have never been truly sold on duck typing, however, as I often find myself debugging exceptions thrown deep down in the call stack: an error that could have been caught earlier, when the problem was obvious, was instead allowed to penetrate deep into the call stack where the problem is no longer obvious, often within a third-party library that never expected that type of input. In OO the argument is an interface, which says nothing about the structure of the object, only that it provides the desired behavior. Protocols are a step in this direction; however, if you extend a protocol to maps then (satisfies? TheProtocol {}) will return true for ALL maps, making satisfies? useless as a precondition.

For example, my first contact model had {... :fields [{:field-type :email :value "" ...}]}. I later realized that the endpoint providing the contact information returned the fields grouped, and I was filtering a lot on :field-type anyway, so a more effective data structure was {:emails [{:value "" ...}]}; :field-type was just an implementation detail. The email object still has a value and associated behavior, but the map no longer contains a :field-type. However, phone numbers have exactly the same {:value ""} structure, so the only way to determine whether something is an email or a phone number is from context or by parsing the string.
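The satisfies? pitfall described above is easy to reproduce. Here is a minimal sketch (the `Email` protocol is made up for illustration): extending a protocol to the generic map interface makes every map, including the empty one, satisfy it.

```clojure
;; A hypothetical protocol for things that have an email value.
(defprotocol Email
  (email-value [this]))

;; Extending it to the generic map interface, since plain maps are the model:
(extend-protocol Email
  clojure.lang.IPersistentMap
  (email-value [m] (:value m)))

;; Now EVERY map satisfies the protocol, so satisfies? is useless as a guard:
(satisfies? Email {:value "u@example.com"}) ;=> true
(satisfies? Email {})                       ;=> true, even the empty map
(satisfies? Email {:unrelated :map})        ;=> true
```

Because satisfies? is answered by the class hierarchy, not the shape of the data, it can only distinguish maps from non-maps here, which is exactly the problem with using it as a precondition on plain map data.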
>> This reminded me of another OO code smell, "Don't talk to strangers", and the Law of Demeter. Instead of sending and returning maps of lists of maps, I started returning maps of functions. This provided additional decoupling that enabled me to refactor a bit more easily. Additionally, maps of maps of lists of maps often need to be fully computed, whereas a map containing functions allows me to defer computation until it is actually required, which may in many cases be never.
>
> Basically you are returning a "lazy map": a map of keys to delayed values (why not use delays instead of fns as values?). While it is sometimes a necessity to do so, the implied trade-off must not be overlooked: the map can't be treated as a value anymore. If you call the pure function which generates such a lazy map twice with the same arguments, you get two lazy maps which are not equal! I'm not even speaking about being equal to their non-lazy counterparts (which makes them a bit harder to test).

This is an excellent point. Initially I started passing maps of functions into other functions as an optional parameter map for testing, e.g. (new-correction [.... & {:keys [get-current-date] :or {get-current-date get-current-date}}] ...). This solved one problem but created others: the function was now completely deterministic and easy to test, but with more than a function or two this quickly becomes verbose. So I created a factory function to create the map, and not long after realized these were "objects". At this point I started using records and protocols instead of maps, even an empty record, since really what I cared about was that the type had associated behavior. So while I used maps of functions in some cases, I later refactored away from maps to records. However, this has created an entirely new set of frustrations. If I implement the protocol within the (defrecord ...)
form, I find that changing the implementation requires restarting the JVM, to which I have probably lost a good hour debugging problems I had already fixed and that just needed a restart to pick up. If I use (extend Type Protocol {...}) I no longer need to restart the JVM, but the record type will no longer implement the associated Java interface. This is generally ok; however, I found that protocols cannot extend other protocols, so if this is desired you either need to extend the (:on-interface Protocol), extend all known implementations, or create an adapter that delegates to the other protocol. I also tried using multimethods instead of protocols, but I still found I needed to restart the JVM frequently. I think whenever I would re-eval the namespace containing the record type and the (defmethod ...) forms, a new class file would be emitted and the objects in my swank session needed to be recreated.

>> Although it is very idiomatic to use keywords to get from maps, I have started to think of this as a code smell and instead prefer to (def value :value) and use this var instead of the keyword, because it allows me to later replace the implementation or rename properties if it becomes necessary to refactor, and I want to minimize changes to existing code or make changes to the existing code in small incremental units rather than all at once.
>
> I think this is a premature optimization. If you need to get rid of keyword access later on, you can either do what you propose and modify all the call sites to remove the colon (but the important thing is that it won't change the shape of your code; it's a minor refactoring) OR, if you are really stuck and don't want to touch the codebase, don't forget that in Clojure (unless you are using interop) you are always one abstraction away: you can define your own associative type which knows how to respond to lookups for gets.
> Plus, doing so, you may even choose to leverage the optimized code path for keyword lookups (see IKeywordLookup.java).

I disagree. Ironically, if you told a Java developer that getters were a premature optimization and he should use fields instead, he would look at you funny. Getters are not an optimization; if anything they carry a minor performance penalty. However, using fields makes one very dependent on implementation details, and doing so is considered bad practice. I don't see how this is any different in Clojure. For example, my email data structure currently is just {:value "u...@domain.com"}, but for various reasons it might actually be beneficial for me to use {:user "user" :domain "domain.com"}. Using (def value :value) has practically no overhead, but would make this change trivial. However, if I had to find and replace :value everywhere, the change becomes significantly more difficult and error prone: many maps that aren't emails also contain :value, so even grepping for :value I would need to proceed carefully to ensure not only that I replaced them all, but that I did not accidentally replace a :value belonging to a different type. I have also found that if I mistype a keyword I get nil, but if I mistype the var name the compiler will complain that it cannot find the definition. This also allows me to use emacs to find all usages and to "lean on the compiler" during any refactoring.

Overriding IKeywordLookup will only get us so far. Perhaps the consumer uses the map as a sequence of key-value pairs, where each entry is really a MapEntry destructured with nth, or perhaps they don't destructure but use key and val instead. And what happens if the user tries to assoc a new value? Clojure will change the field within the record and return a new instance of that record which still implements the IKeywordLookup behavior, so the record will appear not to have been updated.
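The accessor-var technique argued for above can be sketched minimally as follows (`email-value` and the split `{:user ... :domain ...}` representation are illustrative). Since keywords are functions, a var bound to a keyword is a drop-in accessor, and it can later be rebound to a real function without touching any call site.

```clojure
;; Today's representation: {:value "user@domain.com"}.
;; The accessor is just a var bound to the keyword.
(def email-value :value)

(email-value {:value "user@domain.com"}) ;=> "user@domain.com"

;; If the representation later becomes {:user "user" :domain "domain.com"},
;; only this one definition changes (shown here under a second name so both
;; versions can coexist; in practice you would redefine the same var):
(def email-value-v2
  (fn [{:keys [user domain]}]
    (str user "@" domain)))

(email-value-v2 {:user "user" :domain "domain.com"}) ;=> "user@domain.com"
```

Every caller keeps invoking `(email-value m)` unchanged, and mistyping the var name is a compile-time error rather than a silent nil, which is exactly the "lean on the compiler" property described above.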
Even if I could override all this to make it work, I would much rather find and replace all usages of :value as necessary... or better still, just change a single occurrence of (def value :value).

There is a quote: "better to have 100 functions that work on one data structure than 10 functions on 10 data structures." I once repeated this quote myself; however, in retrospect I completely disagree with it. These functions are often leaky and expose implementation details, which makes changes unnecessarily difficult. While it may be possible to re-implement these functions to preserve the desired behavior, it is usually not ideal and often even more difficult to maintain. "I would rather provide 10 functions that are relevant to the problem domain than support 100 functions which are not."

There are still a lot of things I like about Clojure: immutable persistent data structures, equality semantics, meta programming, and embedded languages such as core.logic, cascalog, etc. Abstraction and data hiding are different things. In OO I too think private methods are a code smell (they should be moved to public methods within another class), but the Clojure community seems to believe encapsulation is only about data hiding and does not currently seem to value the age-old abstraction principle.

Kurt

> Christophe

--
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en