Re: Doseq, map-style

Kurt Harriger Sat, 16 Jun 2012 10:58:36 -0700


On Wednesday, June 13, 2012 2:52:15 AM UTC-6, Vinzent wrote:
>
>> I do use pre-conditions where the test condition is simple, ie is this a 
>> string? does the map have a :field-type? however I get a lot of my input 
>> data from http requests as json which have similar structures but different 
>> semantics, so I often do not have preconditions where type is not 
>> explicit. For example a string could be an list id or a contact id or an 
>> encoded json document. While it is possible to try to parse a string for 
>> json to verify its type this is seems very computationally expensive and 
>> therefore usually inferred from the context. 
>>
>
> Why do you care about computational expensiveness of pre and post 
> conditions? They'll be turned off in production anyway.
>


How does one turn off pre/post conditions in production?  I agree with Sean, 
however, and generally think that assertions in production are a bad thing.
 

>  
>
>> For example, my first contact model I had {... :fields [{:field-type 
>>  :email :value "" ...}]}  I later realized that the endpoint to get the 
>> contact information provided them grouped and I was filtering a lot based 
>> on field-type anyway so a more effective data structure was {:emails 
>> [{:value "" ...}]}, :field-type was just an implementation detail, the 
>> email object still has a value and associated behavior but the map no 
>> longer contains a :field-type.  However, phone numbers have exactly the 
>> same {:value ""} structure so the only way to determine if it is an email 
>> or a phone number is from context or by parsing the string.
>>
>
> I believe using records or adding :type metadata would be an ideal fit in 
> this case. 
>

I agree, an explicit type field makes dispatching easy.  However, this data 
structure is returned by (http/get ... :as json), so if I want to add type 
information I need to walk the tree and rewrite it.  Not necessarily a bad 
idea, but in some cases the only thing I need is the eTag, so the 
additional processing may be unnecessary.  One could easily make the data 
conversions lazy by doing something like (defrecord Contact [contact]) 
(defmethod emails Contact [c] (map map->Email (:emails (:contact c)))) to 
delay the computation until the values are actually requested.  However, 
note that emails is now a multimethod, not a value, and the consumer 
needs to use (emails contact) rather than (:emails contact)...  Thus, as I 
was saying previously, (def emails :emails) gives you the flexibility to 
delay computation if desired.
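To make the lazy-conversion idea concrete, here is a minimal sketch.  The 
names Email and Contact, the :etag key, and the shape of the raw map are all 
assumptions for illustration, not the real response format.  Note that the 
record wraps the raw map, so the lookup goes through (:contact c).

```clojure
;; Hypothetical names throughout: Email, Contact, :etag and the raw map shape
;; are assumptions made for this sketch.
(defrecord Email [value])
(defrecord Contact [contact])          ; wraps the raw map as parsed from JSON

(defmulti emails class)                ; dispatch on the record's class

(defmethod emails Contact [c]
  ;; map is lazy, so no Email records are built until a consumer walks the seq
  (map map->Email (:emails (:contact c))))

(def raw {:etag "abc123" :emails [{:value "a@example.com"}]})
(def contact (->Contact raw))

(:etag (:contact contact))   ; only the eTag is needed: no conversion happens
(first (emails contact))     ; the conversion happens here, on demand
```

The consumer has to call (emails contact) here, which is exactly why starting 
out with an accessor function instead of a bare keyword keeps this option open.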


>> I also tried using multimethods instead of protocols, but I still found I 
>> needed to restart the JVM frequently.  I think whenever I would re-eval the 
>> namespace with the record type and (defmethod ...) a new class file would 
>> be emitted and the objects in my swank session need to be recreated.   
>>
>
> defmulti has defonce semantic; you can use (def foo nil) (defmulti foo 
> ...) to workaround this. A topic about this problem was recently created in 
> dev mail list, so it'll probably be fixed soon. 
>
>> I disagree.  Ironically, if you told a java developer that getters were a 
>> premature optimization and he should use fields instead he would look at 
>> you funny.  Getters are not an optimization and if anything have a minor 
>> performance penalty.  However, using fields makes one very dependent on 
>> implementation details doing so is considered bad practice. I don't see how 
>> this is any different in clojure.  
>>
>
> In clojure, structure of your map is a part of the contract. It's not an 
> implementation detail - it's the same as getters\setters in java.
>

EXACTLY my point!  Clojure does not distinguish between properties and data 
representation, and these are NOT the same thing.  There are many different 
ways to represent data.  For example, a shape can be represented in many 
different ways: as an area in square inches or square miles, as a rectangle, 
circle, or polygon, or perhaps as complex geometry requiring calculus, and 
every one of them could be asked "what is your area in square feet?"  Area 
is a property of the object; the width, radius, number of sides, etc. are 
implementation details. 
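As a rough sketch of that distinction, one accessor can answer the area 
question for several representations.  The :shape tag, the key names, and the 
assumption that lengths are stored in feet are mine, purely for illustration.

```clojure
;; Hypothetical representations; the :shape tag and key names are assumptions.
(defmulti area-sq-ft :shape)

(defmethod area-sq-ft :rectangle [{:keys [width height]}]   ; sides in feet
  (* width height))

(defmethod area-sq-ft :circle [{:keys [radius]}]            ; radius in feet
  (* Math/PI radius radius))

(defmethod area-sq-ft :sq-inches [{:keys [value]}]          ; already an area
  (/ value 144))                                            ; 144 sq in per sq ft

(map area-sq-ft [{:shape :rectangle :width 2 :height 3}
                 {:shape :circle :radius 1}
                 {:shape :sq-inches :value 288}])
```

The caller only ever asks for the property; which keys happen to be in the map 
stays an implementation detail.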

You may then ask, so why don't you just pass in {:area } as square feet 
instead of the radius of the circle?  Because the value may not be used by 
the function.  If it's not used, then why is it part of the contract?  
Because it may be used conditionally.  For example, maybe the function 
needs to find the first shape that will fit within a region; once that limit 
is reached it no longer requires the area of any other shapes.  So if the 
shape requires complex calculus which has been written in another 
programming language, and thus requires an rpc call to a network service, 
computing the value up front for every shape seems wasteful and 
inefficient when it is only sometimes used.  This example is 
somewhat contrived, but it is not that different from what I am doing.
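A small sketch of the "first shape that fits" case, assuming circles only and 
using a counter in place of the expensive call.  The names area-calls, area 
and first-fit are made up for this example.

```clojure
(def area-calls (atom 0))              ; counts how often the costly step runs

(defn area [shape]
  ;; stand-in for the expensive computation (imagine an rpc call here)
  (swap! area-calls inc)
  (* Math/PI (:radius shape) (:radius shape)))

(defn first-fit [limit shapes]
  ;; some walks the collection one element at a time and stops at the first hit
  (some #(when (<= (area %) limit) %) shapes))

(first-fit 20 [{:radius 5} {:radius 2} {:radius 1}])
```

Here the area of the third shape is never computed, which would not be the 
case if every map had to arrive with its :area already filled in.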

For each contact record I need to perform data enrichment, normalization, 
cleansing, and feature extraction, and calculate similarity scores to 
determine if contacts are likely to be the same, and some things only need 
to be calculated when other conditions are met.  For example, name 
similarity can be very complex.  We actually have a service that uses US 
census data to determine how common or uncommon a person's name is within a 
region; however, if the contacts can be classified with sufficient precision 
based on other information, this computation is unnecessary. 

My point is that properties with getter functions allow you to defer 
computation; keywords do not.  Keywords are not like java getters, they are 
like java fields.  Instead of (:property themap), one should use (def 
property :property) (property themap).  This allows you to defer 
computation and change the data representation if necessary or desired.  
Accessing fields directly is bad practice in java, and IMHO it should be 
considered bad practice in clojure for EXACTLY the same reasons.  And with 
macros it is really easy to define many properties at once with only a line 
or two, so I don't see why there is so much resistance to using them.  
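Something along these lines is what I have in mind.  It is only a sketch: 
defproperties is a made-up macro name, and the contact keys are invented for 
the example.

```clojure
(defmacro defproperties
  "Defines one accessor var per symbol, each bound to the matching keyword."
  [& names]
  `(do ~@(for [n names]
           `(def ~n ~(keyword (name n))))))

(defproperties first-name last-name name-rarity)

(def contact {:first-name "Ada" :last-name "Lovelace" :name-rarity 0.97})
(name-rarity contact)   ; a plain keyword lookup for now

;; Later the representation changes: the value may be wrapped in a delay so it
;; is computed only when asked for. Callers still write (name-rarity contact).
(defn name-rarity [contact]
  ;; force derefs a delay and passes any other value through unchanged
  (force (:name-rarity contact)))
```

Code written against (name-rarity contact) keeps working after the change, 
while code written against (:name-rarity contact) would now get a delay back.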

 
>
>> For example, my email data structure currently has just {:value "
>> u...@domain.com"} however for various reasons it might actually be 
>> beneficial for me to use {:user "user" :domain "domain.com"}. 
>>
>
> Well, this is a contrived example, since you'd use a plain string instead 
> of a map with the single :value key :) Given that, user and domain would 
> probably be functions.
>

Actually this is only somewhat contrived.  It is not uncommon for a user to 
have the same nickname in his email nickn...@domain.com and in his twitter 
handle, and this is a useful similarity feature.  When this computation is 
performed for each *pair* of fields in each *pair* of contacts, it may 
need to be performed millions of times.
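As a sketch, an accessor can hide which of the two email representations is 
in use.  The function names user and same-nickname? and the example addresses 
are assumptions for illustration only.

```clojure
(require '[clojure.string :as str])

(defn user
  "Local part of an email, whichever representation the map uses."
  [email]
  (or (:user email)                                    ; pre-split form
      (some-> (:value email) (str/split #"@") first))) ; parse on demand

(defn same-nickname? [email twitter-handle]
  (= (some-> (user email) str/lower-case)
     (str/lower-case twitter-handle)))

(same-nickname? {:value "nick@example.com"} "Nick")
(same-nickname? {:user "nick" :domain "example.com"} "nick")
```

With the pre-split form the string is not parsed again on every comparison, 
which matters when the comparison runs millions of times.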

 

>
>> There are still a lot of things I like about clojure, immutable persistent 
>> data structures, equality semantics, meta programming and embedded 
>> languages such core logic, cascalog, etc.  Abstraction and data hiding are 
>> different things.  In OO I too think private methods are a code smell, 
>> (they should be moved to public methods within another class), but the 
>> clojure community seems to believe encapsulation is only about data hiding 
>> and does not currently seem to value the age old abstraction principle.
>>
>  
> I think it's rather in the process of finding it's own way of creating 
> abstractions.
>

Perhaps lisp programmers already did?  Isn't that how CLOS and OO were born?


 

