FWIW, we use edn (serialized with nippy [1]) in Hadoop, and it works very well
for us:

https://github.com/Netflix/PigPen 

In some places we use maps for expressiveness, and in others we use vectors
for better performance.
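
A rough sketch of that trade-off, under stated assumptions: the record and its field names are hypothetical, it prints plain edn text with pr-str rather than using nippy, and it is not taken from PigPen.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical record, written two ways.
(def as-map    {:user-id 42 :country "US" :clicks 7})
(def as-vector [42 "US" 7])   ; field order acts as an implicit schema

;; A map repeats its keys in every record; a vector carries only the values.
(def map-edn    (pr-str as-map))
(def vector-edn (pr-str as-vector))

(println (count map-edn) "chars as a map,"
         (count vector-edn) "chars as a vector")

;; Both forms round-trip through edn with no translation layer.
(assert (= as-map    (edn/read-string map-edn)))
(assert (= as-vector (edn/read-string vector-edn)))

;; Field access: by name from the map, by position from the vector.
(assert (= (:country as-map) (nth as-vector 1)))
```

The map form stays self-describing when fields are added or reordered; the vector form is smaller but every reader has to agree on the positions.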

Whatever I lose in raw performance I can trivially make up by throwing a few
more boxes at the problem, so it's a non-issue for us. The flexibility of edn
outweighs any performance gains from converting back and forth to another
format and having to worry about translation errors.

-Matt

[1] https://github.com/ptaoussanis/nippy


On Tuesday, August 4, 2015 at 7:05 PM, Ryan Schmitt wrote:

> Hi Clojure people,
> 
> I'm currently working on some problems in the big data space, and I'm more or 
> less starting from scratch with the Hadoop ecosystem. I was looking at ways 
> to work with data in Hadoop, and I realized that (because of how InputFormat 
> splitting works) this is a use case where it's actually pretty important to 
> use a data language with an external schema. This probably means ruling out 
> Edn (for performance and space efficiency reasons) and Fressian (managing the 
> Fressian caching domain seems like it could get complicated), which are my 
> default solutions for everything, so now I'm back to the drawing board. I'd 
> rather not use something braindead like JSON or CSV.
> 
> It seems like there are a few language-agnostic data languages that are 
> popular in this space, such as:
> 
> * Thrift
> * Protobuf
> * Avro
> 
> But since the Clojure community has very high standards for data languages, 
> as well as a number of different libraries that run code on Hadoop, I was 
> wondering if anyone could provide a recommendation for a fast, extensible, 
> and well-designed data language to use. (Recommendations of what to avoid are 
> also welcome.)
> -- 
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com 
> (mailto:clojure@googlegroups.com)
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com 
> (mailto:clojure+unsubscr...@googlegroups.com)
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> --- 
> You received this message because you are subscribed to the Google Groups 
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to clojure+unsubscr...@googlegroups.com 
> (mailto:clojure+unsubscr...@googlegroups.com).
> For more options, visit https://groups.google.com/d/optout.

