Hi Clojure people,

I'm currently working on some problems in the big data space, and I'm more or less starting from scratch with the Hadoop ecosystem. I was looking at ways to work with data in Hadoop, and I realized that (because of how InputFormat splitting works) this is a use case where it's actually pretty important to use a data language with an external schema. This probably means ruling out EDN (for performance and space efficiency reasons) and Fressian (managing the Fressian caching domain seems like it could get complicated), which are my default solutions for everything, so now I'm back to the drawing board. I'd rather not use something braindead like JSON or CSV.
It seems like there are a few language-agnostic data languages that are popular in this space, such as:

* Thrift
* Protobuf
* Avro

But since the Clojure community has very high standards for data languages, as well as a number of different libraries that run code on Hadoop, I was wondering if anyone could provide a recommendation for a fast, extensible, and well-designed data language to use. (Recommendations of what to avoid are also welcome.)

--
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en