My biased first reaction to Hadoop is: do you really need it? It has a
separate runtime and some overhead. It seems to me it's much easier to use
Kafka, probably Kafka Connect to get data in/out, and Streams/KSQL to
process the data. Because of Java interop and the nice generic Kafka API
it's really easy.
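For instance, producing to a topic from Clojure is a few lines of plain
interop. A minimal sketch, assuming the standard Kafka client on the
classpath (the broker address and topic name are placeholders):

(import '(org.apache.kafka.clients.producer KafkaProducer ProducerRecord))

(def props
  (doto (java.util.Properties.)
    (.put "bootstrap.servers" "localhost:9092")   ; placeholder broker
    (.put "key.serializer"
          "org.apache.kafka.common.serialization.StringSerializer")
    (.put "value.serializer"
          "org.apache.kafka.common.serialization.StringSerializer")))

;; KafkaProducer is Closeable, so with-open closes it for us
(with-open [producer (KafkaProducer. props)]
  (.send producer (ProducerRecord. "events" "key-1" "hello")))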
I'm glad someone else is thinking about this too!
#2 - For my case at the moment (Apache Beam), I believe we will always know
the types in advance, so using a Java class is workable, but of course a
(proxy++) would be ideal. Beam asks us to extend an abstract generic class,
so we must use (proxy).
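Beam specifics aside, this is the shape of the problem: reify can only
implement interfaces, so an abstract base class forces (proxy). A minimal
sketch with a JDK abstract class, java.io.InputStream, standing in for the
Beam base class:

;; reify cannot extend a class, abstract or otherwise; proxy can.
(def empty-stream
  (proxy [java.io.InputStream] []   ; superclass, then ctor args (none here)
    (read []                        ; override only the no-arg arity here;
      -1)))                         ; -1 signals end-of-stream

(.read empty-stream) ;=> -1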
5. If you need a concrete class definition that then implements a set of
type-specific interfaces, this would seem to fall into the category of
gen-class, assuming you could specify the interfaces with type
specifications. I can't immediately place a way to do this with anything
mentioned above.
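For reference, gen-class can already do the concrete-class-plus-interfaces
part today, as long as AOT compilation is acceptable; the primitive
signature comes from the interface itself. A sketch against
java.util.function.DoubleConsumer (the class and namespace names are
illustrative):

(ns example.doubles
  (:gen-class
    :name example.LoggingDoubleConsumer        ; concrete, named class
    :implements [java.util.function.DoubleConsumer]
    :main false))

;; gen-class routes interface methods to prefixed fns in this namespace;
;; DoubleConsumer declares void accept(double), so x arrives as a double
(defn -accept [this x]
  (println "consumed" x))

;; requires AOT: (compile 'example.doubles), after which
;; (example.LoggingDoubleConsumer.) is usable from Java or Clojure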
eglue,
1. I think this is a great idea, if it is really necessary. I would be in
favor of a reify++ alone, to simplify things. I find reify amazing at code
compression and heavily use it via type-specific macros to implement
interfaces that, for instance, support a particular primitive type (see the
sketch below).
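As a concrete example of that reify pattern, here it is against a
primitive-typed JDK interface (java.util.function.DoubleUnaryOperator is
just an illustration):

;; reify implements the interface directly; the double signature comes
;; from the interface, so there is no boxing on the call path
(def square
  (reify java.util.function.DoubleUnaryOperator
    (applyAsDouble [_ x]
      (* x x))))

(.applyAsDouble square 3.0) ;=> 9.0

A type-specific macro would simply template this reify form once per
primitive interface.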
I've found Clojure to be an excellent fit for big data processing for a few
reasons:
- the nature of big data is that it is often unstructured or
semi-structured, and Clojure's immutable, ad hoc, map-based orientation is
well suited to this
- much of the big data ecosystem is Java or JVM-based
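On the first point, a small illustration of why open maps suit
semi-structured records (the data below is made up):

;; records with optional/missing fields are just maps
(def events
  [{:user 1 :action "click" :meta {:page "/home"}}
   {:user 2 :action "view"}])            ; no :meta key, and that's fine

(->> events
     (filter #(= "click" (:action %)))
     (map #(get-in % [:meta :page] "unknown")))
;=> ("/home")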
Hi All,
I'm a newbie to Clojure and big data, and I'm starting with Hadoop.
I have installed Hortonworks HDP 3.1.
I have to design a big data layer that ingests large IoT datasets and
social media datasets, processes the data with MapReduce jobs, and produces
aggregations to store in HBase tables.
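On the HBase side, writing a row from Clojure is direct interop with the
standard client. A minimal sketch, assuming the hbase-client dependency
(the table name, row key, and column family below are placeholders):

(import '(org.apache.hadoop.hbase HBaseConfiguration TableName)
        '(org.apache.hadoop.hbase.client ConnectionFactory Put)
        '(org.apache.hadoop.hbase.util Bytes))

;; Connection and Table are both Closeable
(with-open [conn  (ConnectionFactory/createConnection
                    (HBaseConfiguration/create))
            table (.getTable conn (TableName/valueOf "iot_aggregates"))]
  (.put table
        (doto (Put. (Bytes/toBytes "device-42:2019-06-01"))
          (.addColumn (Bytes/toBytes "agg")      ; column family
                      (Bytes/toBytes "count")    ; qualifier
                      (Bytes/toBytes "123")))))  ; cell value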