Hi @atdixon and Thad, thanks for your help. Here are more details about my project.

My big data layer is inspired by the Lambda architecture. The pipeline includes the following layers, each with the tool chosen for it:

- *NiFi* for *data ingestion* and for publishing data/messages to a Kafka topic.
- *Kafka* as the *message broker*; with Kafka Connect it lets me store data in MongoDB (MongoDB sink, 1 day retention period) and in HDFS (HDFS sink, 1 year retention period).
- *Real-time processing* with *MongoDB*, using its built-in query engine, which provides extensive querying, filtering, and searching abilities.
- *Batch processing* of the data stored on HDFS, which aggregates the data and stores the result in an HBase table. *?* The question is: which tool do you suggest for processing the data stored on HDFS?
- *Serving layer* with *HBase/Phoenix* to store the batch view and allow access to it.
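To make the batch layer more concrete, this is roughly the kind of aggregation meant above, sketched in plain Python with no Hadoop dependency. It only illustrates the map / shuffle / reduce steps on a small in-memory sample. The record fields (`device_id`, `value`) and all function names are hypothetical, chosen just for illustration; a real job would read its input from HDFS and write the aggregates to an HBase table.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, List, Tuple

# Hypothetical IoT record shape: {"device_id": ..., "value": ...}.
# These field names are assumptions, not taken from the real datasets.
Record = Dict[str, Any]


def map_phase(records: Iterable[Record]) -> Iterator[Tuple[str, float]]:
    """Map step: emit one (key, value) pair per input record."""
    for rec in records:
        yield str(rec["device_id"]), float(rec["value"])


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Shuffle step: group all values that share the same key."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Reduce step: compute count, sum and average for each key."""
    return {
        key: {
            "count": len(values),
            "sum": sum(values),
            "avg": sum(values) / len(values),
        }
        for key, values in grouped.items()
    }


def run_batch(records: Iterable[Record]) -> Dict[str, Dict[str, float]]:
    """Chain the three steps; the result stands in for the batch view."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    sample = [
        {"device_id": "sensor-a", "value": 20.0},
        {"device_id": "sensor-b", "value": 5.0},
        {"device_id": "sensor-a", "value": 22.0},
    ]
    for key, stats in sorted(run_batch(sample).items()):
        print(key, stats)
```

On Hadoop the shuffle is done by the framework, so only the map and reduce functions would be written by hand, whichever tool is chosen.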
Now I am asking for your help in choosing *the most appropriate tool to execute batch jobs (MapReduce)* that will aggregate the data. Nathan Marz suggests Clojure/Cascalog. Do you know of other excellent Clojure/Hadoop work in the community for data processing? If you know of any particularly appropriate tools, I could also consider work/libraries outside the Clojure community.

Thanks

On Wednesday, July 3, 2019 at 14:56:09 UTC+2, Thad Guidry wrote:
>
> "The best code is never written"
>
> https://zeppelin.apache.org/
> https://nifi.apache.org/
>
> Thad
> https://www.linkedin.com/in/thadguidry/
>
>
> On Tue, Jul 2, 2019 at 11:07 AM orazio <orazio...@gmail.com> wrote:
>
>> Hi All,
>>
>> I'm a newbie to Clojure/Big Data, and I'm starting with Hadoop.
>> I have installed Hortonworks HDP 3.1.
>> I have to design a Big Data layer that ingests large IoT datasets and
>> social media datasets, processes the data with MapReduce jobs and produces
>> aggregations to store in HBase tables.
>>
>> For now, my focus is on the data processing issue. My question is:
>> Is Clojure a good choice for distributed data processing on Hadoop?
>> I found Cascalog, a fully-featured data processing and querying library
>> for Clojure or Java. But are there any active maintainers for this library?
>> Do you know of other excellent Clojure/Hadoop work in the community for
>> data processing?
>>
>> I would appreciate some help.
>>
>> Orazio
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clo...@googlegroups.com
>> Note that posts from new members are moderated - please be patient with
>> your first post.
>> To unsubscribe from this group, send email to
>> clo...@googlegroups.com
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "Clojure" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to clo...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/clojure/fbc26ffb-5f00-46a7-bf33-7a899f1ffead%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>

--
To view this discussion on the web visit https://groups.google.com/d/msgid/clojure/25a56148-9231-4a1b-8bba-8cb79776ba6b%40googlegroups.com.