Hi @atdixon and Thad, thanks for your help.

Here are more details about my project.
My big data layer is inspired by the Lambda architecture. The pipeline includes 
the following layers, with the tool chosen for each:
- *NiFi* for *data ingestion* and for publishing data/messages to a Kafka topic.
- *Kafka* as the *message broker*; with Kafka Connect, it lets me store 
data in MongoDB (MongoDB sink, 1-day retention period) and HDFS 
(HDFS sink, 1-year retention period).
- *Real-time processing* with *MongoDB*, using its built-in query engine, 
which provides extensive querying, filtering, and searching abilities.
- *Batch processing* of the data stored on HDFS, which aggregates the data 
and stores the result in an HBase table. *?* The question is: which tool do you 
suggest for processing the data stored on HDFS?
- *Serving layer* with *HBase/Phoenix* to store the batch view and allow 
access to it.
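
To make the batch step concrete, here is a rough sketch of the kind of map/reduce-style aggregation meant above, written in plain Python rather than against Hadoop. Everything in it is an assumption for illustration: the record shape (a device id plus a numeric reading), the choice of a per-key mean, and the function names are hypothetical and do not come from the actual datasets or from any Hadoop or Cascalog API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record shape: (device_id, reading). Not taken from the real data.
Record = Tuple[str, float]
Partial = Tuple[float, int]  # (running sum, running count)


def map_phase(records: Iterable[Record]) -> Iterator[Tuple[str, Partial]]:
    """Emit one (key, (sum, count)) pair per input record."""
    for device_id, reading in records:
        yield device_id, (reading, 1)


def reduce_phase(pairs: Iterable[Tuple[str, Partial]]) -> Dict[str, float]:
    """Merge the partial (sum, count) values per key and return the mean per key."""
    totals = defaultdict(lambda: [0.0, 0])
    for key, (partial_sum, partial_count) in pairs:
        totals[key][0] += partial_sum
        totals[key][1] += partial_count
    return {key: s / n for key, (s, n) in totals.items()}


def aggregate(records: Iterable[Record]) -> Dict[str, float]:
    """Run both phases in-process; a real job would distribute them over HDFS splits."""
    return reduce_phase(map_phase(records))
```

Carrying (sum, count) pairs instead of means keeps the reduce step associative, so partial results from different splits could be merged in any order before the final division.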

Now I'm asking for your help in choosing *the most appropriate tool to execute 
batch jobs (MapReduce)* that will aggregate the data.
Nathan Marz suggests Clojure/Cascalog. Do you know of other excellent 
Clojure/Hadoop work in the community for data processing?
If you know of particularly appropriate tools, I could also consider 
other work/libraries outside the Clojure community.

Thanks



On Wednesday, July 3, 2019 at 14:56:09 UTC+2, Thad Guidry wrote:
>
> "The best code is never written"
>
> https://zeppelin.apache.org/ 
> https://nifi.apache.org/  
>  
> Thad
> https://www.linkedin.com/in/thadguidry/
>
>
> On Tue, Jul 2, 2019 at 11:07 AM orazio <orazio...@gmail.com> 
> wrote:
>
>> Hi All,
>>
>> I'm a newbie to Clojure/Big Data, and I'm starting with Hadoop.
>> I have installed Hortonworks HDP 3.1.
>> I have to design a Big Data layer that ingests large IoT datasets and 
>> social media datasets, processes the data with MapReduce jobs, and produces 
>> aggregations to store in HBase tables.
>>
>> For now, my focus is on the data processing issue. My question is: 
>> Is Clojure a good choice for distributed data processing on Hadoop?
>> I found Cascalog, a fully featured data processing and querying library 
>> for Clojure or Java. But are there any active maintainers for this 
>> library?
>> Do you know of other excellent Clojure/Hadoop work in the community for 
>> data processing? 
>>
>> I would appreciate some help.
>>
>> Orazio
>>
>> -- 
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clo...@googlegroups.com
>> Note that posts from new members are moderated - please be patient with 
>> your first post.
>> To unsubscribe from this group, send email to
>> clo...@googlegroups.com
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "Clojure" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to clo...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/clojure/fbc26ffb-5f00-46a7-bf33-7a899f1ffead%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/clojure/fbc26ffb-5f00-46a7-bf33-7a899f1ffead%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>

