Re: [Python] [OT] Cheap MapReduce in Go

Riccardo Magliocchetti Mon, 13 Jul 2015 11:36:14 -0700

On 13/07/2015 20:20, Carlo Miron wrote:

<http://marcio.io/2015/07/cheap-mapreduce-in-go/>


tl;dr

Sometimes you don’t need overly complex infrastructures or systems to do a job
well. In this case, we were running these exact same aggregations over close to
20 EMR instances that would take a few minutes to execute the entire MapReduce
job over hundreds of Gigabytes of data each day.

When we decided to take a look at this problem again, we rewrote this task using
Go, and we now simply run this on a single 8-core machine and the whole daily
execution takes about 10 minutes. We cut a lot of the costs associated with
maintaining and running these EMR systems and we just schedule this Go app to
run once a day over our daily dataset.

You can find the entire code here:
https://gist.github.com/mcastilho/e051898d129b44e2f502

Some time ago something similar came out, where a touching pipeline was used:

http://aadrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html

--
Riccardo Magliocchetti
@rmistaken

http://menodizero.it
_______________________________________________
Python mailing list
[email protected]
http://lists.python.it/mailman/listinfo/python
