I like using Yelp's mrjob module (https://github.com/Yelp/mrjob) to run Python on Hadoop.
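
For a rough idea of the shape such a job takes, here is a minimal word-count sketch in plain Python. It is only an illustration: the word-count task, the sample input, and the `run_local` helper are my own assumptions, and `run_local` is a hypothetical in-memory stand-in for the grouping step that Hadoop does for you.

```python
from collections import defaultdict


def mapper(line):
    """Emit one (word, 1) key-value pair per word in a line of input."""
    for word in line.split():
        yield word.lower(), 1


def reducer(key, values):
    """For a single key and all of its values, compute the total count."""
    yield key, sum(values)


def run_local(lines):
    """Hypothetical local driver: group mapper output by key, then reduce.

    On a real cluster the framework does this grouping (the shuffle).
    """
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)

    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


if __name__ == "__main__":
    print(run_local(["the quick brown fox", "The lazy dog"]))
```

With mrjob, the same two functions become `mapper(self, _, line)` and `reducer(self, key, values)` methods on a subclass of `mrjob.job.MRJob`, and the local driver is replaced by calling `.run()` on that class.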

On Thu, Jun 9, 2016 at 2:56 AM Ho Yeung Lee <davidbenny2...@gmail.com> wrote:
> [... a bunch of code ...]

If you want to describe a map-reduce problem, start with the data. What does a record of your input data look like? Then think about your mapper. What key-value pairs will you extract from each line of data? Then think about your reducer. For a single key and its associated values, what will you calculate?

--
https://mail.python.org/mailman/listinfo/python-list