Phillip,

We've had great success writing simple, project-specific algorithms to
split content into chunks suited to ETL-style, Python-based processing
in a hosted cloud environment such as Amazon EC2 or the recently
launched Rackspace Cloud Servers. Because we purchase our cloud hosting
time in 1-hour blocks, we divide our data into much larger chunks than
a traditional MapReduce technique might use. For many of our projects,
transferring data to and from the cloud accounts for most of the
wall-clock time.
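
As a rough illustration of the idea (a sketch only, not the code we
actually run), a greedy splitter can size each chunk so that its
estimated upload, processing, and download time fits in one billing
block. The function name, the two rate parameters, and the assumption
that output is about the same size as input are all hypothetical; you
would measure real rates for your own project.

```python
def split_into_chunks(item_sizes, process_rate, transfer_rate,
                      block_seconds=3600.0):
    """Greedily group items so each chunk's estimated time fits one block.

    item_sizes    -- size of each input item in bytes
    process_rate  -- assumed processing throughput, bytes per second
    transfer_rate -- assumed network throughput, bytes per second
    block_seconds -- length of one billing block (default: 1 hour)
    """
    # Estimated seconds per byte: upload + download + processing.
    # Assumes the output is roughly the same size as the input.
    cost_per_byte = 2.0 / transfer_rate + 1.0 / process_rate

    chunks = []
    current = []
    current_time = 0.0
    for size in item_sizes:
        item_time = size * cost_per_byte
        # Start a new chunk when this item would overflow the block.
        # An item too big for any block still gets a chunk of its own.
        if current and current_time + item_time > block_seconds:
            chunks.append(current)
            current = []
            current_time = 0.0
        current.append(size)
        current_time += item_time
    if current:
        chunks.append(current)
    return chunks
```

Calling split_into_chunks(sizes, process_rate, transfer_rate) returns a
list of chunks, each a list of item sizes, in the original order. Because
the transfer term appears twice in the per-byte cost, a slow link shrinks
the amount of data that fits in a block more than slow processing does.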

Malcolm
-- 
http://mail.python.org/mailman/listinfo/python-list
