Re: Experiences with Map&Reduce Stress Tests

2011-05-03 Thread Jeremy Hanna
Writing to Cassandra from map/reduce jobs over HDFS shouldn't be a problem. We're doing it in our cluster and I know of others doing the same thing. You might just make sure the number of reducers (or mappers) writing to Cassandra doesn't overwhelm it. There's no data locality for writes, thoug
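A rough way to reason about "the number of reducers (or mappers) writing to Cassandra" is to estimate how many tasks can run at once and cap the requested count. The sketch below is only an illustration: it assumes the classic TaskTracker slot model, where concurrency is bounded by trackers times slots per tracker, and the function names and the max_writers limit are hypothetical values you would pick from your own testing, not anything prescribed in this thread.

```python
def concurrent_writers(requested_tasks, task_trackers, slots_per_tracker):
    """Upper bound on tasks that can write at the same moment.

    Assumes the slot-based TaskTracker model: at most
    task_trackers * slots_per_tracker tasks of one kind run at once.
    """
    if min(requested_tasks, task_trackers, slots_per_tracker) < 0:
        raise ValueError("all arguments must be non-negative")
    return min(requested_tasks, task_trackers * slots_per_tracker)


def cap_tasks(requested_tasks, max_writers):
    """Clamp the requested task count to a hypothetical writer limit (at least 1)."""
    return max(1, min(requested_tasks, max_writers))


if __name__ == "__main__":
    # Example figures only: 4 trackers with 2 reduce slots each, 32 reducers requested.
    print(concurrent_writers(32, 4, 2))
    # Cap the job at a limit the cluster is assumed to tolerate.
    print(cap_tasks(32, 6))
```

The capped value is what you would then pass as the job's reducer count; the right limit depends on the cluster and has to be found by measurement.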

Re: Experiences with Map&Reduce Stress Tests

2011-05-03 Thread Subscriber
Hi Jeremy, yes, the setup on the data-nodes is:
- Hadoop DataNode
- Hadoop TaskTracker
- CassandraDaemon
However - the map-input is not read from Cassandra. I am running a writing stress test - no reads (well from time to time I check the produced items using cassandra

Re: Experiences with Map&Reduce Stress Tests

2011-05-02 Thread Jeremy Hanna
Udo,
One thing to get out of the way - you're running task trackers on all of your Cassandra nodes, right? That is the first and foremost way to get good performance. Otherwise you don't have data locality, which is really the point of map/reduce, co-locating your data and your processes oper

Re: Experiences with Map&Reduce Stress Tests

2011-05-02 Thread Subscriber
Hi Jeremy, thanks for the link. I doubled the rpc_timeout (20 seconds) and reduced the range-batch-size to 2048, but I still get timeouts...
Udo
On 29.04.2011 at 18:53, Jeremy Hanna wrote:
> It sounds like there might be some tuning you can do to your jobs - take a look at the wiki's Hado

Re: Experiences with Map&Reduce Stress Tests

2011-04-29 Thread Jeremy Hanna
It sounds like there might be some tuning you can do to your jobs - take a look at the wiki's HadoopSupport page, specifically the Troubleshooting section: http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
On Apr 29, 2011, at 11:45 AM, Subscriber wrote:
> Hi all,
>
> We want to sh