You may be running into this: https://issues.apache.org/jira/browse/CASSANDRA-3942. I'm not sure whether it really affects the execution of the job itself, though.
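
As a rough illustration of how a task could show thousands of percent without being broken, here is a minimal sketch. It assumes progress is reported as rows read divided by an estimated row count for the split; that is my assumption, not something I have checked against the Cassandra record reader. The function names and the factor of 90 are hypothetical; 16384 is the cassandra.input.split.size quoted below.

```python
def reported_progress(rows_read: int, estimated_rows: int) -> float:
    """Progress as a plain ratio of rows read to the estimated split size."""
    if estimated_rows <= 0:
        return 0.0
    return rows_read / estimated_rows


def clamped_progress(rows_read: int, estimated_rows: int) -> float:
    """Same ratio, capped at 1.0 so a task never shows more than 100%."""
    return min(1.0, reported_progress(rows_read, estimated_rows))


if __name__ == "__main__":
    estimated_rows = 16384        # cassandra.input.split.size from the thread
    rows_read = 90 * 16384        # hypothetical: split holds 90x the estimate
    print(f"unclamped: {reported_progress(rows_read, estimated_rows):.2%}")
    print(f"clamped:   {clamped_progress(rows_read, estimated_rows):.2%}")
```

If the estimate is far too low, the unclamped ratio goes well past 100% even though each row is read only once. This does not rule out the tasks really looping over their key range; it only shows that the percentage alone cannot tell the two cases apart.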

On Mar 6, 2012, at 2:32 AM, Patrik Modesto wrote:

> Hi,
>
> I was recently trying a Hadoop job + cassandra-all 0.8.10 again, and the
> timeouts I get are not because Cassandra can't handle the
> requests. I've noticed there are several tasks that show progress of
> several thousand percent. It seems like they are looping over their range of
> keys. I've run the job with debug enabled and the ranges look OK, see
> http://pastebin.com/stVsFzLM
>
> Another difference between cassandra-all 0.8.7 and 0.8.10 is the
> number of mappers the job creates:
>   0.8.7:  4680
>   0.8.10: 595
>
> Task                               Complete
> task_201202281457_2027_m_000041    9076.81%
> task_201202281457_2027_m_000073    9639.04%
> task_201202281457_2027_m_000105    10538.60%
> task_201202281457_2027_m_000108    9364.17%
>
> None of this happens with cassandra-all 0.8.7.
>
> Regards,
> P.
>
> On Tue, Feb 28, 2012 at 12:29, Patrik Modesto <patrik.mode...@gmail.com>
> wrote:
>> I'll alter these settings and will let you know.
>>
>> Regards,
>> P.
>>
>> On Tue, Feb 28, 2012 at 09:23, aaron morton <aa...@thelastpickle.com> wrote:
>>> Have you tried lowering the batch size and increasing the timeout? Even
>>> just to get it to work.
>>>
>>> If you get a TimedOutException, it means CL number of servers did not respond
>>> in time.
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 28/02/2012, at 8:18 PM, Patrik Modesto wrote:
>>>
>>> Hi aaron,
>>>
>>> these are our current settings:
>>>
>>> <property>
>>>   <name>cassandra.range.batch.size</name>
>>>   <value>1024</value>
>>> </property>
>>>
>>> <property>
>>>   <name>cassandra.input.split.size</name>
>>>   <value>16384</value>
>>> </property>
>>>
>>> rpc_timeout_in_ms: 30000
>>>
>>> Regards,
>>> P.
>>>
>>> On Mon, Feb 27, 2012 at 21:54, aaron morton <aa...@thelastpickle.com> wrote:
>>>
>>> What settings do you have for cassandra.range.batch.size
>>> and rpc_timeout_in_ms? Have you tried reducing the first and/or increasing
>>> the second?
>>>
>>> Cheers
>>>
>>> On 27/02/2012, at 8:02 PM, Patrik Modesto wrote:
>>>
>>> On Sun, Feb 26, 2012 at 04:25, Edward Capriolo <edlinuxg...@gmail.com>
>>> wrote:
>>>
>>> Did you see the notes here?
>>>
>>> I'm not sure what you mean by the notes.
>>>
>>> I'm using the mapred.* settings suggested there:
>>>
>>> <property>
>>>   <name>mapred.max.tracker.failures</name>
>>>   <value>20</value>
>>> </property>
>>> <property>
>>>   <name>mapred.map.max.attempts</name>
>>>   <value>20</value>
>>> </property>
>>> <property>
>>>   <name>mapred.reduce.max.attempts</name>
>>>   <value>20</value>
>>> </property>
>>>
>>> But I still see the timeouts that I didn't have with cassandra-all 0.8.7.
>>>
>>> P.
>>>
>>> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting