Hi Jeremy, thanks for the link. I doubled the rpc_timeout (to 20 seconds) and reduced the range-batch-size to 2048, but I still get timeouts... For reference, a sketch of the two settings I changed follows right below, and a sketch of our write path is at the very bottom, after the quoted mail.
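This is roughly what the change looks like (a minimal sketch only: the job class name is made up; rpc_timeout_in_ms in cassandra.yaml and the cassandra.range.batch.size job property are the actual knobs):

// cassandra.yaml on each node (doubled from the 10 second default):
//     rpc_timeout_in_ms: 20000
//
// Hadoop job side:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class StressJobTuning {
    public static Job buildJob() throws Exception {
        Configuration conf = new Configuration();
        // property read by Cassandra's Hadoop support; limits how many rows
        // the ColumnFamilyRecordReader asks for per Thrift call
        conf.setInt("cassandra.range.batch.size", 2048);
        // equivalent, if you prefer the helper:
        // org.apache.cassandra.hadoop.ConfigHelper.setRangeBatchSize(conf, 2048);
        return new Job(conf, "cassandra-stress-writer");
    }
}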
Udo

On 29.04.2011, at 18:53, Jeremy Hanna wrote:

> It sounds like there might be some tuning you can do to your jobs - take a
> look at the wiki's HadoopSupport page, specifically the Troubleshooting
> section:
> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
>
> On Apr 29, 2011, at 11:45 AM, Subscriber wrote:
>
>> Hi all,
>>
>> We want to share the experiences we gathered during our evaluation of
>> Cassandra together with Hadoop Map/Reduce.
>> Our question was whether Cassandra is suitable for massive distributed data
>> writes using Hadoop's Map/Reduce feature.
>>
>> Our setup is described in the attached file 'cassandra_stress_setup.txt'.
>>
>> <cassandra_stress_setup.txt>
>>
>> The stress test uses 800 map tasks to generate data and store it in
>> Cassandra.
>> Each map task writes 500,000 items (i.e. rows), resulting in 400,000,000
>> items in total.
>> There are at most 8 map tasks in parallel on each node. An item contains
>> (besides the key) two long and two double values,
>> so an item is a few hundred bytes in size. This leads to a total data size
>> of approximately 120 GB.
>>
>> The map tasks use the Hector API. Hector is fed with all three data nodes.
>> The data is written in chunks of 1000 items.
>> The ConsistencyLevel is set to ONE.
>>
>> We ran the stress test in several runs with different configuration
>> settings (for example, I started with Cassandra's default configuration and
>> used Pelops for another test).
>>
>> Our observations:
>>
>> 1) Cassandra is really fast - we are really impressed by the huge write
>> throughput. A map task writing 500,000 items (approx. 200 MB) usually
>> finishes in under 5 minutes.
>> 2) However - unfortunately all tests failed in the end.
>>
>> In the beginning there are no problems. The first 100 (in some tests even
>> the first 300!) map tasks look fine. But then the trouble starts.
>>
>> Hadoop's sample output after ~15 minutes:
>>
>> Kind     % Complete   Num Tasks   Pending   Running   Complete   Killed   Failed/Killed Task Attempts
>> map      14.99%       800         680       24        96         0        0 / 0
>> reduce   3.99%        1           0         1         0          0        0 / 0
>>
>> Some stats:
>>> top
>> PID    USER   PR  NI  VIRT   RES   SHR   S  %CPU  %MEM  TIME+     COMMAND
>> 31159  xxxx   20   0  2569m  2.2g  9.8m  S   450  18.6  61:44.73  java
>>
>>> vmstat 1 5
>> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>>  r  b   swpd   free   buff  cache    si   so    bi    bo     in     cs  us sy id wa
>>  2  1  36832 353688 242820 6837520    0    0    15    73      3      2   3  0 96  0
>> 11  1  36832 350992 242856 6852136    0    0  1024 20900   4508  11738  19  1 74  6
>>  8  0  36832 339728 242876 6859828    0    0     0  1068  45809 107008  69 10 20  0
>>  1  0  36832 330212 242884 6868520    0    0     0    80  42112  92930  71  8 21  0
>>  2  0  36832 311888 242908 6887708    0    0  1024     0  20277  46669  46  7 47  0
>>
>>> cassandra/bin/nodetool -h tirdata1 -p 28080 ring
>> Address         Status  State   Load     Owns    Token
>>                                                  113427455640312821154458202477256070484
>> 192.168.11.198  Up      Normal  6.72 GB  33.33%  0
>> 192.168.11.199  Up      Normal  6.72 GB  33.33%  56713727820156410577229101238628035242
>> 192.168.11.202  Up      Normal  6.68 GB  33.33%  113427455640312821154458202477256070484
>>
>> Hadoop's sample output after ~20 minutes:
>>
>> Kind     % Complete   Num Tasks   Pending   Running   Complete   Killed   Failed/Killed Task Attempts
>> map      15.49%       800         673       24        103        0        6 / 0
>> reduce   4.16%        1           0         1         0          0        0 / 0
>>
>> What went wrong? It's always the same: the clients cannot reach the nodes
>> anymore.
>>
>> java.lang.RuntimeException: work failed
>>     at com.zfabrik.hadoop.impl.HadoopProcessRunner.work(HadoopProcessRunner.java:109)
>>     at com.zfabrik.hadoop.impl.DelegatingMapper.run(DelegatingMapper.java:40)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:625)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> Caused by: java.lang.reflect.InvocationTargetException
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at com.zfabrik.hadoop.impl.HadoopProcessRunner.work(HadoopProcessRunner.java:107)
>>     ... 4 more
>> Caused by: java.lang.RuntimeException: me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level.
>>     at com.zfabrik.hadoop.impl.DelegatingMapper$1.run(DelegatingMapper.java:47)
>>     at com.zfabrik.work.WorkUnit.work(WorkUnit.java:342)
>>     at com.zfabrik.impl.launch.ProcessRunnerImpl.work(ProcessRunnerImpl.java:189)
>>     ... 9 more
>> Caused by: me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level.
>>     at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
>>     at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
>>     at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:88)
>>     at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
>>     at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:221)
>>     at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
>>     at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
>>     at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
>>     at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:203)
>>     at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:200)
>>     at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
>>     at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
>>     at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:200)
>>     at sample.cassandra.itemrepo.mapreduce.HectorBasedMassItemGenMapper._flush(HectorBasedMassItemGenMapper.java:122)
>>     at sample.cassandra.itemrepo.mapreduce.HectorBasedMassItemGenMapper.map(HectorBasedMassItemGenMapper.java:103)
>>     at sample.cassandra.itemrepo.mapreduce.HectorBasedMassItemGenMapper.map(HectorBasedMassItemGenMapper.java:1)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>     at com.zfabrik.hadoop.impl.DelegatingMapper$1.run(DelegatingMapper.java:45)
>>     ... 11 more
>> Caused by: UnavailableException()
>>     at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16485)
>>     at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:916)
>>     at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
>>     at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:93)
>>     ... 27 more
>>
>> -------
>> Task attempt_201104291345_0001_m_000028_0 failed to report status for 602
>> seconds. Killing!
>>
>>
>> I also observed that when connecting with cassandra-cli during the stress
>> test, it was not possible to list the items written so far:
>>
>> [default@unknown] use ItemRepo;
>> Authenticated to keyspace: ItemRepo
>> [default@ItemRepo] list Items;
>> Using default limit of 100
>> Internal error processing get_range_slices
>>
>>
>> It seems to me that from the point I performed the read operation in the
>> CLI tool, the node somehow becomes confused.
>> Looking at jconsole shows that up to this point the heap is fine: it grows
>> and GC clears it again.
>> But from this point on, GC doesn't really help anymore (see attached
>> screenshot).
>>
>> <Bildschirmfoto 2011-04-29 um 18.30.14.png>
>>
>> This also has an impact on the other nodes, as one can see in the second
>> screenshot: the CPU usage goes down, as does the heap memory usage.
>>
>> <Bildschirmfoto 2011-04-29 um 18.36.34.png>
>>
>> I'll run another stress test over the weekend with
>>
>> * MAX_HEAP_SIZE="4G"
>> * HEAP_NEWSIZE="400M"
>>
>> Best Regards
>> Udo
>>
>
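PS: For reference, the write path in the mapper boils down to chunked Hector batch mutations, as described in the quoted mail. A simplified, self-contained sketch (cluster name and column names are made up; keyspace 'ItemRepo', column family 'Items', the three node addresses, the chunk size of 1000, and ConsistencyLevel ONE are as in our test - this is not our actual mapper code):

import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.HConsistencyLevel;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class ItemWriteSketch {

    private static final int CHUNK_SIZE = 1000;   // one batch mutation per 1000 items

    public static void main(String[] args) {
        // Hector is given all three data nodes
        Cluster cluster = HFactory.getOrCreateCluster("ItemCluster",
                new CassandraHostConfigurator("192.168.11.198,192.168.11.199,192.168.11.202"));

        // write with ConsistencyLevel ONE
        ConfigurableConsistencyLevel policy = new ConfigurableConsistencyLevel();
        policy.setDefaultWriteConsistencyLevel(HConsistencyLevel.ONE);
        Keyspace keyspace = HFactory.createKeyspace("ItemRepo", cluster, policy);

        Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());

        int pending = 0;
        for (long i = 0; i < 500000; i++) {
            String key = "item-" + i;
            // the real items carry two longs and two doubles; plain string columns keep the sketch short
            mutator.addInsertion(key, "Items", HFactory.createStringColumn("value1", Long.toString(i)));
            mutator.addInsertion(key, "Items", HFactory.createStringColumn("value2", Double.toString(i * 0.5)));
            if (++pending == CHUNK_SIZE) {
                mutator.execute();   // flush the chunk of pending insertions
                pending = 0;
            }
        }
        if (pending > 0) {
            mutator.execute();       // flush the remainder
        }
        HFactory.shutdownCluster(cluster);
    }
}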