> 2013-04-23 16:09:17,838 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader: Current split being processed ColumnFamilySplit((9197470410121435301, '-1] @[p00nosql02.00, p00nosql01.00])
> Why is this split's data on two nodes? We have a 6-node Cassandra cluster + Hadoop slaves - every task should get a local input split from its local Cassandra - am I right?

My understanding is that it may get it locally, but it's not something that has to happen. One of the Hadoop guys will have a better idea.
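For what it's worth, the `@[p00nosql02.00, p00nosql01.00]` part of the split names the replica endpoints that own that token range (two entries would be consistent with a replication factor of 2); Hadoop treats them as preferred locations when scheduling the map task, not as guarantees. One way to double-check which nodes own a range or key is nodetool; a sketch (only the hostnames come from the log - the keyspace, column family, and row key below are hypothetical placeholders):

```shell
# Show token ownership per node (can be run against any node in the cluster)
nodetool -h p00nosql01 ring

# Show the replica endpoints for a specific row key
# ("MyKeyspace", "MyCF", and "some_row_key" are placeholders)
nodetool -h p00nosql01 getendpoints MyKeyspace MyCF some_row_key
```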
Try reducing cassandra.range.batch.size and/or, if you are using wide rows, enable cassandra.input.widerows.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 25/04/2013, at 7:55 PM, Shamim <sre...@yandex.ru> wrote:

> Hello Aaron,
> I have got the following log from the server (sorry for being late):
>
> job_201304231203_0004
> attempt_201304231203_0004_m_000501_0
>
> 2013-04-23 16:09:14,196 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2013-04-23 16:09:14,438 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/pigContext <- /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/pigContext
> 2013-04-23 16:09:14,453 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/dk <- /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/dk
> 2013-04-23 16:09:14,456 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/META-INF <- /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/META-INF
> 2013-04-23 16:09:14,459 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/org <- /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/org
> 2013-04-23 16:09:14,469 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/com <- /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/com
> 2013-04-23 16:09:14,471 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/.job.jar.crc <- /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/.job.jar.crc
> 2013-04-23 16:09:14,474 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/job.jar <- /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/job.jar
> 2013-04-23 16:09:17,329 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> 2013-04-23 16:09:17,387 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@256ef705
> 2013-04-23 16:09:17,838 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader: Current split being processed ColumnFamilySplit((9197470410121435301, '-1] @[p00nosql02.00, p00nosql01.00])
> 2013-04-23 16:09:18,088 INFO org.apache.pig.data.SchemaTupleBackend: Key [pig.schematuple] was not set... will not generate code.
> 2013-04-23 16:09:19,784 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map: Aliases being processed per job phase (AliasName[line,offset]): M: data[12,7],null[-1,-1],filtered[14,11],null[-1,-1],c1[23,5],null[-1,-1],updated[111,10] C: R:
> 2013-04-23 17:35:11,199 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2013-04-23 17:35:11,384 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
> 2013-04-23 17:35:11,385 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName cassandra for UID 500 from the native implementation
> 2013-04-23 17:35:11,417 WARN org.apache.hadoop.mapred.Child: Error running child
> java.lang.RuntimeException: TimedOutException()
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.getProgress(PigRecordReader.java:169)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:514)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:539)
>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: TimedOutException()
>     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12932)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
>     ... 17 more
> 2013-04-23 17:35:11,427 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>
> These two tasks hung for a long time and then crashed with a timeout exception. The very interesting part is the following:
> 2013-04-23 16:09:17,838 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader: Current split being processed ColumnFamilySplit((9197470410121435301, '-1] @[p00nosql02.00, p00nosql01.00])
> Why is this split's data on two nodes? We have a 6-node Cassandra cluster + Hadoop slaves - every task should get a local input split from its local Cassandra - am I right?
>
> --
> Best regards
> Shamim A.
>
> 24.04.2013, 10:59, "Shamim" <sre...@yandex.ru>:
>> Hello Aaron,
>> We have built up our new cluster from scratch with version 1.2 - partitioner Murmur3. We are not using vnodes at all.
>> Actually the log is clean with nothing serious; we are investigating the logs now and will post soon if we find something criminal.
>>
>>>>> Our cluster is evenly partitioned (Murmur3Partitioner)
>>>
>>> Murmur3Partitioner is only available in 1.2 and changing partitioners is not supported. Did you change from Random Partitioner under 1.1? Are you using virtual nodes in your 1.2 cluster?
>>>
>>>>> We have roughly 97 million rows in our cluster. Why are we getting the above behavior? Do you have any suggestion or clue to troubleshoot this issue?
>>>
>>> Can you make some of the logs from the tasks available?
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Consultant
>>> New Zealand
>>>
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 23/04/2013, at 5:50 AM, Shamim wrote:
>>>> We are using Hadoop 1.0.3 and Pig version 0.11.1
>>>>
>>>> --
>>>> Best regards
>>>> Shamim A.
>>>>
>>>> 22.04.2013, 21:48, "Shamim":
>>>>> Hello all,
>>>>> recently we upgraded our cluster (6 nodes) from Cassandra version 1.1.6 to 1.2.1. Our cluster is evenly partitioned (Murmur3Partitioner). We are using Pig to parse and compute aggregate data.
>>>>>
>>>>> When we submit a job through Pig, what I consistently see is that, while most of the tasks have 20-25k rows assigned each (map input records), only 2 of them (always 2) get more than 2 million rows. These 2 tasks always reach 100% complete and then hang for a long time. Also, most of the time we are getting killed tasks (2%) with a TimeoutException.
>>>>>
>>>>> We increased rpc_timeout to 60000 and also set cassandra.input.split.size=1024, but nothing helps.
>>>>>
>>>>> We have roughly 97 million rows in our cluster. Why are we getting the above behavior? Do you have any suggestion or clue to troubleshoot this issue? Any help would be highly appreciated. Thanks in advance.
>>>>>
>>>>> --
>>>>> Best regards
>>>>> Shamim A.
>>
>> --
>> Best regards
>> Shamim A.
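The settings discussed in this thread (cassandra.range.batch.size, cassandra.input.widerows, cassandra.input.split.size) are plain Hadoop job properties, so one way to pass them to a Pig job is through PIG_OPTS. A minimal sketch, assuming Pig forwards -D properties into the job configuration as on Hadoop 1.x; the values and the script name are examples only, not recommendations:

```shell
# Smaller range batch => fewer rows fetched per Thrift get_range_slices call,
# so each call is cheaper and less likely to hit rpc_timeout.
# cassandra.input.widerows pages within a single row when rows are very wide.
# cassandra.input.split.size caps the number of rows handed to one map task.
export PIG_OPTS="$PIG_OPTS \
  -Dcassandra.range.batch.size=512 \
  -Dcassandra.input.widerows=true \
  -Dcassandra.input.split.size=16384"

pig aggregate_job.pig   # placeholder script name
```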