Re: Hadoop jobs and data locality

2013-05-07 Thread cscetbon.ext
I tried your quick workaround, but the task now takes much longer than before, even though it uses 2 mappers in parallel. The fact is that there are 1000 tasks. Are you using vnodes? I didn't try to disable them. Kind | % Complete | Num Tasks | Pending | Running | Complete | Killed | Fai

Re: Hadoop jobs and data locality

2013-05-07 Thread cscetbon.ext
I was going to open one. Great! -- Cyril SCETBON On May 7, 2013, at 9:03 AM, Shamim <sre...@yandex.ru> wrote: I have created an issue in jira https://issues.apache.org/jira/browse/CASSANDRA-5544

Re: Hadoop jobs and data locality

2013-05-07 Thread Shamim
I have created an issue in jira https://issues.apache.org/jira/browse/CASSANDRA-5544 -- Best regards, Shamim A. 06.05.2013, 22:26, "Shamim": > I think it will be better to open an issue in jira > Best regards > Shamim A. > >> Unfortunately I've just tried with a new cluster with RandomPart

Re: Hadoop jobs and data locality

2013-05-06 Thread cscetbon.ext
Unfortunately, I've just tried a new cluster with RandomPartitioner and it works no better; the cause may be the hadoop/pig modifications: 18:02:53|elia:hadoop cyril$ git diff --stat cassandra-1.1.5..cassandra-1.2.1 . .../apache/cassandra/hadoop/BulkOutputFormat.java | 27 +-- .../apac

Re: Hadoop jobs and data locality

2013-05-04 Thread Shamim
Hello, we also came across this issue in our dev environment when we upgraded Cassandra from version 1.1.5 to 1.2.1. I have mentioned this issue a few times in this forum but haven't received an answer yet. As a quick workaround you can use pig.splitCombination false in your pig script to av
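For readers following the thread, the workaround mentioned in this message could look roughly like the fragment below. It is a sketch only: the keyspace and column family names are hypothetical placeholders, and the LOAD line assumes the CassandraStorage loader shipped with Cassandra's Pig support.

```pig
-- Sketch only: turn off Pig's combining of input splits,
-- the setting named in the message above.
SET pig.splitCombination false;

-- Hypothetical keyspace and column family names, for illustration.
rows = LOAD 'cassandra://MyKeyspace/MyColumnFamily'
       USING org.apache.cassandra.hadoop.pig.CassandraStorage();
```

As the 2013-05-07 reply in this thread reports, disabling split combination can leave the job with a very large number of tasks (1000 in that case) and a longer overall run time, so it is a stopgap rather than a fix.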

Hadoop jobs and data locality

2013-05-03 Thread cscetbon.ext
Hi, I'm using Pig to calculate the sum of a column from a column family (scan of all rows), and I've read at http://wiki.apache.org/cassandra/HadoopSupport that input data locality is supported. However, when I execute my Pig script, Hadoop assigns only one mapper to the task and not one mapper on