We are using Hadoop 1.0.3 and Pig 0.11.1.

--
Best regards
Shamim A.
22.04.2013, 21:48, "Shamim" <sre...@yandex.ru>:
> Hello all,
> We recently upgraded our cluster (6 nodes) from Cassandra 1.1.6 to 1.2.1.
> The cluster is evenly partitioned (Murmur3Partitioner), and we use Pig to
> parse and compute aggregate data.
>
> When we submit a job through Pig, what I consistently see is that while
> most tasks are assigned 20-25k rows each (Map input records), exactly two
> of them (always two) get more than 2 million rows. These two tasks reach
> 100% and then hang for a long time. We also frequently get killed tasks
> (about 2%) with a TimeoutException.
>
> We increased rpc_timeout to 60000 and set cassandra.input.split.size=1024,
> but nothing helped.
>
> We have roughly 97 million rows in the cluster. Why are we seeing this
> behavior? Do you have any suggestions or clues for troubleshooting this
> issue? Any help would be highly appreciated. Thanks in advance.
>
> --
> Best regards
> Shamim A.
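For reference, the split-size property mentioned in the quoted message is a Hadoop job property, so it would typically be applied with a `set` statement inside the Pig script (or via `-Dcassandra.input.split.size=...` on the command line) before the Cassandra load. A minimal sketch is below; the keyspace and column family names `MyKeyspace`/`MyCF` are placeholders for illustration, not names from this thread:

```pig
-- Apply the Cassandra input split size discussed above (rows per split).
set cassandra.input.split.size 1024;

-- Hypothetical keyspace/column family, purely illustrative.
rows = LOAD 'cassandra://MyKeyspace/MyCF'
       USING org.apache.cassandra.hadoop.pig.CassandraStorage();

-- A trivial aggregate, standing in for the poster's aggregation job.
grp = GROUP rows ALL;
cnt = FOREACH grp GENERATE COUNT(rows);
DUMP cnt;
```

Note that with an order-preserving setup this property controls how many rows each map task is asked to read, so uneven map-input counts like those described usually point to skewed splits or hot ranges rather than to the property being ignored.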