I can confirm I'm running into the same problem. I've tried ConfigHelper.setThriftMaxMessageLengthInMb(), tuning the server side, and reducing/increasing the batch size.
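For the record, this is roughly the knob I was tuning (a sketch only; I'm assuming the 1.2.x ConfigHelper API here, and 64 is just an example value, not a recommendation):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.cassandra.hadoop.ConfigHelper

def raiseThriftLimit(conf: Configuration) {
  // This only raises the Hadoop client's read limit; the server-side caps
  // (thrift_framed_transport_size_in_mb / thrift_max_message_length_in_mb
  // in cassandra.yaml) presumably have to be raised to match.
  ConfigHelper.setThriftMaxMessageLengthInMb(conf, 64) // example value
}
```

No luck with it so far, as noted above.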
Here's the stacktrace from Hadoop/Cassandra; maybe it gives a hint:

Caused by: org.apache.thrift.protocol.TProtocolException: Message length exceeded: 8
    at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
    at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
    at org.apache.cassandra.thrift.Column.read(Column.java:528)
    at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
    at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
    at org.apache.cassandra.thrift.Cassandra$get_paged_slice_result.read(Cassandra.java:14157)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_get_paged_slice(Cassandra.java:769)
    at org.apache.cassandra.thrift.Cassandra$Client.get_paged_slice(Cassandra.java:753)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$WideRowIterator.maybeInit(ColumnFamilyRecordReader.java:438)

On Thu, Apr 18, 2013 at 12:34 AM, Lanny Ripple <la...@spotright.com> wrote:
> It's slow going finding the time to do so, but I'm working on that.
>
> We do have another table that has one or sometimes two columns per row.
> We can run jobs on it without issue. I looked through the
> org.apache.cassandra.hadoop code and don't see anything that's really
> changed since 1.1.5 (which was also using thrift-0.7), so it's something
> of a puzzler what's going on.
>
> On Apr 17, 2013, at 2:47 PM, aaron morton <aa...@thelastpickle.com> wrote:
>
> > Can you reproduce this in a simple way?
> >
> > Cheers
> >
> > -----------------
> > Aaron Morton
> > Freelance Cassandra Consultant
> > New Zealand
> >
> > @aaronmorton
> > http://www.thelastpickle.com
> >
> > On 18/04/2013, at 5:50 AM, Lanny Ripple <la...@spotright.com> wrote:
> >
> >> That was our first thought.
> >> Using maven's dependency tree info we verified that we're using the
> >> expected (cass 1.2.3) jars:
> >>
> >> $ mvn dependency:tree | grep thrift
> >> [INFO] |  +- org.apache.thrift:libthrift:jar:0.7.0:compile
> >> [INFO] |  \- org.apache.cassandra:cassandra-thrift:jar:1.2.3:compile
> >>
> >> I've also dumped the final command run by the hadoop we use (CDH3u5)
> >> and verified it's not sneaking thrift in on us.
> >>
> >> On Tue, Apr 16, 2013 at 4:36 PM, aaron morton <aa...@thelastpickle.com> wrote:
> >> Can you confirm that you are using the same thrift version that ships
> >> with 1.2.3?
> >>
> >> Cheers
> >>
> >> -----------------
> >> Aaron Morton
> >> Freelance Cassandra Consultant
> >> New Zealand
> >>
> >> @aaronmorton
> >> http://www.thelastpickle.com
> >>
> >> On 16/04/2013, at 10:17 AM, Lanny Ripple <la...@spotright.com> wrote:
> >>
> >>> A bump to say I found this
> >>>
> >>> http://stackoverflow.com/questions/15487540/pig-cassandra-message-length-exceeded
> >>>
> >>> so others are seeing similar behavior.
> >>>
> >>> From what I can see of org.apache.cassandra.hadoop, nothing has
> >>> changed since 1.1.5, when we didn't see such things, but it sure looks
> >>> like there's a bug that has slipped in (or been uncovered) somewhere.
> >>> I'll try to narrow it down to a dataset and code that can reproduce it.
> >>>
> >>> On Apr 10, 2013, at 6:29 PM, Lanny Ripple <la...@spotright.com> wrote:
> >>>
> >>>> We are using Astyanax in production, but I cut back to just Hadoop
> >>>> and Cassandra to confirm it's a Cassandra (or our use of Cassandra)
> >>>> problem.
> >>>>
> >>>> We do have some extremely large rows, but we went from everything
> >>>> working with 1.1.5 to almost everything carping with 1.2.3. Something
> >>>> has changed. Perhaps we were doing something wrong earlier that 1.2.3
> >>>> exposed, but surprises are never welcome in production.
> >>>>
> >>>> On Apr 10, 2013, at 8:10 AM, <moshe.kr...@barclays.com> wrote:
> >>>>
> >>>>> I also saw this when upgrading from C* 1.0 to 1.2.2, and from
> >>>>> hector 0.6 to 0.8.
> >>>>> It turns out the Thrift message really was too long.
> >>>>> The mystery to me: why no complaints in previous versions? Were
> >>>>> some checks added in Thrift or Hector?
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Lanny Ripple [mailto:la...@spotright.com]
> >>>>> Sent: Tuesday, April 09, 2013 6:17 PM
> >>>>> To: user@cassandra.apache.org
> >>>>> Subject: Thrift message length exceeded
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> We have recently upgraded from Cass 1.1.5 to Cass 1.2.3. We ran
> >>>>> sstableupgrades, got the ring on its feet, and are now seeing a
> >>>>> new issue.
> >>>>>
> >>>>> When we run MapReduce jobs against practically any table, we see
> >>>>> the following errors:
> >>>>>
> >>>>> 2013-04-09 09:58:47,746 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> >>>>> 2013-04-09 09:58:47,899 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
> >>>>> 2013-04-09 09:58:48,021 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> >>>>> 2013-04-09 09:58:48,024 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4a48edb5
> >>>>> 2013-04-09 09:58:50,475 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> >>>>> 2013-04-09 09:58:50,477 WARN org.apache.hadoop.mapred.Child: Error running child
> >>>>> java.lang.RuntimeException: org.apache.thrift.TException: Message length exceeded: 106
> >>>>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
> >>>>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
> >>>>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
> >>>>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> >>>>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> >>>>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
> >>>>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:444)
> >>>>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:460)
> >>>>>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
> >>>>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
> >>>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
> >>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> >>>>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
> >>>>>     at java.security.AccessController.doPrivileged(Native Method)
> >>>>>     at javax.security.auth.Subject.doAs(Subject.java:396)
> >>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
> >>>>>     at org.apache.hadoop.mapred.Child.main(Child.java:260)
> >>>>> Caused by: org.apache.thrift.TException: Message length exceeded: 106
> >>>>>     at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
> >>>>>     at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
> >>>>>     at org.apache.cassandra.thrift.Column.read(Column.java:528)
> >>>>>     at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
> >>>>>     at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
> >>>>>     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12905)
> >>>>>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
> >>>>>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
> >>>>>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
> >>>>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
> >>>>>     ... 16 more
> >>>>> 2013-04-09 09:58:50,481 INFO org.apache.hadoop.mapred.Task: Running cleanup for the task
> >>>>>
> >>>>> The message length listed on each failed job differs (it's not
> >>>>> always 106). Jobs that used to run fine now fail when compiled
> >>>>> against cass 1.2.3 (and work fine if compiled against 1.1.5 and run
> >>>>> against the 1.2.3 servers in production). I'm using the following
> >>>>> setup to configure the job:
> >>>>>
> >>>>>   def cassConfig(job: Job) {
> >>>>>     val conf = job.getConfiguration()
> >>>>>
> >>>>>     ConfigHelper.setInputRpcPort(conf, "" + 9160)
> >>>>>     ConfigHelper.setInputInitialAddress(conf, Config.hostip)
> >>>>>
> >>>>>     ConfigHelper.setInputPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner")
> >>>>>     ConfigHelper.setInputColumnFamily(conf, Config.keyspace, Config.cfname)
> >>>>>
> >>>>>     val pred = {
> >>>>>       val range = new SliceRange()
> >>>>>         .setStart("".getBytes("UTF-8"))
> >>>>>         .setFinish("".getBytes("UTF-8"))
> >>>>>         .setReversed(false)
> >>>>>         .setCount(4096 * 1000)
> >>>>>
> >>>>>       new SlicePredicate().setSlice_range(range)
> >>>>>     }
> >>>>>
> >>>>>     ConfigHelper.setInputSlicePredicate(conf, pred)
> >>>>>   }
> >>>>>
> >>>>> The job consists only of a mapper that increments counters for each
> >>>>> row and its associated columns, so all I'm really doing is
> >>>>> exercising ColumnFamilyRecordReader.
> >>>>>
> >>>>> Has anyone else seen this? Is there a workaround/fix to get our
> >>>>> jobs running?
> >>>>>
> >>>>> Thanks

--
alex p