Tried to isolate the issue in a testing environment. Here is what I currently have.

The test setup:

    CREATE KEYSPACE cascading_cassandra
      WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
    USE cascading_cassandra;
    CREATE TABLE libraries (
      emitted_at timestamp,
      additional_info varchar,
      environment varchar,
      application varchar,
      type varchar,
      PRIMARY KEY (application, environment, type, emitted_at)
    ) WITH COMPACT STORAGE;

Next, insert some test data (just as an example; the values are bound from a Clojure vector):

    INSERT INTO libraries (application, environment, type, additional_info, emitted_at)
    VALUES (?, ?, ?, ?, ?);
    -- bound values: ["app" "env" "type" 0 #inst "2013-04-20T13:01:04.935-00:00"]

If the key values (e.g. "app", "env", "type") are all the same across the dataset, it works correctly. As soon as I start varying the keys, e.g. "app1", "app2", "app3" and so on, I get the "Message length exceeded" error.
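For completeness, here is roughly how the read side of the test job is wired up. This is a sketch rather than the exact code: the node address and the 64 MB value are placeholders, widerows is set to true (the stacktrace quoted below goes through WideRowIterator), and I am assuming setThriftMaxMessageLengthInMb takes the Hadoop Configuration plus a size in megabytes, matching its name:

    import org.apache.hadoop.mapreduce.Job
    import org.apache.cassandra.hadoop.{ColumnFamilyInputFormat, ConfigHelper}
    import org.apache.cassandra.thrift.{SlicePredicate, SliceRange}

    def testCassConfig(job: Job) {
      val conf = job.getConfiguration()

      // Connection settings; 127.0.0.1 stands in for the test node
      ConfigHelper.setInputRpcPort(conf, "9160")
      ConfigHelper.setInputInitialAddress(conf, "127.0.0.1")
      ConfigHelper.setInputPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner")

      // widerows = true: the table is COMPACT STORAGE with a compound
      // clustering key, so reads go through WideRowIterator
      ConfigHelper.setInputColumnFamily(conf, "cascading_cassandra", "libraries", true)

      // Assumed signature: raise the client-side Thrift limit, in MB
      ConfigHelper.setThriftMaxMessageLengthInMb(conf, 64)

      // Slice over all columns of each row, in modest chunks
      val range = new SliceRange()
        .setStart("".getBytes("UTF-8"))
        .setFinish("".getBytes("UTF-8"))
        .setReversed(false)
        .setCount(1000)
      ConfigHelper.setInputSlicePredicate(conf, new SlicePredicate().setSlice_range(range))

      job.setInputFormatClass(classOf[ColumnFamilyInputFormat])
    }

As mentioned in the quoted messages below, varying the message length limit and the batch size has not made the error go away for me.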
Does anyone have any ideas? Thanks for the help!

On Sat, Apr 20, 2013 at 1:56 PM, Oleksandr Petrov <oleksandr.pet...@gmail.com> wrote:

> I can confirm I'm running into the same problem.
>
> I tried ConfigHelper.setThriftMaxMessageLengthInMb(), tuning the server
> side, and reducing/increasing the batch size.
>
> Here's the stacktrace from Hadoop/Cassandra; maybe it can give a hint:
>
> Caused by: org.apache.thrift.protocol.TProtocolException: Message length exceeded: 8
>     at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
>     at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
>     at org.apache.cassandra.thrift.Column.read(Column.java:528)
>     at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
>     at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
>     at org.apache.cassandra.thrift.Cassandra$get_paged_slice_result.read(Cassandra.java:14157)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_paged_slice(Cassandra.java:769)
>     at org.apache.cassandra.thrift.Cassandra$Client.get_paged_slice(Cassandra.java:753)
>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$WideRowIterator.maybeInit(ColumnFamilyRecordReader.java:438)
>
> On Thu, Apr 18, 2013 at 12:34 AM, Lanny Ripple <la...@spotright.com> wrote:
>
>> It's slow going finding the time to do so, but I'm working on that.
>>
>> We do have another table that has one or sometimes two columns per row.
>> We can run jobs on it without issue. I looked through the
>> org.apache.cassandra.hadoop code and don't see anything that's really
>> changed since 1.1.5 (which was also using thrift-0.7), so it's something
>> of a puzzler what's going on.
>>
>> On Apr 17, 2013, at 2:47 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>
>>> Can you reproduce this in a simple way?
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Consultant
>>> New Zealand
>>>
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 18/04/2013, at 5:50 AM, Lanny Ripple <la...@spotright.com> wrote:
>>>
>>>> That was our first thought. Using Maven's dependency tree info, we
>>>> verified that we're using the expected (Cassandra 1.2.3) jars:
>>>>
>>>>     $ mvn dependency:tree | grep thrift
>>>>     [INFO] |  +- org.apache.thrift:libthrift:jar:0.7.0:compile
>>>>     [INFO] |  \- org.apache.cassandra:cassandra-thrift:jar:1.2.3:compile
>>>>
>>>> I've also dumped the final command run by the Hadoop we use (CDH3u5)
>>>> and verified it's not sneaking Thrift in on us.
>>>>
>>>> On Tue, Apr 16, 2013 at 4:36 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>>>
>>>>> Can you confirm that you are using the same Thrift version that
>>>>> ships with 1.2.3?
>>>>>
>>>>> Cheers
>>>>>
>>>>> -----------------
>>>>> Aaron Morton
>>>>> Freelance Cassandra Consultant
>>>>> New Zealand
>>>>>
>>>>> @aaronmorton
>>>>> http://www.thelastpickle.com
>>>>>
>>>>> On 16/04/2013, at 10:17 AM, Lanny Ripple <la...@spotright.com> wrote:
>>>>>
>>>>>> A bump to say I found this:
>>>>>>
>>>>>> http://stackoverflow.com/questions/15487540/pig-cassandra-message-length-exceeded
>>>>>>
>>>>>> so others are seeing similar behavior.
>>>>>>
>>>>>> From what I can see of org.apache.cassandra.hadoop, nothing has
>>>>>> changed since 1.1.5, when we didn't see such things, but it sure looks
>>>>>> like a bug has slipped in (or been uncovered) somewhere. I'll try to
>>>>>> narrow it down to a dataset and code that can reproduce it.
>>>>>>
>>>>>> On Apr 10, 2013, at 6:29 PM, Lanny Ripple <la...@spotright.com> wrote:
>>>>>>
>>>>>>> We are using Astyanax in production, but I cut back to just Hadoop
>>>>>>> and Cassandra to confirm it's a Cassandra (or our use of Cassandra)
>>>>>>> problem.
>>>>>>>
>>>>>>> We do have some extremely large rows, but we went from everything
>>>>>>> working with 1.1.5 to almost everything carping with 1.2.3. Something
>>>>>>> has changed. Perhaps we were doing something wrong earlier that 1.2.3
>>>>>>> exposed, but surprises are never welcome in production.
>>>>>>>
>>>>>>> On Apr 10, 2013, at 8:10 AM, <moshe.kr...@barclays.com> wrote:
>>>>>>>
>>>>>>>> I also saw this when upgrading from C* 1.0 to 1.2.2, and from
>>>>>>>> Hector 0.6 to 0.8.
>>>>>>>> It turns out the Thrift message really was too long.
>>>>>>>> The mystery to me: why were there no complaints in previous
>>>>>>>> versions? Were some checks added in Thrift or Hector?
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Lanny Ripple [mailto:la...@spotright.com]
>>>>>>>> Sent: Tuesday, April 09, 2013 6:17 PM
>>>>>>>> To: user@cassandra.apache.org
>>>>>>>> Subject: Thrift message length exceeded
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> We have recently upgraded to Cassandra 1.2.3 from 1.1.5. We ran
>>>>>>>> sstableupgrades and got the ring on its feet, and we are now seeing
>>>>>>>> a new issue.
>>>>>>>>
>>>>>>>> When we run MapReduce jobs against practically any table, we see
>>>>>>>> the following errors:
>>>>>>>>
>>>>>>>> 2013-04-09 09:58:47,746 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>>>> 2013-04-09 09:58:47,899 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
>>>>>>>> 2013-04-09 09:58:48,021 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>>>>>> 2013-04-09 09:58:48,024 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4a48edb5
>>>>>>>> 2013-04-09 09:58:50,475 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>>>>> 2013-04-09 09:58:50,477 WARN org.apache.hadoop.mapred.Child: Error running child
>>>>>>>> java.lang.RuntimeException: org.apache.thrift.TException: Message length exceeded: 106
>>>>>>>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:384)
>>>>>>>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:390)
>>>>>>>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:313)
>>>>>>>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>>>>>>>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>>>>>>>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getProgress(ColumnFamilyRecordReader.java:103)
>>>>>>>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getProgress(MapTask.java:444)
>>>>>>>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:460)
>>>>>>>>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>>>>>>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>>>>>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>>>>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>>>>>>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
>>>>>>>>     at org.apache.hadoop.mapred.Child.main(Child.java:260)
>>>>>>>> Caused by: org.apache.thrift.TException: Message length exceeded: 106
>>>>>>>>     at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:393)
>>>>>>>>     at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
>>>>>>>>     at org.apache.cassandra.thrift.Column.read(Column.java:528)
>>>>>>>>     at org.apache.cassandra.thrift.ColumnOrSuperColumn.read(ColumnOrSuperColumn.java:507)
>>>>>>>>     at org.apache.cassandra.thrift.KeySlice.read(KeySlice.java:408)
>>>>>>>>     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12905)
>>>>>>>>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>>>>>>>>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:734)
>>>>>>>>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:718)
>>>>>>>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:346)
>>>>>>>>     ... 16 more
>>>>>>>> 2013-04-09 09:58:50,481 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>>>>>>>>
>>>>>>>> The message length listed on each failed job differs (it is not
>>>>>>>> always 106). Jobs that used to run fine now fail when compiled
>>>>>>>> against Cassandra 1.2.3 (and work fine when compiled against 1.1.5
>>>>>>>> and run against the 1.2.3 servers in production). I'm using the
>>>>>>>> following setup to configure the job:
>>>>>>>>
>>>>>>>>     def cassConfig(job: Job) {
>>>>>>>>       val conf = job.getConfiguration()
>>>>>>>>
>>>>>>>>       ConfigHelper.setInputRpcPort(conf, "" + 9160)
>>>>>>>>       ConfigHelper.setInputInitialAddress(conf, Config.hostip)
>>>>>>>>
>>>>>>>>       ConfigHelper.setInputPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner")
>>>>>>>>       ConfigHelper.setInputColumnFamily(conf, Config.keyspace, Config.cfname)
>>>>>>>>
>>>>>>>>       // slice over all columns, up to 4096 * 1000 per row
>>>>>>>>       val pred = {
>>>>>>>>         val range = new SliceRange()
>>>>>>>>           .setStart("".getBytes("UTF-8"))
>>>>>>>>           .setFinish("".getBytes("UTF-8"))
>>>>>>>>           .setReversed(false)
>>>>>>>>           .setCount(4096 * 1000)
>>>>>>>>
>>>>>>>>         new SlicePredicate().setSlice_range(range)
>>>>>>>>       }
>>>>>>>>
>>>>>>>>       ConfigHelper.setInputSlicePredicate(conf, pred)
>>>>>>>>     }
>>>>>>>>
>>>>>>>> The job consists only of a mapper that increments counters for each
>>>>>>>> row and its associated columns, so all I'm really doing is exercising
>>>>>>>> ColumnFamilyRecordReader.
>>>>>>>>
>>>>>>>> Has anyone else seen this? Is there a workaround/fix to get our
>>>>>>>> jobs running?
>>>>>>>>
>>>>>>>> Thanks
>
> --
> alex p

--
alex p