Hey Chris, so I just tried dropping all my data and converting my column families to use leveled compaction. Now I'm getting exceptions like the following once I start inserting data. Have you seen these?
ERROR 13:13:25,616 Exception in thread Thread[CompactionExecutor:34,1,main]
java.lang.NullPointerException
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
    at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:396)
    at org.apache.cassandra.db.RangeTombstoneList.addAll(RangeTombstoneList.java:205)
    at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:180)
    at org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
    at org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
    at org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:46)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:115)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:160)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:76)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:57)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:145)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:211)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:680)

and

ERROR 13:17:11,327 Exception in thread Thread[CompactionExecutor:45,1,main]
java.lang.ArrayIndexOutOfBoundsException: 2
    at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:396)
    at org.apache.cassandra.db.RangeTombstoneList.addAll(RangeTombstoneList.java:205)
    at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:180)
    at org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
    at org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
    at org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:46)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:115)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:160)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:76)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:57)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:145)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:211)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:680)
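
For reference, the conversion itself was just an ALTER TABLE per column family, roughly like the sketch below. The keyspace/table names and the 160 MB size are only placeholders to show the shape of the change, not my actual schema:

ALTER TABLE my_keyspace.my_cf
  WITH compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 160 };

The errors above show up as soon as compaction kicks in on the newly inserted data.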

On Jul 24, 2013, at 10:10 AM, Christopher Wirt <chris.w...@struq.com> wrote:

> We found the performance of collections to not be great and needed a quick
> solution.
>
> We've always used the levelled compaction strategy, where you declare an
> sstable_size_in_mb rather than a min_compaction_threshold. Much better for
> our use case.
> http://www.datastax.com/dev/blog/when-to-use-leveled-compaction
> We are read-heavy, latency-sensitive people.
> Lots of TTL'ing.
> Few writes compared to reads.
>
>
> From: Paul Ingalls [mailto:paulinga...@gmail.com]
> Sent: 24 July 2013 17:43
> To: user@cassandra.apache.org
> Subject: Re: disappointed
>
> Hi Chris,
>
> Thanks for the response!
>
> What kind of challenges did you run into that kept you from using
> collections?
>
> I'm currently running 4 physical nodes, the same as I was with Cassandra
> 1.1.6. I'm using size-tiered compaction. Would changing to leveled
> compaction with a large minimum make a big difference, or would it just
> push the problem off till later?
>
> Yeah, I have run into problems dropping schemas before as well. I was
> careful this time to start with an empty db folder…
>
> Glad you were successful in your transition… :)
>
> Paul
>
> On Jul 24, 2013, at 4:12 AM, "Christopher Wirt" <chris.w...@struq.com> wrote:
>
> Hi Paul,
>
> Sorry to hear you're having a low point.
>
> We ended up not using the collection features of 1.2, instead storing a
> compressed string containing the map and handling it client side.
>
> We only have fixed-schema short rows, so no experience with large-row
> compaction.
>
> File descriptors have never got that high for us. But if you only have a
> couple of physical nodes with loads of data and small sstables, maybe they
> could get that high?
>
> The only time I've had file descriptors get out of hand was when compaction
> got slightly confused by a new schema after I dropped and recreated instead
> of truncating: https://issues.apache.org/jira/browse/CASSANDRA-4857.
> Restarting the node fixed the issue.
>
> From my limited experience, I think Cassandra is a dangerous choice for a
> young start-up with limited funding/experience that expects to scale fast.
> We are a fairly mature start-up with funding, and we've just spent 3-5
> months moving from Mongo to Cassandra. It's been expensive and painful
> getting Cassandra to read like Mongo, but we've made it :)
>
>
> From: Paul Ingalls [mailto:paulinga...@gmail.com]
> Sent: 24 July 2013 06:01
> To: user@cassandra.apache.org
> Subject: disappointed
>
> I want to check in. I'm sad, mad and afraid. I've been trying to get a 1.2
> cluster up and working with my data set for three weeks with no success.
> I've been running a 1.1 cluster for 8 months now with no hiccups, but for
> me at least 1.2 has been a disaster. I had high hopes for leveraging the
> new features of 1.2, specifically vnodes and collections. But at this point
> I can't release my system into production, and will probably need to find a
> new back end. As a small startup, this could be catastrophic. I'm mostly
> mad at myself. I took a risk moving to the new tech. I forgot that
> sometimes when you gamble, you lose.
>
> First, the performance of 1.2.6 was horrible when using collections. I
> wasn't able to push through 500k rows before the cluster became unusable.
> With a lot of digging, and way too much time, I discovered I was hitting a
> bug that had just been fixed but was unreleased. This scared me, because
> the release was already at 1.2.6 and I would have expected something like
> https://issues.apache.org/jira/browse/CASSANDRA-5677 to have been addressed
> long before. But gamely I grabbed the latest code from the 1.2 branch,
> built it, and was finally able to get past half a million rows.
>
> But then I hit ~4 million rows, and a multitude of problems. Even with the
> fix above, I was still seeing a ton of compactions failing, specifically
> the ones for large rows. Not a single large row will compact; they all
> assert with the wrong size. Worse, and this is what kills the whole thing,
> I keep hitting a wall with open files, even after dumping the whole DB,
> dropping vnodes and trying again. Seriously, 650k open file descriptors?
> When it hits this limit, the whole DB craps out and is basically unusable.
> This isn't that many rows. I have close to half a billion in 1.1…
>
> I'm now at a standstill. I figure I have two options unless someone here
> can help me, and neither of them involves 1.2. I can either go back to 1.1
> and remove the features that collections added to my service, or I find
> another data backend that has similar performance characteristics to
> Cassandra but allows collection-type behavior in a scalable manner. Because
> as far as I can tell, 1.2 doesn't scale. Which makes me sad; I was proud of
> what I accomplished with 1.1…
>
> Does anyone know why there are so many open file descriptors? Any ideas on
> why a large row won't compact?
>
> Paul