The out-of-bounds error normally means you have column names that are not valid TimeUUIDs. Is that a possibility?
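A quick way to check is to confirm that every raw column name is a full 16-byte, version-1 (time-based) UUID before it gets written. A minimal sketch using plain java.util.UUID (class and method names here are just for illustration, not from your code):

import java.nio.ByteBuffer;
import java.util.UUID;

public class TimeUuidCheck {

    // Returns true only if the raw column name is a full 16-byte,
    // version-1 (time-based) UUID, which is what TimeUUIDType expects.
    static boolean isValidTimeUuid(byte[] columnName) {
        if (columnName == null || columnName.length != 16) {
            return false;                 // short or empty names will not compare cleanly
        }
        ByteBuffer buf = ByteBuffer.wrap(columnName);
        UUID uuid = new UUID(buf.getLong(), buf.getLong());
        return uuid.version() == 1;       // 1 = time-based, 4 = random
    }

    public static void main(String[] args) {
        UUID random = UUID.randomUUID();  // version 4: 16 bytes, but not a TimeUUID
        byte[] raw = ByteBuffer.allocate(16)
                .putLong(random.getMostSignificantBits())
                .putLong(random.getLeastSignificantBits())
                .array();
        System.out.println(isValidTimeUuid(raw));                // false
        System.out.println(isValidTimeUuid(new byte[] {1, 2}));  // false, truncated name
    }
}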
Aaron

On 11/02/2011, at 5:55 AM, Bill Speirs <bill.spe...@gmail.com> wrote:

> We attempted a compaction to see if that would improve read
> performance (BTW: write performance is as expected, fast!). Here is
> the result, an ArrayOutOfBounds exception:
>
>  INFO 11:48:41,070 Compacting
> [org.apache.cassandra.io.sstable.SSTableReader(path='/test/cassandra/data/Logging/DateIndex-e-7-Data.db'),
>  org.apache.cassandra.io.sstable.SSTableReader(path='/test/cassandra/data/Logging/FieldIndex-e-9-Data.db'),
>  org.apache.cassandra.io.sstable.SSTableReader(path='/test/cassandra/data/Logging/FieldIndex-e-10-Data.db'),
>  org.apache.cassandra.io.sstable.SSTableReader(path='/test/cassandra/data/Logging/Messages-e-13-Data.db')]
>
> ERROR 11:48:41,080 Fatal exception in thread Thread[CompactionExecutor:1,1,main]
> java.lang.ArrayIndexOutOfBoundsException: 7
>         at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:58)
>         at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
>         at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
>         at java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(ConcurrentSkipListMap.java:606)
>         at java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:878)
>         at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(ConcurrentSkipListMap.java:1893)
>         at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:218)
>         at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:130)
>         at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:137)
>         at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
>         at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:138)
>         at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:107)
>         at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:42)
>         at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>         at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
>         at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
>         at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:312)
>         at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)
>         at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
>
> Does any of that mean anything to anyone?
>
> Thanks...
>
> Bill-
>
> On Thu, Feb 10, 2011 at 11:00 AM, Bill Speirs <bill.spe...@gmail.com> wrote:
>> I have a 7 node setup with a replication factor of 1 and a read
>> consistency of 1.
>> I have two column families: Messages, which stores millions of rows
>> keyed by a UUID, and DateIndex, which stores thousands of rows keyed
>> by a String. I perform two look-ups for my queries:
>>
>> 1) Fetch the row from DateIndex that includes the date I'm looking
>> for. This returns 1,000 columns where the column names are the UUIDs
>> of the messages.
>> 2) Do a multi-get (Hector client) using those 1,000 UUIDs from the
>> first query as the row keys.
>>
>> Query 1 is taking ~300ms to fetch 1,000 columns from a single row...
>> respectable. However, query 2 is taking over 50s to perform 1,000 row
>> look-ups! Also, when I scale down to 100 row look-ups for query 2, the
>> time scales in a similar fashion, down to 5s.
>>
>> Am I doing something wrong here? It seems like taking 5s to look up
>> 100 rows in a distributed hash table is way too slow.
>>
>> Thoughts?
>>
>> Bill-
>>
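For reference, the two-step lookup described above would look roughly like this with Hector (a sketch only; the keyspace handle, serializers, key used for DateIndex, and the Messages column/value types are assumptions, not taken from the actual code):

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.serializers.UUIDSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.factory.HFactory;

public class TwoStepLookup {

    // Step 1: read up to 1,000 column names (message UUIDs) from one DateIndex row.
    // Step 2: multiget the corresponding rows from Messages in a single call.
    static void lookup(Keyspace keyspace, String dateKey) {
        List<HColumn<UUID, String>> index = HFactory
                .createSliceQuery(keyspace, StringSerializer.get(),
                        UUIDSerializer.get(), StringSerializer.get())
                .setColumnFamily("DateIndex")
                .setKey(dateKey)
                .setRange(null, null, false, 1000)   // first 1,000 columns of the row
                .execute().get().getColumns();

        List<UUID> messageKeys = new ArrayList<UUID>();
        for (HColumn<UUID, String> column : index) {
            messageKeys.add(column.getName());       // column name == Messages row key
        }

        // One multiget for all the message rows.
        Iterable<Row<UUID, String, String>> rows = HFactory
                .createMultigetSliceQuery(keyspace, UUIDSerializer.get(),
                        StringSerializer.get(), StringSerializer.get())
                .setColumnFamily("Messages")
                .setKeys(messageKeys.toArray(new UUID[0]))
                .setRange(null, null, false, Integer.MAX_VALUE)
                .execute().get();

        for (Row<UUID, String, String> row : rows) {
            System.out.println(row.getKey() + " -> "
                    + row.getColumnSlice().getColumns().size() + " columns");
        }
    }
}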