Thanks for tracking that down, Roland. I've created https://issues.apache.org/jira/browse/CASSANDRA-2347 to fix this.
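
For anyone following along, here is a rough, self-contained sketch of the failure mode Roland describes below. It does not use any Cassandra classes: `compareAsTimeUuid` and `compareAsBytes` are hypothetical, simplified stand-ins for a TimeUUID-style comparator (which reads timestamp bytes at fixed offsets and so assumes 16-byte input) and a plain bytes comparator. The 6-byte value "user01" is taken from the thread; everything else is an assumption made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ComparatorMismatchSketch {

    // Simplified stand-in for a TimeUUID-style comparator (assumption, not
    // the Cassandra source). It reads the timestamp bytes at fixed offsets,
    // starting with byte 6, so it only works on 16-byte UUID values.
    static int compareAsTimeUuid(ByteBuffer a, ByteBuffer b) {
        int[] order = {6, 7, 4, 5, 0, 1, 2, 3};
        for (int i : order) {
            // Byte 6 carries the version in its high nibble, so mask it off.
            int mask = (i == 6) ? 0x0F : 0xFF;
            int d = (a.get(a.position() + i) & mask)
                  - (b.get(b.position() + i) & mask);
            if (d != 0) {
                return d;
            }
        }
        return 0;
    }

    // Simplified stand-in for a bytes comparator: unsigned lexicographic
    // order, valid for values of any length.
    static int compareAsBytes(ByteBuffer a, ByteBuffer b) {
        int n = Math.min(a.remaining(), b.remaining());
        for (int i = 0; i < n; i++) {
            int d = (a.get(a.position() + i) & 0xFF)
                  - (b.get(b.position() + i) & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.remaining() - b.remaining();
    }

    public static void main(String[] args) {
        // A stored column value and the value from the query expression.
        ByteBuffer stored = ByteBuffer.wrap("user01".getBytes(StandardCharsets.UTF_8));
        ByteBuffer queried = ByteBuffer.wrap("user01".getBytes(StandardCharsets.UTF_8));

        // Comparing the values as bytes (what the validation class says they are).
        System.out.println("bytes comparator: " + compareAsBytes(stored, queried));

        // Comparing the same values with the column-name comparator.
        try {
            compareAsTimeUuid(stored, queried);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("TimeUUID-style comparator failed: " + e);
        }
    }
}
```

Since the quoted satisfies() loop skips the first expression, this would also be consistent with a single index expression working while a second one fails.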
On Wed, Mar 16, 2011 at 10:37 AM, Roland Gude <roland.g...@yoochoose.com> wrote:
> I have applied the suggested changes in my local source tree and ran all
> my test cases (the supplied ones as well as those with real data).
>
> They work now.
>
> From: Roland Gude [mailto:roland.g...@yoochoose.com]
> Sent: Wednesday, 16 March 2011 16:29
> To: user@cassandra.apache.org
> Subject: RE: RE: problems while TimeUUIDType-index-querying with two
> expressions
>
> While debugging into it I found something that might be the issue (please
> correct me if I am wrong):
>
> In ColumnFamilyStore.java, lines 1597 to 1613 contain the code that checks
> whether a column satisfies an index expression.
>
> In line 1608 it compares the stored column value with the value given in
> the expression.
>
> For this comparison it uses the comparator of the column family, while it
> should use the comparator of the column's validation class.
>
>     private static boolean satisfies(ColumnFamily data, IndexClause clause,
>                                      IndexExpression first)
>     {
>         for (IndexExpression expression : clause.expressions)
>         {
>             // (we can skip "first" since we already know it's satisfied)
>             if (expression == first)
>                 continue;
>             // check column data vs expression
>             IColumn column = data.getColumn(expression.column_name);
>             if (column == null)
>                 return false;
>             int v = data.getComparator().compare(column.value(), expression.value);
>             if (!satisfies(v, expression.op))
>                 return false;
>         }
>         return true;
>     }
>
> Line 1608 should be changed from:
>
>     int v = data.getComparator().compare(column.value(), expression.value);
>
> to
>
>     int v = data.metadata().getValueValidator(expression.column_name).compare(column.value(), expression.value);
>
> Greetings,
> Roland
>
> From: Roland Gude [mailto:roland.g...@yoochoose.com]
> Sent: Wednesday, 16 March 2011 14:50
> To: user@cassandra.apache.org
> Subject: RE: RE: problems while TimeUUIDType-index-querying with two
> expressions
>
> Hi Aaron,
>
> now I am completely confused.
>
> The code that did not work for days now works, like a miracle, even
> against the unpatched Cassandra 0.7.3, but the test case still does not.
>
> There seems to be some randomness in whether it works or not (which is a bad
> sign, I think). I will debug a little deeper into this and report anything I
> find.
>
> Greetings,
> Roland
>
> From: aaron morton [mailto:aa...@thelastpickle.com]
> Sent: Wednesday, 16 March 2011 01:15
> To: user@cassandra.apache.org
> Subject: Re: RE: problems while TimeUUIDType-index-querying with two
> expressions
>
> I have attached a patch to
> https://issues.apache.org/jira/browse/CASSANDRA-2328
>
> Can you give it a try? You should now get an InvalidRequestException when
> you send an invalid name or value in the query expression.
>
> Aaron
>
> On 16 Mar 2011, at 10:30, aaron morton wrote:
>
> I will have the Jira I created finished soon. It's a legitimate issue: we
> should be validating the column names and values when a
> get_indexed_slices() request is sent. The error in your original email
> shows that.
>
> WRT your code example: you are using the TimeUUID validator for the column
> name when creating the index expression, but are using a string serialiser
> for the value...
>
>     IndexedSlicesQuery<String, UUID, String> indexQuery = HFactory
>             .createIndexedSlicesQuery(keyspace,
>                     stringSerializer, UUID_SERIALIZER, stringSerializer);
>     indexQuery.addEqualsExpression(MANDATOR_UUID, mandator);
>
> But your schema is saying it is a bytes type...
>
>     column_metadata=[{column_name: 00000000-0000-1000-0000-000000000000,
>     validation_class: BytesType, index_name: mandatorIndex, index_type: KEYS},
>     {column_name: 00000001-0000-1000-0000-000000000000, validation_class:
>     BytesType, index_name: useridIndex, index_type: KEYS}];"
>
> Once I have the patch, can you apply it and run your test again?
>
> You may also want to ask on the Hector list whether it automagically checks
> that you are using the correct types when creating an IndexedSlicesQuery.
>
> Aaron
>
> On 15 Mar 2011, at 22:41, Roland Gude wrote:
>
> Forgot to attach the source code... here it comes
>
> From: Roland Gude [mailto:roland.g...@yoochoose.com]
> Sent: Tuesday, 15 March 2011 10:39
> To: user@cassandra.apache.org
> Subject: RE: problems while TimeUUIDType-index-querying with two expressions
>
> Actually it's not the column values that should be UUIDs in our case, but the
> column keys. The CF uses TimeUUID ordering and the values are just some
> byte arrays. Even after changing the code to use UUIDSerializer instead of
> serializing the UUIDs manually, the issue still exists.
>
> As far as I can see, there is nothing wrong with the IndexExpression.
>
> Using two index expressions with key=TimeUUID and value=anything does not
> work.
>
> Using one index expression (either one of the two) alone works fine.
>
> I refactored Johannes' code into a JUnit test case. It needs the cluster
> configured as described in Johannes' mail.
>
> There are three cases: two with one of the index expressions each, and one
> with both index expressions. The one with both index expressions will never
> finish, and you will see the exception in the Cassandra logs.
>
> Bye,
> Roland
>
> From: aaron morton [mailto:aa...@thelastpickle.com]
> Sent: Tuesday, 15 March 2011 07:54
> To: user@cassandra.apache.org
> Cc: Juergen Link; Roland Gude; her...@datastax.com
> Subject: Re: problems while TimeUUIDType-index-querying with two expressions
>
> Perfectly reasonable, created
> https://issues.apache.org/jira/browse/CASSANDRA-2328
>
> Aaron
>
> On 15 Mar 2011, at 16:52, Jonathan Ellis wrote:
>
> Sounds like we should send an InvalidRequestException then.
>
> On Mon, Mar 14, 2011 at 8:06 PM, aaron morton <aa...@thelastpickle.com>
> wrote:
> > It's failing when comparing two TimeUUID values because one of them is not
> > properly formatted. In this case it's comparing a stored value with the
> > value passed in the get_indexed_slices() query expression.
> > I'm going to assume it's the value passed for the expression.
> > When you create the IndexedSlicesQuery, this is incorrect:
> >
> >     IndexedSlicesQuery<String, byte[], byte[]> indexQuery = HFactory
> >             .createIndexedSlicesQuery(keyspace,
> >                     stringSerializer, bytesSerializer, bytesSerializer);
> >
> > Use a UUIDSerializer for the last param and then pass the UUID you want to
> > build the expression, rather than the string/byte thing you are passing.
> > Hope that helps.
> > Aaron
> >
> > On 15 Mar 2011, at 04:17, Johannes Hoerle wrote:
> >
> > Hi all,
> >
> > in order to improve our queries, we started to use IndexedSlicesQuery from
> > the hector project (https://github.com/zznate/hector-examples). I followed
> > the instructions for creating IndexedSlicesQuery with
> > GetIndexedSlices.java.
> >
> > I created the corresponding CF in a keyspace called “Keyspace1”
> > (“create keyspace Keyspace1;”) with:
> >
> >     "create column family Indexed1 with column_type='Standard' and
> >     comparator='UTF8Type' and keys_cached=200000 and read_repair_chance=1.0 and
> >     rows_cached=20000 and column_metadata=[{column_name: birthdate,
> >     validation_class: LongType, index_name: dateIndex, index_type:
> >     KEYS},{column_name: birthmonth, validation_class: LongType, index_name:
> >     monthIndex, index_type: KEYS}];"
> >
> > and the example GetIndexedSlices.java worked fine.
> >
> > Output of CF Indexed1:
> > ---------------------------------------
> > [default@Keyspace1] list Indexed1;
> > Using default limit of 100
> > -------------------
> > RowKey: fake_key_12
> > => (column=birthdate, value=1974, timestamp=1300110485826059)
> > => (column=birthmonth, value=0, timestamp=1300110485826060)
> > => (column=fake_column_0, value=66616b655f76616c75655f305f3132,
> >    timestamp=1300110485826056)
> > => (column=fake_column_1, value=66616b655f76616c75655f315f3132,
> >    timestamp=1300110485826057)
> > => (column=fake_column_2, value=66616b655f76616c75655f325f3132,
> >    timestamp=1300110485826058)
> > -------------------
> > RowKey: fake_key_8
> > => (column=birthdate, value=1974, timestamp=1300110485826039)
> > => (column=birthmonth, value=8, timestamp=1300110485826040)
> > => (column=fake_column_0, value=66616b655f76616c75655f305f38,
> >    timestamp=1300110485826036)
> > => (column=fake_column_1, value=66616b655f76616c75655f315f38,
> >    timestamp=1300110485826037)
> > => (column=fake_column_2, value=66616b655f76616c75655f325f38,
> >    timestamp=1300110485826038)
> > -------------------
> > ....
> >
> > Now to the problem:
> > As we have another column format in our cluster (using TimeUUIDType as
> > comparator in the CF definition), I adapted the application to our schema on a
> > cassandra-0.7.3 cluster.
> >
> > We use a manually defined UUID for a mandator id index
> > (00000000-0000-1000-0000-000000000000) and another one for a userid index
> > (00000001-0000-1000-0000-000000000000). It can be created with:
> >
> >     "create column family ByUser with column_type='Standard' and
> >     comparator='TimeUUIDType' and keys_cached=200000 and read_repair_chance=1.0
> >     and rows_cached=20000 and column_metadata=[{column_name:
> >     00000000-0000-1000-0000-000000000000, validation_class: BytesType,
> >     index_name: mandatorIndex, index_type: KEYS}, {column_name:
> >     00000001-0000-1000-0000-000000000000, validation_class: BytesType,
> >     index_name: useridIndex, index_type: KEYS}];"
> >
> > which looks like this in the cluster using cassandra-cli:
> >
> > [default@Keyspace1] describe keyspace;
> > Keyspace: Keyspace1:
> >   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
> >     Replication Factor: 1
> >   Column Families:
> >     ColumnFamily: ByUser
> >       Columns sorted by: org.apache.cassandra.db.marshal.TimeUUIDType
> >       Row cache size / save period: 20000.0/0
> >       Key cache size / save period: 200000.0/14400
> >       Memtable thresholds: 0.2953125/63/1440
> >       GC grace seconds: 864000
> >       Compaction min/max thresholds: 4/32
> >       Read repair chance: 0.01
> >       Built indexes: [ByUser.mandatorIndex, ByUser.useridIndex]
> >       Column Metadata:
> >         Column Name: 00000001-0000-1000-0000-000000000000
> >           Validation Class: org.apache.cassandra.db.marshal.BytesType
> >           Index Name: useridIndex
> >           Index Type: KEYS
> >         Column Name: 00000000-0000-1000-0000-000000000000
> >           Validation Class: org.apache.cassandra.db.marshal.BytesType
> >           Index Name: mandatorIndex
> >           Index Type: KEYS
> >     ColumnFamily: Indexed1
> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
> >       Row cache size / save period: 20000.0/0
> >       Key cache size / save period: 200000.0/14400
> >       Memtable thresholds: 0.2953125/63/1440
> >       GC grace seconds: 864000
> >       Compaction min/max thresholds: 4/32
> >       Read repair chance: 0.01
> >       Built indexes: [Indexed1.dateIndex, Indexed1.monthIndex]
> >       Column Metadata:
> >         Column Name: birthmonth (birthmonth)
> >           Validation Class: org.apache.cassandra.db.marshal.LongType
> >           Index Name: monthIndex
> >           Index Type: KEYS
> >         Column Name: birthdate (birthdate)
> >           Validation Class: org.apache.cassandra.db.marshal.LongType
> >           Index Name: dateIndex
> >           Index Type: KEYS
> > [default@Keyspace1] list ByUser;
> > Using default limit of 100
> > -------------------
> > RowKey: testMandator!!user01
> > => (column=00000000-0000-1000-0000-000000000000,
> >    value=746573744d616e6461746f72, timestamp=1300111213321000)
> > => (column=00000001-0000-1000-0000-000000000000, value=757365723031,
> >    timestamp=1300111213322000)
> > => (column=f064b480-495e-11e0-abc4-0024e89fa587, value=3135,
> >    timestamp=1300111213561000)
> >
> > 1 Row Returned.
> >
> > The values of the index columns 00000000-0000-1000-0000-000000000000 and
> > 00000001-0000-1000-0000-000000000000 represent "testMandator" and
> > "user01" as bytes.
> > The third column is a randomly generated one with value "15". All three
> > are inserted by the GetTimeUUIDIndexedSlices app.
> > I attached both source files, GetIndexedSlices and GetTimeUUIDIndexedSlices.
> > Currently the second index expression, for the userid index, in the
> > GetTimeUUIDIndexedSlices.queryCf(...) method
> >
> >     indexQuery.addEqualsExpression(asByteArray(MANDATOR_UUID), new
> >             StringSerializer().toBytes(mandator));
> >     //indexQuery.addEqualsExpression(asByteArray(USERID_INDEX_UUID), new
> >     //        StringSerializer().toBytes(dummyUserId));
> >
> > is commented out, so GetTimeUUIDIndexedSlices will run. Using one
> > index expression works perfectly fine, but as soon as I add a second eq, gt,
> > gte or lt expression I get an IndexOutOfBoundsException (see below).
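
The asByteArray helper used above lives in the attached source and is not shown in this thread. Assuming it produces the usual 16-byte big-endian encoding of a UUID, a minimal sketch could look like the following. The class name and the method body are hypothetical; only the helper's name and the mandator UUID come from the mail.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class UuidBytesSketch {

    // Hypothetical stand-in for the asByteArray helper from the attachment:
    // most significant 8 bytes first, then the least significant 8 bytes.
    // ByteBuffer is big-endian by default.
    static byte[] asByteArray(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        // The manually defined column name used for the mandator index.
        UUID mandatorIndex = UUID.fromString("00000000-0000-1000-0000-000000000000");
        byte[] name = asByteArray(mandatorIndex);
        System.out.println("column name length: " + name.length);
    }
}
```

Column names built this way are 16 bytes long and suit a TimeUUIDType comparator, whereas the column values in this CF ("testMandator", "user01") are arbitrary byte strings of other lengths.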
> >
> > This issue can be easily reproduced by
> > - downloading the zznate example
> >   (https://github.com/zznate/hector-examples),
> > - mavenizing it to an eclipse project with "mvn clean eclipse:eclipse",
> > - importing it in eclipse and
> > - letting it run against a locally running cassandra instance (v0.7.3) which
> >   has the default settings (no changes in the .yaml)
> >
> > I hope that someone can help me with this issue ... after a couple of days
> > it's driving me bonkers.
> >
> > Thx in advance,
> > Johannes
> >
> > Exception:
> > ERROR 14:47:56,842 Error in ThreadPoolExecutor
> > java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: 6
> >     at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:51)
> >     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >     at java.lang.Thread.run(Thread.java:619)
> > Caused by: java.lang.IndexOutOfBoundsException: 6
> >     at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121)
> >     at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:56)
> >     at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
> >     at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
> >     at org.apache.cassandra.db.ColumnFamilyStore.satisfies(ColumnFamilyStore.java:1608)
> >     at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1552)
> >     at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
> >     ... 4 more
> > ERROR 14:47:56,852 Fatal exception in thread Thread[ReadStage:14,5,main]
> > java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: 6
> >     at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:51)
> >     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >
> > <GetIndexedSlices.java><GetTimeUUIDIndexedSlices.java>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
> <GetTimeUUIDIndexedSlices.java>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com