Good work. Aaron
On 17/03/2011, at 4:37 PM, Jonathan Ellis <jbel...@gmail.com> wrote:

> Thanks for tracking that down, Roland. I've created
> https://issues.apache.org/jira/browse/CASSANDRA-2347 to fix this.
>
> On Wed, Mar 16, 2011 at 10:37 AM, Roland Gude <roland.g...@yoochoose.com>
> wrote:
>> I have applied the suggested changes in my local source tree and ran all
>> my test cases (the supplied ones as well as those with real data).
>>
>> They work now.
>>
>> From: Roland Gude [mailto:roland.g...@yoochoose.com]
>> Sent: Wednesday, 16 March 2011 16:29
>> To: user@cassandra.apache.org
>> Subject: RE: RE: problems while TimeUUIDType-index-querying with two
>> expressions
>>
>> While debugging I found something that might be the issue (please
>> correct me if I am wrong):
>>
>> In ColumnFamilyStore.java, lines 1597 to 1613 contain the code that checks
>> whether a column satisfies an index expression.
>>
>> In line 1608 it compares the value of the column with the value given in
>> the expression.
>>
>> For this comparison it uses the comparator of the column family, while it
>> should use the comparator of the column's validation class.
>>
>> private static boolean satisfies(ColumnFamily data, IndexClause clause,
>>                                  IndexExpression first)
>> {
>>     for (IndexExpression expression : clause.expressions)
>>     {
>>         // (we can skip "first" since we already know it's satisfied)
>>         if (expression == first)
>>             continue;
>>         // check column data vs expression
>>         IColumn column = data.getColumn(expression.column_name);
>>         if (column == null)
>>             return false;
>>         int v = data.getComparator().compare(column.value(),
>>                                              expression.value);
>>         if (!satisfies(v, expression.op))
>>             return false;
>>     }
>>     return true;
>> }
>>
>> Line 1608 should be changed from:
>>
>>     int v = data.getComparator().compare(column.value(),
>>                                          expression.value);
>>
>> to:
>>
>>     int v = data.metadata().getValueValidator(expression.column_name)
>>                 .compare(column.value(), expression.value);
>>
>> Greetings,
>> roland
>>
>> From: Roland Gude [mailto:roland.g...@yoochoose.com]
>> Sent: Wednesday, 16 March 2011 14:50
>> To: user@cassandra.apache.org
>> Subject: RE: RE: problems while TimeUUIDType-index-querying with two
>> expressions
>>
>> Hi Aaron,
>>
>> now I am completely confused.
>>
>> The code that did not work for days now, like a miracle, works even
>> against the unpatched Cassandra 0.7.3, but the test case still does not…
>>
>> There seems to be some randomness in whether it works or not (which is a
>> bad sign, I think)… I will debug a little deeper into this and report
>> anything I find.
>>
>> Greetings,
>> roland
>>
>> From: aaron morton [mailto:aa...@thelastpickle.com]
>> Sent: Wednesday, 16 March 2011 01:15
>> To: user@cassandra.apache.org
>> Subject: Re: RE: problems while TimeUUIDType-index-querying with two
>> expressions
>>
>> Have attached a patch to
>> https://issues.apache.org/jira/browse/CASSANDRA-2328
>>
>> Can you give it a try ?
>> You should not get an InvalidRequestException when you send an invalid
>> name or value in the query expression.
>>
>> Aaron
>>
>> On 16 Mar 2011, at 10:30, aaron morton wrote:
>>
>> I will have the Jira I created finished soon. It's a legitimate issue: we
>> should be validating the column names and values when a
>> get_indexed_slices() request is sent. The error in your original email
>> shows that.
>>
>> WRT your code example: you are using the TimeUUID validator for the column
>> name when creating the index expression, but a string serialiser for the
>> value...
>>
>>     IndexedSlicesQuery<String, UUID, String> indexQuery = HFactory
>>             .createIndexedSlicesQuery(keyspace,
>>                     stringSerializer,
>>                     UUID_SERIALIZER, stringSerializer);
>>     indexQuery.addEqualsExpression(MANDATOR_UUID, mandator);
>>
>> But your schema says it is a bytes type...
>>
>>     column_metadata=[{column_name: 00000000-0000-1000-0000-000000000000,
>>     validation_class: BytesType, index_name: mandatorIndex, index_type: KEYS},
>>     {column_name: 00000001-0000-1000-0000-000000000000, validation_class:
>>     BytesType, index_name: useridIndex, index_type: KEYS}];"
>>
>> Once I have the patch, can you apply it and run your test again ?
>>
>> You may also want to ask on the Hector list whether it automagically
>> checks that you are using the correct types when creating an
>> IndexedSlicesQuery.
>>
>> Aaron
>>
>> On 15 Mar 2011, at 22:41, Roland Gude wrote:
>>
>> Forgot to attach the source code… here it comes
>>
>> From: Roland Gude [mailto:roland.g...@yoochoose.com]
>> Sent: Tuesday, 15 March 2011 10:39
>> To: user@cassandra.apache.org
>> Subject: RE: problems while TimeUUIDType-index-querying with two expressions
>>
>> Actually it's not the column values that should be UUIDs in our case, but
>> the column keys. The CF uses TimeUUID ordering and the values are just some
>> ByteArrays.
>> Even after changing the code to use UUIDSerializer instead of serializing
>> the UUIDs manually, the issue still exists.
>>
>> As far as I can see, there is nothing wrong with the IndexExpression.
>>
>> Using two index expressions with key=TimeUUID and value=anything does not
>> work.
>>
>> Using one index expression (either one of the two) alone works fine.
>>
>> I refactored Johannes' code into a JUnit test case. It needs the cluster
>> configured as described in Johannes' mail.
>>
>> There are three cases: two with one of the index expressions each, and one
>> with both index expressions. The one with both index expressions will never
>> finish, and you will see the exception in the Cassandra logs.
>>
>> Bye,
>> roland
>>
>> From: aaron morton [mailto:aa...@thelastpickle.com]
>> Sent: Tuesday, 15 March 2011 07:54
>> To: user@cassandra.apache.org
>> Cc: Juergen Link; Roland Gude; her...@datastax.com
>> Subject: Re: problems while TimeUUIDType-index-querying with two expressions
>>
>> Perfectly reasonable, created
>> https://issues.apache.org/jira/browse/CASSANDRA-2328
>>
>> Aaron
>>
>> On 15 Mar 2011, at 16:52, Jonathan Ellis wrote:
>>
>> Sounds like we should send an InvalidRequestException then.
>>
>> On Mon, Mar 14, 2011 at 8:06 PM, aaron morton <aa...@thelastpickle.com>
>> wrote:
>> >> It's failing when comparing two TimeUUID values because one of them is
>> >> not properly formatted. In this case it's comparing a stored value with
>> >> the value passed in the get_indexed_slices() query expression.
>> >> I'm going to assume it's the value passed for the expression.
>> >> When you create the IndexedSlicesQuery this is incorrect:
>> >>
>> >>     IndexedSlicesQuery<String, byte[], byte[]> indexQuery = HFactory
>> >>             .createIndexedSlicesQuery(keyspace,
>> >>                     stringSerializer, bytesSerializer, bytesSerializer);
>> >>
>> >> Use a UUIDSerializer for the last param and then pass the UUID you want
>> >> to build the expression.
>> >> Rather than the string/byte thing you are passing.
>> >> Hope that helps.
>> >> Aaron
>> >>
>> >> On 15 Mar 2011, at 04:17, Johannes Hoerle wrote:
>> >>
>> >> Hi all,
>> >>
>> >> in order to improve our queries, we started to use IndexedSlicesQueries
>> >> from the hector project (https://github.com/zznate/hector-examples). I
>> >> followed the instructions for creating an IndexedSlicesQuery with
>> >> GetIndexedSlices.java.
>> >> I created the corresponding CF in a keyspace called “Keyspace1”
>> >> (“create keyspace Keyspace1;”) with:
>> >>
>> >> "create column family Indexed1 with column_type='Standard' and
>> >> comparator='UTF8Type' and keys_cached=200000 and read_repair_chance=1.0 and
>> >> rows_cached=20000 and column_metadata=[{column_name: birthdate,
>> >> validation_class: LongType, index_name: dateIndex, index_type:
>> >> KEYS},{column_name: birthmonth, validation_class: LongType, index_name:
>> >> monthIndex, index_type: KEYS}];"
>> >>
>> >> and the example GetIndexedSlices.java worked fine.
>> >>
>> >> Output of CF Indexed1:
>> >> ---------------------------------------
>> >> [default@Keyspace1] list Indexed1;
>> >> Using default limit of 100
>> >> -------------------
>> >> RowKey: fake_key_12
>> >> => (column=birthdate, value=1974, timestamp=1300110485826059)
>> >> => (column=birthmonth, value=0, timestamp=1300110485826060)
>> >> => (column=fake_column_0, value=66616b655f76616c75655f305f3132,
>> >>     timestamp=1300110485826056)
>> >> => (column=fake_column_1, value=66616b655f76616c75655f315f3132,
>> >>     timestamp=1300110485826057)
>> >> => (column=fake_column_2, value=66616b655f76616c75655f325f3132,
>> >>     timestamp=1300110485826058)
>> >> -------------------
>> >> RowKey: fake_key_8
>> >> => (column=birthdate, value=1974, timestamp=1300110485826039)
>> >> => (column=birthmonth, value=8, timestamp=1300110485826040)
>> >> => (column=fake_column_0, value=66616b655f76616c75655f305f38,
>> >>     timestamp=1300110485826036)
>> >> => (column=fake_column_1, value=66616b655f76616c75655f315f38,
>> >>     timestamp=1300110485826037)
>> >> => (column=fake_column_2, value=66616b655f76616c75655f325f38,
>> >>     timestamp=1300110485826038)
>> >> -------------------
>> >> ....
>> >>
>> >> Now to the problem:
>> >> As we have another column format in our cluster (using TimeUUIDType as
>> >> comparator in the CF definition), I adapted the application to our schema
>> >> on a cassandra-0.7.3 cluster.
>> >> We use a manually defined UUID for a mandator id index
>> >> (00000000-0000-1000-0000-000000000000) and another one for a userid index
>> >> (00000001-0000-1000-0000-000000000000). It can be created with:
>> >>
>> >> "create column family ByUser with column_type='Standard' and
>> >> comparator='TimeUUIDType' and keys_cached=200000 and read_repair_chance=1.0
>> >> and rows_cached=20000 and column_metadata=[{column_name:
>> >> 00000000-0000-1000-0000-000000000000, validation_class: BytesType,
>> >> index_name: mandatorIndex, index_type: KEYS}, {column_name:
>> >> 00000001-0000-1000-0000-000000000000, validation_class: BytesType,
>> >> index_name: useridIndex, index_type: KEYS}];"
>> >>
>> >> which looks like this in the cluster using cassandra-cli:
>> >>
>> >> [default@Keyspace1] describe keyspace;
>> >> Keyspace: Keyspace1:
>> >>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>> >>     Replication Factor: 1
>> >>   Column Families:
>> >>     ColumnFamily: ByUser
>> >>       Columns sorted by: org.apache.cassandra.db.marshal.TimeUUIDType
>> >>       Row cache size / save period: 20000.0/0
>> >>       Key cache size / save period: 200000.0/14400
>> >>       Memtable thresholds: 0.2953125/63/1440
>> >>       GC grace seconds: 864000
>> >>       Compaction min/max thresholds: 4/32
>> >>       Read repair chance: 0.01
>> >>       Built indexes: [ByUser.mandatorIndex, ByUser.useridIndex]
>> >>       Column Metadata:
>> >>         Column Name: 00000001-0000-1000-0000-000000000000
>> >>           Validation Class: org.apache.cassandra.db.marshal.BytesType
>> >>           Index Name: useridIndex
>> >>           Index Type: KEYS
>> >>         Column Name: 00000000-0000-1000-0000-000000000000
>> >>           Validation Class: org.apache.cassandra.db.marshal.BytesType
>> >>           Index Name: mandatorIndex
>> >>           Index Type: KEYS
>> >>     ColumnFamily: Indexed1
>> >>       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>> >>       Row cache size / save period: 20000.0/0
>> >>       Key cache size / save period: 200000.0/14400
>> >>       Memtable thresholds: 0.2953125/63/1440
>> >>       GC grace seconds: 864000
>> >>       Compaction min/max thresholds: 4/32
>> >>       Read repair chance: 0.01
>> >>       Built indexes: [Indexed1.dateIndex, Indexed1.monthIndex]
>> >>       Column Metadata:
>> >>         Column Name: birthmonth (birthmonth)
>> >>           Validation Class: org.apache.cassandra.db.marshal.LongType
>> >>           Index Name: monthIndex
>> >>           Index Type: KEYS
>> >>         Column Name: birthdate (birthdate)
>> >>           Validation Class: org.apache.cassandra.db.marshal.LongType
>> >>           Index Name: dateIndex
>> >>           Index Type: KEYS
>> >> [default@Keyspace1] list ByUser;
>> >> Using default limit of 100
>> >> -------------------
>> >> RowKey: testMandator!!user01
>> >> => (column=00000000-0000-1000-0000-000000000000,
>> >>     value=746573744d616e6461746f72, timestamp=1300111213321000)
>> >> => (column=00000001-0000-1000-0000-000000000000, value=757365723031,
>> >>     timestamp=1300111213322000)
>> >> => (column=f064b480-495e-11e0-abc4-0024e89fa587, value=3135,
>> >>     timestamp=1300111213561000)
>> >>
>> >> 1 Row Returned.
>> >>
>> >> The values of the index columns 00000000-0000-1000-0000-000000000000 and
>> >> 00000001-0000-1000-0000-000000000000 represent "testMandator" and
>> >> "user01" as bytes.
>> >> The third column is a randomly generated one with value "15" that is
>> >> inserted by the GetTimeUUIDIndexedSlices app.
>> >> I attached both source files, GetIndexedSlices and GetTimeUUIDIndexedSlices.
>> >> Currently the second index expression for the userid index in the
>> >> GetTimeUUIDIndexedSlices.queryCf(...)
>> >> method
>> >>
>> >>     indexQuery.addEqualsExpression(asByteArray(MANDATOR_UUID),
>> >>             new StringSerializer().toBytes(mandator));
>> >>     //indexQuery.addEqualsExpression(asByteArray(USERID_INDEX_UUID),
>> >>     //        new StringSerializer().toBytes(dummyUserId));
>> >>
>> >> is commented out, so the GetTimeUUIDIndexedSlices will run. Using one
>> >> IndexQuery works perfectly fine but as soon as I add a second eq, gt, gte or
>> >> lt expression I get an IndexOutOfBoundsException (see below).
>> >>
>> >> This issue can be easily reproduced by
>> >> - downloading the zznate example
>> >>   (https://github.com/zznate/hector-examples),
>> >> - mavenizing it to an eclipse project with "mvn clean eclipse:eclipse",
>> >> - importing it in eclipse and
>> >> - letting it run against a locally running cassandra instance (v0.7.3) which
>> >>   has the default settings (no changes in the .yaml)
>> >>
>> >> I hope that someone can help me with this issue ... after a couple of days
>> >> it's driving me bonkers.
>> >>
>> >> Thx in advance,
>> >> Johannes
>> >>
>> >> Exception:
>> >> ERROR 14:47:56,842 Error in ThreadPoolExecutor
>> >> java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: 6
>> >>     at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:51)
>> >>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
>> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >>     at java.lang.Thread.run(Thread.java:619)
>> >> Caused by: java.lang.IndexOutOfBoundsException: 6
>> >>     at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121)
>> >>     at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:56)
>> >>     at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
>> >>     at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
>> >>     at org.apache.cassandra.db.ColumnFamilyStore.satisfies(ColumnFamilyStore.java:1608)
>> >>     at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1552)
>> >>     at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
>> >>     ... 4 more
>> >> ERROR 14:47:56,852 Fatal exception in thread Thread[ReadStage:14,5,main]
>> >> java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: 6
>> >>     at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:51)
>> >>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
>> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >>
>> >> <GetIndexedSlices.java><GetTimeUUIDIndexedSlices.java>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>> <GetTimeUUIDIndexedSlices.java>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
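
For readers of the archive, here is a minimal standalone sketch of the mismatch Roland describes in the thread: a value stored under a BytesType validator is compared with a comparator that expects a fixed UUID layout. It does not use Cassandra classes. The two comparison methods are hypothetical stand-ins, and the assumption that the TimeUUID-style comparison reads the byte at offset 6 is taken only from the stack trace (HeapByteBuffer.get failing with index 6), not from the Cassandra source.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustration only: hypothetical stand-ins, not the Cassandra implementations.
final class ValidatorVsComparator {

    // Stand-in for a bytes-style value validator: unsigned lexicographic
    // comparison that works for values of any length.
    static int compareUnsigned(ByteBuffer a, ByteBuffer b) {
        int n = Math.min(a.remaining(), b.remaining());
        for (int i = 0; i < n; i++) {
            int x = a.get(a.position() + i) & 0xFF;
            int y = b.get(b.position() + i) & 0xFF;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return Integer.compare(a.remaining(), b.remaining());
    }

    // Stand-in for a TimeUUID-style column comparator. Assumption: it reads a
    // fixed offset (byte 6) first, so inputs shorter than 7 bytes cannot be compared.
    static int compareTimeUuidLike(ByteBuffer a, ByteBuffer b) {
        int d = (a.get(a.position() + 6) & 0x0F) - (b.get(b.position() + 6) & 0x0F);
        return d != 0 ? d : compareUnsigned(a, b);
    }

    static ByteBuffer utf8(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    // True if the TimeUUID-style comparison throws for these two values.
    static boolean timeUuidLikeFails(String stored, String queried) {
        try {
            compareTimeUuidLike(utf8(stored), utf8(queried));
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // "user01" is 6 bytes long, "testMandator" is 12 bytes long.
        System.out.println(compareUnsigned(utf8("user01"), utf8("user01")));
        System.out.println(timeUuidLikeFails("user01", "user01"));
        System.out.println(timeUuidLikeFails("testMandator", "testMandator"));
    }
}
```

Under these assumptions the 6-byte value "user01" cannot be read at offset 6, while the 12-byte value "testMandator" can. That would be consistent with the report that a single expression works and the failure only appears once a second expression has to be checked in satisfies(), but this is a reading of the stack trace and not a confirmed analysis.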