I think you wanted to use Int32Type instead of IntegerType for creating the indexes. IntegerType is actually representative of java.math.BigInteger.
On Tue, Nov 8, 2011 at 12:28 PM, Nate Sammons <nsamm...@ften.com> wrote: > Interesting… if I switch the columns to be UTF8 instead of integers, like > this: > > > > create column family IndexTest with > > key_validation_class = UTF8Type > > and comparator = UTF8Type > > and column_metadata = [ > > {column_name:year, validation_class:UTF8Type, index_type: KEYS}, > > {column_name:month, validation_class:UTF8Type, index_type: KEYS}, > > {column_name:day, validation_class:UTF8Type, index_type: KEYS}, > > {column_name:hour, validation_class:UTF8Type, index_type: KEYS}, > > {column_name:minute, validation_class:UTF8Type, index_type: KEYS}, > > {column_name:data, validation_class:UTF8Type} > > ]; > > > > > > And change the hector code to use setString(…) instead of setInteger(…). > > > > Then everything works fine. Is there a CQL bug with respect to non-string > columns? > > > > > > Thanks, > > > > -nate > > > > > > > > From: Nate Sammons [mailto:nsamm...@ften.com] > Sent: Tuesday, November 08, 2011 11:14 AM > > To: user@cassandra.apache.org > Subject: RE: Secondary index issue, unable to query for records that should > be there > > > > Note that I had identical behavior using a fresh download of Cassandra 1.0.2 > as of today. > > > > Thanks, > > > > -nate > > > > > > From: Nate Sammons [mailto:nsamm...@ften.com] > Sent: Tuesday, November 08, 2011 10:20 AM > To: user@cassandra.apache.org > Subject: RE: Secondary index issue, unable to query for records that should > be there > > > > I restarted with logging turned up to DEBUG, and after quite a bit of > logging during startup, I re-ran a query: > > > > > > get IndexTest where year=2011 and month=1 and day=14 and hour=18 and > minute=49; > > > > > > produced the following in the following: > > > > DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,823 CassandraServer.java (line > 728) scan > > DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,824 StorageProxy.java (line > 1017) restricted ranges for query [-1,-1] are > [[-1,160425280223280959086247334056682279392], > (160425280223280959086247334056682279392,-1]] > > DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,824 StorageProxy.java (line > 1104) scan ranges are > [-1,160425280223280959086247334056682279392],(160425280223280959086247334056682279392,-1] > > DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,825 ReadCallback.java (line 77) > Blockfor/repair is 1/false; setting up requests to natebookpro/127.0.1.1 > > DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,826 StorageProxy.java (line > 1131) reading org.apache.cassandra.db.IndexScanCommand@7bc203c from > natebookpro/127.0.1.1 > > DEBUG [ReadStage:46] 2011-11-08 10:19:21,827 KeysSearcher.java (line 96) > Primary scan clause is minute > > DEBUG [ReadStage:46] 2011-11-08 10:19:21,827 KeysSearcher.java (line 109) > Expanding slice filter to entire row to cover additional expressions > > DEBUG [ReadStage:46] 2011-11-08 10:19:21,827 KeysSearcher.java (line 151) > Scanning index 'IndexTest.minute EQ 49' starting with > > DEBUG [ReadStage:46] 2011-11-08 10:19:21,827 CollationController.java (line > 189) collectAllData > > DEBUG [ReadStage:46] 2011-11-08 10:19:21,828 KeysSearcher.java (line 163) > fetched null > > DEBUG [ReadStage:46] 2011-11-08 10:19:21,828 IndexScanVerbHandler.java (line > 46) Sending RangeSliceReply{rows=} to 808@natebookpro/127.0.1.1 > > DEBUG [RequestResponseStage:21] 2011-11-08 10:19:21,829 > ResponseVerbHandler.java (line 44) Processing response on a callback from > 808@natebookpro/127.0.1.1 > > DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,829 ReadCallback.java (line 77) > Blockfor/repair is 1/false; setting up requests to natebookpro/127.0.1.1 > > DEBUG [pool-2-thread-3] 2011-11-08 10:19:21,830 StorageProxy.java (line > 1131) reading org.apache.cassandra.db.IndexScanCommand@6a25a21d from > natebookpro/127.0.1.1 > > DEBUG [ReadStage:47] 2011-11-08 10:19:21,831 KeysSearcher.java (line 96) > Primary scan clause is minute > > DEBUG [ReadStage:47] 2011-11-08 10:19:21,832 KeysSearcher.java (line 109) > Expanding slice filter to entire row to cover additional expressions > > DEBUG [ReadStage:47] 2011-11-08 10:19:21,832 KeysSearcher.java (line 151) > Scanning index 'IndexTest.minute EQ 49' starting with > > DEBUG [ReadStage:47] 2011-11-08 10:19:21,832 CollationController.java (line > 189) collectAllData > > DEBUG [ReadStage:47] 2011-11-08 10:19:21,832 KeysSearcher.java (line 163) > fetched null > > DEBUG [ReadStage:47] 2011-11-08 10:19:21,833 IndexScanVerbHandler.java (line > 46) Sending RangeSliceReply{rows=} to 809@natebookpro/127.0.1.1 > > DEBUG [RequestResponseStage:22] 2011-11-08 10:19:21,834 > ResponseVerbHandler.java (line 44) Processing response on a callback from > 809@natebookpro/127.0.1.1 > > > > > > > > Whereas a direct read of a key using “get IndexTest[2011-1-14-18-49--1];” > produced a result, and the following in the logs: > > > > DEBUG [pool-2-thread-2] 2011-11-08 10:11:20,153 CassandraServer.java (line > 323) get_slice > > DEBUG [pool-2-thread-2] 2011-11-08 10:11:20,159 StorageProxy.java (line 623) > Command/ConsistencyLevel is SliceFromReadCommand(table='Test', > key='323031312d312d31342d31382d34392d2d31', > column_parent='QueryPath(columnFamilyName='IndexTest', > superColumnName='null', columnName='null')', start='', finish='', > reversed=false, count=1000000)/ONE > > DEBUG [pool-2-thread-2] 2011-11-08 10:11:20,159 ReadCallback.java (line 77) > Blockfor/repair is 1/true; setting up requests to natebookpro/127.0.1.1 > > DEBUG [pool-2-thread-2] 2011-11-08 10:11:20,159 StorageProxy.java (line 639) > reading data locally > > DEBUG [ReadStage:37] 2011-11-08 10:11:20,160 StorageProxy.java (line 792) > LocalReadRunnable reading SliceFromReadCommand(table='Test', > key='323031312d312d31342d31382d34392d2d31', > column_parent='QueryPath(columnFamilyName='IndexTest', > superColumnName='null', columnName='null')', start='', finish='', > reversed=false, count=1000000) > > DEBUG [ReadStage:37] 2011-11-08 10:11:20,161 CollationController.java (line > 189) collectAllData > > DEBUG [ReadStage:37] 2011-11-08 10:11:20,162 SliceQueryFilter.java (line > 123) collecting 0 of 1000000: data:false:512@1320769510502017 > > DEBUG [ReadStage:37] 2011-11-08 10:11:20,163 SliceQueryFilter.java (line > 123) collecting 1 of 1000000: day:false:4@1320769510502014 > > DEBUG [ReadStage:37] 2011-11-08 10:11:20,163 SliceQueryFilter.java (line > 123) collecting 2 of 1000000: hour:false:4@1320769510502015 > > DEBUG [ReadStage:37] 2011-11-08 10:11:20,163 SliceQueryFilter.java (line > 123) collecting 3 of 1000000: minute:false:4@1320769510502016 > > DEBUG [ReadStage:37] 2011-11-08 10:11:20,163 SliceQueryFilter.java (line > 123) collecting 4 of 1000000: month:false:4@1320769510502013 > > DEBUG [ReadStage:37] 2011-11-08 10:11:20,163 SliceQueryFilter.java (line > 123) collecting 5 of 1000000: year:false:4@1320769510502012 > > DEBUG [pool-2-thread-2] 2011-11-08 10:11:20,170 StorageProxy.java (line 689) > Read: 10 ms. > > > > > > > > Note that a query for “get IndexTest where minute=49” (which also returns no > records) results in the following logs: > > > > > > DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,210 CassandraServer.java (line > 728) scan > > DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,211 StorageProxy.java (line > 1017) restricted ranges for query [-1,-1] are > [[-1,160425280223280959086247334056682279392], > (160425280223280959086247334056682279392,-1]] > > DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,211 StorageProxy.java (line > 1104) scan ranges are > [-1,160425280223280959086247334056682279392],(160425280223280959086247334056682279392,-1] > > DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,212 ReadCallback.java (line 77) > Blockfor/repair is 1/false; setting up requests to natebookpro/127.0.1.1 > > DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,212 StorageProxy.java (line > 1131) reading org.apache.cassandra.db.IndexScanCommand@547d6c11 from > natebookpro/127.0.1.1 > > DEBUG [ReadStage:42] 2011-11-08 10:13:40,213 KeysSearcher.java (line 96) > Primary scan clause is minute > > DEBUG [ReadStage:42] 2011-11-08 10:13:40,213 KeysSearcher.java (line 109) > Expanding slice filter to entire row to cover additional expressions > > DEBUG [ReadStage:42] 2011-11-08 10:13:40,214 KeysSearcher.java (line 151) > Scanning index 'IndexTest.minute EQ 49' starting with > > DEBUG [ReadStage:42] 2011-11-08 10:13:40,214 CollationController.java (line > 189) collectAllData > > DEBUG [ReadStage:42] 2011-11-08 10:13:40,214 KeysSearcher.java (line 163) > fetched null > > DEBUG [ReadStage:42] 2011-11-08 10:13:40,214 IndexScanVerbHandler.java (line > 46) Sending RangeSliceReply{rows=} to 462@natebookpro/127.0.1.1 > > DEBUG [RequestResponseStage:17] 2011-11-08 10:13:40,215 > ResponseVerbHandler.java (line 44) Processing response on a callback from > 462@natebookpro/127.0.1.1 > > DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,216 ReadCallback.java (line 77) > Blockfor/repair is 1/false; setting up requests to natebookpro/127.0.1.1 > > DEBUG [pool-2-thread-2] 2011-11-08 10:13:40,216 StorageProxy.java (line > 1131) reading org.apache.cassandra.db.IndexScanCommand@62132898 from > natebookpro/127.0.1.1 > > DEBUG [ReadStage:43] 2011-11-08 10:13:40,217 KeysSearcher.java (line 96) > Primary scan clause is minute > > DEBUG [ReadStage:43] 2011-11-08 10:13:40,217 KeysSearcher.java (line 109) > Expanding slice filter to entire row to cover additional expressions > > DEBUG [ReadStage:43] 2011-11-08 10:13:40,218 KeysSearcher.java (line 151) > Scanning index 'IndexTest.minute EQ 49' starting with > > DEBUG [ReadStage:43] 2011-11-08 10:13:40,218 CollationController.java (line > 189) collectAllData > > DEBUG [ReadStage:43] 2011-11-08 10:13:40,218 KeysSearcher.java (line 163) > fetched null > > DEBUG [ReadStage:43] 2011-11-08 10:13:40,218 IndexScanVerbHandler.java (line > 46) Sending RangeSliceReply{rows=} to 463@natebookpro/127.0.1.1 > > DEBUG [RequestResponseStage:18] 2011-11-08 10:13:40,219 > ResponseVerbHandler.java (line 44) Processing response on a callback from > 463@natebookpro/127.0.1.1 > > > > > > > > > > > > Thanks, > > > > -nate > > > > > > > > > > > > > > From: Jake Luciani [mailto:jak...@gmail.com] > Sent: Tuesday, November 08, 2011 8:56 AM > To: user@cassandra.apache.org > Subject: Re: Secondary index issue, unable to query for records that should > be there > > > > Hi Nate, > > > > Could you try running it with debug enabled on the logs? it will give more > insite into what's going on. > > > > -Jake > > > > On Tue, Nov 8, 2011 at 3:45 PM, Nate Sammons <nsamm...@ften.com> wrote: > > This is against a single server, not a cluster. Replication factor for the > keyspace is set to 1, CL is the default for Hector, which I think is QUORUM. > > > > I’m trying to get a simple test together that shows this. Does anyone know > if multiple indexes like this are efficient? > > > > Thanks, > > > > -nate > > > > > > From: Riyad Kalla [mailto:rka...@gmail.com] > Sent: Monday, November 07, 2011 4:31 PM > To: user@cassandra.apache.org > Subject: Re: Secondary index issue, unable to query for records that should > be there > > > > Nate, is this all against a single Cassandra server, or do you have a ring > setup? If you do have a ring setup, what is your replicationfactor set to? > Also what ConsistencyLevel are you writing with when storing the values? > > > > -R > > On Mon, Nov 7, 2011 at 2:43 PM, Nate Sammons <nsamm...@ften.com> wrote: > > Hello, > > > > I’m experimenting with Cassandra (DataStax Enterprise 1.0.3), and I’ve got a > CF with several secondary indexes to try out some options. Right now I have > the following to create my CF using the CLI: > > > > create column family MyTest with > > key_validation_class = UTF8Type > > and comparator = UTF8Type > > and column_metadata = [ > > -- absolute timestamp for this message, also indexed > year/month/day/hour/minute > > -- index these as they are low cardinality > > {column_name:messageTimestamp, validation_class:LongType}, > > {column_name:messageYear, validation_class:IntegerType, index_type: > KEYS}, > > {column_name:messageMonth, validation_class:IntegerType, index_type: > KEYS}, > > {column_name:messageDay, validation_class:IntegerType, index_type: > KEYS}, > > {column_name:messageHour, validation_class:IntegerType, index_type: > KEYS}, > > {column_name:messageMinute, validation_class:IntegerType, index_type: > KEYS}, > > > > … other non-indexed columns defined > > > > ]; > > > > > > So when I insert data, I calculate a year/month/day/hour/minute and set > these values on a Hector ColumnFamilyUpdater instance and update that way. > Then later I can query from the command line with CQL such as: > > > > get MyTest where messageYear=2011 and messageMonth=6 and > messageDay=1 and messageHour=13 and messageMinute=44; > > > > etc. This generally works, however at some point queries that I know should > return data no longer return any rows. > > > > So for instance, part way through my test (inserting 250K rows), I can query > for what should be there and get data back such as the above query, but > later that same query returns 0 rows. Similarly, with fewer clauses in the > expression, like this: > > > > get MyTest where messageYear=2011 and messageMonth=6; > > > > Will also return 0 rows. > > > > > > ??????? > > Any idea what could be going wrong? I’m not getting any exceptions in my > client during the write, and I don’t see anything in the logs (no errors > anyway). > > > > > > > > A second question – is what I’m doing insane? I’m not sure that performance > on CQL queries with multiple indexed columns is good (does Cassandra > intelligently use all available indexes on these queries?) > > > > > > > > Thanks, > > > > -nate > > > > > > -- > http://twitter.com/tjake