Something like this. Actually, I think it's better to extend the get_indexed_slices() API instead of creating a new Thrift method. I'd like to have something like this:
// here we run a query against the external search engine
List<byte[]> keys = performSphinxQuery(someFullTextSearchQuery);
IndexClause indexClause = new IndexClause();
// required API addition: set an explicit list of keys
indexClause.setKeys(keys);
indexClause.setExpressions(someFilteringExpressions);
List<KeySlice> finalResult = get_indexed_slices(colParent, indexClause, colPredicate, cLevel);

I can't solve my issue with a single get_indexed_slices() call. Here is the issue in more detail:

1) I have ~6 million records; in the future there could be many more.
2) There are > 10k different properties (stored as column values in Cassandra); in the future there could be many more.
3) Properties are text descriptions, int/float values, and string values.
4) I need to implement search over all properties: full-text search for text descriptions, range search for int/float properties.
5) A search query can use any combination of property expressions, e.g. a full-text search on a description combined with a range expression on an int/float field.
6) I have an external search engine (Sphinx) that indexes all string and text properties.
7) I still need to perform range searches on int and float fields.

So now I split my query expressions into 2 groups:

1) expressions that can be handled by the search engine
2) the others (additional filters)

For example, I run the first query against Sphinx and get a list of row keys 100k long (call it RESULT1). Now I need to filter it by the second group of expressions. Say the filter is the simple expression "age > 25". If I ran get_indexed_slices() with that expression alone, the result could contain half of my records (call it RESULT2). I would then have to compute the intersection of RESULT1 and RESULT2 on the client side, which could take a lot of time and memory. That is why I can't use a single get_indexed_slices() here.

For me it is better to iterate over RESULT1 (100k records) on the client side and filter by age, ending up with 10-50k records as the final result. The disadvantage is that I have to fetch all 100k records (a sketch of this fallback follows below).

Evgeny.
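A minimal sketch of the client-side filtering fallback described above, assuming a Cassandra 0.7-era Thrift client (where row keys are ByteBuffers), a hypothetical column family named "Properties", and that the age column stores a 4-byte big-endian int; the batch size and class names are likewise illustrative, not part of any existing API:

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.SlicePredicate;

public class ClientSideAgeFilter {
    private static final ByteBuffer AGE = ByteBuffer.wrap("age".getBytes());
    private static final int BATCH_SIZE = 1000;

    // Fetches only the "age" column for each Sphinx hit, in batches,
    // and keeps the keys whose age exceeds minAge.
    static List<ByteBuffer> filterByAge(Cassandra.Client client,
                                        List<ByteBuffer> sphinxKeys,
                                        int minAge) throws Exception {
        ColumnParent parent = new ColumnParent("Properties"); // hypothetical CF name
        SlicePredicate onlyAge = new SlicePredicate()
                .setColumn_names(Arrays.asList(AGE));
        List<ByteBuffer> matches = new ArrayList<ByteBuffer>();

        for (int i = 0; i < sphinxKeys.size(); i += BATCH_SIZE) {
            List<ByteBuffer> batch =
                sphinxKeys.subList(i, Math.min(i + BATCH_SIZE, sphinxKeys.size()));
            Map<ByteBuffer, List<ColumnOrSuperColumn>> rows =
                client.multiget_slice(batch, parent, onlyAge, ConsistencyLevel.ONE);
            for (Map.Entry<ByteBuffer, List<ColumnOrSuperColumn>> row : rows.entrySet()) {
                if (row.getValue().isEmpty())
                    continue; // row is missing or has no age column
                int age = row.getValue().get(0).getColumn().value.duplicate().getInt();
                if (age > minAge)
                    matches.add(row.getKey());
            }
        }
        return matches;
    }
}

Batching via multiget_slice keeps memory bounded on both sides, but it still transfers the age column for all 100k keys; that per-key round trip is exactly the overhead the proposed setKeys() extension would avoid by letting the server apply the index expressions to the supplied key list.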