Kenneth is right. Trying to port/support a relational model to a CQL model the 
way you are doing it is not going to go well. You won’t be able to scale or get 
the search flexibility that you want. It will make Cassandra seem like a bad 
fit. You want to play to Cassandra’s strengths – availability, low latency, 
scalability, etc. so you need to store the data the way you want to retrieve it 
(query first modeling!). You could look at defining the “right” partition and 
clustering keys, so that the searches are within a single, reasonably sized 
partition. And you could have lookup tables for other common search patterns 
(item_by_model_name, etc.)

If that kind of modeling gets you to a situation where you have too many lookup 
tables to keep consistent, you could consider something like DataStax 
Enterprise Search (embedded SOLR) to create SOLR indexes on searchable fields. 
A SOLR query will typically be an order of magnitude slower than a partition 
key lookup, though.

It really boils down to the purpose of the data store. If you are looking for 
primarily an “anything goes” search engine, Cassandra may not be a good choice. 
If you need Cassandra-level availability, extremely low latency queries (on 
known access patterns), high volume/low latency writes, easy scalability, etc. 
then you are going to have to rethink how you model the data.


Sean Durity

From: Kenneth Brotman <kenbrot...@yahoo.com.INVALID>
Sent: Thursday, February 07, 2019 7:01 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] RE: SASI queries- cqlsh vs java driver

Peter,

Sounds like you may need to use a different architecture.  Perhaps you need 
something like Presto or Kafka as a part of the solution.  If the data from the 
legacy system is wrong for Cassandra it’s an ETL problem?  You’d have to 
transform the data you want to use with Cassandra so that a proper data model 
for Cassandra can be used.

From: Peter Heitman [mailto:pe...@heitman.us]
Sent: Wednesday, February 06, 2019 10:05 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: Re: SASI queries- cqlsh vs java driver

Yes, I have read the material. The problem is that the application has a query 
facility available to the user where they can type in "(A = foo AND B = bar) OR 
C = chex" where A, B, and C are from a defined list of terms, many of which are 
columns in the mytable below while others are from other tables. This query 
facility was implemented and shipped years before we decided to move to 
Cassandra
On Thu, Feb 7, 2019, 8:21 AM Kenneth Brotman 
<kenbrot...@yahoo.com.invalid<mailto:kenbrot...@yahoo.com.invalid>> wrote:
The problem is you’re not using a query first design.  I would recommend first 
reading chapter 5 of Cassandra: The Definitive Guide by Jeff Carpenter and Eben 
Hewitt.  It’s available free online at this 
link<https://urldefense.proofpoint.com/v2/url?u=https-3A__books.google.com_books-3Fid-3DuW-2DPDAAAQBAJ-26pg-3DPA79-26lpg-3DPA79-26dq-3Djeff-2Bcarpenter-2Bchapter-2B5-26source-3Dbl-26ots-3D58bUYyNM-2DJ-26sig-3DACfU3U22U58-2DQPlz6kzo0zziNF-2DbP30l4Q-26hl-3Den-26sa-3DX-26ved-3D2ahUKEwi0n-2DnWzajgAhXnHzQIHf6jBJIQ6AEwAXoECAgQAQ-23v-3Donepage-26q-3Djeff-2520carpenter-2520chapter-25205-26f-3Dfalse&d=DwMFaQ&c=MtgQEAMQGqekjTjiAhkudQ&r=aC_gxC6z_4f9GLlbWiKzHm1vucZTtVYWDDvyLkh8IaQ&m=dsY_P-wGUZe0KuIuE01HDz4w9EI5AH4457c9uWyQx5g&s=C6imJ8BRMoV5A9NzORjdrEq6B77ZSAEO9dP__FAXUz8&e=>.

Kenneth Brotman

From: Peter Heitman [mailto:pe...@heitman.us<mailto:pe...@heitman.us>]
Sent: Wednesday, February 06, 2019 6:33 PM

To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: Re: SASI queries- cqlsh vs java driver

Yes, I "know" that allow filtering is a sign of a (possibly fatal) inefficient 
data model. I haven't figured out how to do it correctly yet
On Thu, Feb 7, 2019, 7:59 AM Kenneth Brotman 
<kenbrot...@yahoo.com.invalid<mailto:kenbrot...@yahoo.com.invalid>> wrote:
Exactly.  When you design your data model correctly you shouldn’t have to use 
ALLOW FILTERING in the queries.  That is not recommended.

Kenneth Brotman

From: Peter Heitman [mailto:pe...@heitman.us<mailto:pe...@heitman.us>]
Sent: Wednesday, February 06, 2019 6:09 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: Re: SASI queries- cqlsh vs java driver

You are completely right! My problem is that I am trying to port code for SQL 
to CQL for an application that provides the user with a relatively general 
search facility. The original implementation didn't worry about secondary 
indexes - it just took advantage of the ability to create arbitrarily complex 
queries with inner joins, left joins, etc. I am reimplimenting it to create a 
parse tree of CQL queries and doing the ANDs and ORs in the application. Of 
course once I get enough of this implemented I will have to load up the table 
with a large data set and see if it gives acceptable performance for our use 
case.
On Wed, Feb 6, 2019, 8:52 PM Kenneth Brotman 
<kenbrotman@yahoo.cominvalid<mailto:kenbrotman@yahoo.cominvalid>> wrote:
Isn’t that a lot of SASI indexes for one table.  Could you denormalize more to 
reduce both columns per table and SASI indexes per table?  Eight SASI indexes 
on one table seems like a lot.

Kenneth Brotman

From: Peter Heitman [mailto:pe...@heitman.us<mailto:pe...@heitman.us>]
Sent: Tuesday, February 05, 2019 6:59 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: Re: SASI queries- cqlsh vs java driver

The table and secondary indexes look generally like this Note that I have 
changed the names of many of the columns to be generic since they aren't 
important to the question as far as I know. I left the actual names for those 
columns that I've created SASI indexes for. The query I use to try to create a 
PreparedStatement is:

SELECT sql_id, type, cpe_id, serial, product_class, manufacturer, sw_version 
FROM mytable WHERE serial IN :v0 LIMIT :limit0 ALLOW FILTERING

the schema cql statements are:

CREATE TABLE IF NOT EXISTS mykeyspace.mytable (
  id text,
  sql_id bigint,
  cpe_id text,
  sw_version text,
  hw_version text,
  manufacturer text,
  product_class text,
  manufacturer_oui text,
  description text,
  periodic_inform_interval text,
  restricted_mode_enabled text,
  restricted_mode_reason text,
  type text,
  model_name text,
  serial text,
  mac text,
   text,
  generic0 timestamp,
  household_id text,
  generic1 int,
  generic2 text,
  generic3 text,
  generic4 int,
  generic5 int,
  generic6 text,
  generic7 text,
  generic8 text,
  generic9 text,
  generic10 text,
  generic11 timestamp,
  generic12 text,
  generic13 text,
  generic14 timestamp,
  generic15 text,
  generic16 text,
  generic17 text,
  generic18 text,
  generic19 text,
  generic20 text,
  generic21 text,
  generic22 text,
  generic23 text,
  generic24 text,
  generic25 text,
  generic26 text,
  generic27 text,
  generic28 int,
  generic29 int,
  generic30 text,
  generic31 text,
  generic32 text,
  generic33 text,
  generic34 text,
  generic35 int,
  generic36 int,
  generic37 int,
  generic38 int,
  generic39 text,
  generic40 text,
  generic41 text,
  generic42 text,
  generic43 text,
  generic44 text,
  generic45 text,
  PRIMARY KEY (id)
);

CREATE INDEX IF NOT EXISTS bv_sql_id_idx ON mykeyspace.mytable (sql_id);

CREATE CUSTOM INDEX IF NOT EXISTS bv_serial_idx ON mykeyspace.mytable (serial)
       USING 'org.apache.cassandra.index.sasi.SASIIndex'
       WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'};

CREATE CUSTOM INDEX IF NOT EXISTS bv_cpe_id_idx ON mykeyspace.mytable (cpe_id)
       USING 'org.apache.cassandra.index.sasi.SASIIndex'
       WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'};

CREATE CUSTOM INDEX IF NOT EXISTS bv_mac_idx ON mykeyspace.mytable (mac)
       USING 'org.apache.cassandra.index.sasi.SASIIndex'
       WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class': 
'orgapache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'};

CREATE CUSTOM INDEX IF NOT EXISTS bv_manufacturer_idx ON mykeyspace.mytable 
(manufacturer)
       USING 'org.apache.cassandra.index.sasi.SASIIndex'
       WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'};

CREATE CUSTOM INDEX IF NOT EXISTS bv_manufacturer_oui_idx ON mykeyspace.mytable 
(manufacturer_oui)
       USING 'org.apache.cassandra.index.sasiSASIIndex'
       WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'};

CREATE CUSTOM INDEX IF NOT EXISTS bv_hw_version_idx ON mykeyspace.mytable 
(hw_version)
       USING 'org.apache.cassandra.index.sasi.SASIIndex'
       WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'};

CREATE CUSTOM INDEX IF NOT EXISTS bv_sw_version_idx ON mykeyspace.mytable 
(sw_version)
       USING 'org.apache.cassandra.index.sasi.SASIIndex'
       WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'};

CREATE CUSTOM INDEX IF NOT EXISTS bv_household_id_idx ON mykeyspace.mytable 
(household_id)
       USING 'org.apache.cassandra.index.sasi.SASIIndex'
       WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class': 
'org.apache.cassandra.indexsasi.analyzer.NonTokenizingAnalyzer', 
'case_sensitive': 'false'};


On Tue, Feb 5, 2019 at 3:33 PM Oleksandr Petrov 
<oleksandr.pet...@gmail.com<mailto:oleksandr.pet...@gmail.com>> wrote:
Could you post full table schema (names obfuscated, if required) with index 
creation statements and queries?

On Mon, Feb 4, 2019 at 10:04 AM Jacques-Henri Berthemet 
<jacques-henri.berthe...@genesys.com<mailto:jacques-henri.berthe...@genesys.com>>
 wrote:
I’m not sure why it`s not allowed by the Datastax driver, but maybe you could 
try to use OR instead of IN?
SELECT blah FROM foo WHERE <indexed column> = :val1 OR <indexed column> = :val2 
ALLOW FILTERING

It should be the same as IN query, but I don’t if it makes a difference for 
performance.

From: Peter Heitman <pe...@heitman.us<mailto:pe...@heitman.us>>
Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" 
<user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Date: Monday 4 February 2019 at 07:17
To: "user@cassandra.apache.org<mailto:u...@cassandraapache.org>" 
<user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Subject: SASI queries- cqlsh vs java driver

When I create a SASI index on a secondary column, from cqlsh I can execute a 
query

SELECT blah FROM foo WHERE <indexed column> IN ('mytext') ALLOW FILTERING;

but not from the java driver:

SELECT blah FROM foo WHERE <indexed column> IN :val ALLOW FILTERING

Here I get an exception

com.datastax.driver.coreexceptions.InvalidQueryException: IN predicates on 
non-primary-key columns (<indexed column>) is not yet supported
at 
com.datastax.driver.coreexceptions.InvalidQueryException.copy(InvalidQueryException.java:49)
 ~[cassandra-driver-core-3.6.0.jar:na]

Why are they different? Is there anything I can do with the java driver to get 
past this exception?

Peter




--
alex p

________________________________

The information in this Internet Email is confidential and may be legally 
privileged. It is intended solely for the addressee. Access to this Email by 
anyone else is unauthorized. If you are not the intended recipient, any 
disclosure, copying, distribution or any action taken or omitted to be taken in 
reliance on it, is prohibited and may be unlawful. When addressed to our 
clients any opinions or advice contained in this Email are subject to the terms 
and conditions expressed in any applicable governing The Home Depot terms of 
business or client engagement letter. The Home Depot disclaims all 
responsibility and liability for the accuracy and content of this attachment 
and for any damages or losses arising from any inaccuracies, errors, viruses, 
e.g., worms, trojan horses, etc., or other items of a destructive nature, which 
may be contained in this attachment and shall not be liable for direct, 
indirect, consequential or special damages in connection with this e-mail 
message or its attachment.

Reply via email to