The same issue is appearing in CQL Shell as well.

1) Entered cqlsh.
2) SET CONSISTENCY QUORUM;
3) Ran a select * with the partition key in the where clause.

The first run gave 0 records, and the next runs gave the results. It's really freaking us out at the moment. And nothing in debug.log or system.log.
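For what it's worth, the same back-to-back check can be scripted against the cluster with the Java driver. A minimal sketch, not our production code: the contact point, keyspace, and bind value are placeholders, and the WHERE clause is simplified from the count query quoted at the bottom of this thread.

    import com.datastax.driver.core.{Cluster, ConsistencyLevel, SimpleStatement}

    object QuorumReadCheck extends App {
      // Placeholder contact point and keyspace.
      val cluster = Cluster.builder().addContactPoint("10.0.0.1").build()
      val session = cluster.connect("my_keyspace")

      // Same query, same bind value, same consistency level, run twice in a row.
      val stmt = new SimpleStatement(
        "SELECT count(*) FROM table_name WHERE partition_column = ?", "some_key")
      stmt.setConsistencyLevel(ConsistencyLevel.QUORUM)

      val first = session.execute(stmt).one().getLong(0)
      val second = session.execute(stmt).one().getLong(0)
      // first = 0 while second is non-zero reproduces what cqlsh shows.
      println(s"first=$first, second=$second")

      cluster.close()
    }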
Thank You,
Regards,
Srini

On Fri, Mar 17, 2017 at 2:33 AM, daemeon reiydelle <daeme...@gmail.com> wrote:

> The prep is needed. If I recall correctly it must remain in cache for the
> query to complete. I don't have the docs to dig out the yaml param to
> adjust the query cache. I had run into the problem stress testing a
> smallish cluster with many queries at once.
>
> Do you have a sense of how many distinct queries are hitting the cluster
> at peak?
>
> If many clients, how do you balance the connection load, or do you always
> hit the same node?
>
> sent from my mobile
> Daemeon Reiydelle
> skype daemeon.c.m.reiydelle
> USA 415.501.0198
>
> On Mar 16, 2017 3:25 PM, "srinivasarao daruna" <sree.srin...@gmail.com> wrote:
>
>> Hi reiydelle,
>>
>> I cannot confirm the range, as the volume of data is huge and the query
>> frequency is also high.
>> If the cache is the cause of the issue, can we increase the cache size,
>> or is there a solution to avoid dropped prepared statements?
>>
>> Thank You,
>> Regards,
>> Srini
>>
>> On Thu, Mar 16, 2017 at 2:13 PM, daemeon reiydelle <daeme...@gmail.com> wrote:
>>
>>> The discard due to OOM is causing the zero returned. I would guess a
>>> cache miss problem of some sort, but not sure. Are you using row, index,
>>> etc. caches? Are you seeing the failed prep statement on random nodes (duh,
>>> nodes that have the relevant data ranges)?
>>>
>>> .......
>>> Daemeon C.M. Reiydelle
>>> USA (+1) 415.501.0198
>>> London (+44) (0) 20 8144 9872
>>>
>>> On Thu, Mar 16, 2017 at 10:56 AM, Ryan Svihla <r...@foundev.pro> wrote:
>>>
>>>> Depends actually, restore just restores what's there, so if only one
>>>> node had a copy of the data then only one node had a copy of the data,
>>>> meaning quorum will still be wrong sometimes.
>>>>
>>>> On Thu, Mar 16, 2017 at 1:53 PM, Arvydas Jonusonis <arvydas.jonuso...@gmail.com> wrote:
>>>>
>>>>> If the data was written at ONE, consistency is not guaranteed... but
>>>>> considering you just restored the cluster, there's a good chance
>>>>> something else is off.
>>>>>
>>>>> On Thu, Mar 16, 2017 at 18:19, srinivasarao daruna <sree.srin...@gmail.com> wrote:
>>>>>
>>>>>> Want to make read and write QUORUM as well.
>>>>>>
>>>>>> On Mar 16, 2017 1:09 PM, "Ryan Svihla" <r...@foundev.pro> wrote:
>>>>>>
>>>>>> Replication factor is 3, and write consistency is ONE and
>>>>>> read consistency is QUORUM.
>>>>>>
>>>>>> That combination is not gonna work well:
>>>>>>
>>>>>> Write succeeds on node A but fails on nodes B and C.
>>>>>> Read goes to nodes B and C.
>>>>>>
>>>>>> If you can tolerate some temporary inaccuracy you can use QUORUM, but
>>>>>> you may still have the situation where:
>>>>>>
>>>>>> Write succeeds on node A at timestamp 1, B succeeds at timestamp 2.
>>>>>> Read succeeds on nodes B and C at timestamp 1.
>>>>>>
>>>>>> If you need fully race-condition-free counts, I'm afraid you need to
>>>>>> use SERIAL or LOCAL_SERIAL (for in-DC-only accuracy).
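>>>>>> To put numbers on it: with RF=3 a quorum is 2 nodes, and 2 + 2 > 3, so
>>>>>> once both sides use QUORUM every read overlaps every write on at least
>>>>>> one replica. A minimal sketch of setting it on both statements with the
>>>>>> 3.x driver; table and column names here are made up:
>>>>>>
>>>>>> import com.datastax.driver.core.{ConsistencyLevel, Session, SimpleStatement}
>>>>>>
>>>>>> // With RF = 3, a QUORUM write (2 replicas) and a QUORUM read
>>>>>> // (2 replicas) always share at least one replica: 2 + 2 > 3.
>>>>>> def writeAndReadAtQuorum(session: Session): Long = {
>>>>>>   val write = new SimpleStatement(
>>>>>>     "INSERT INTO table_name (partition_column, some_column) VALUES (?, ?)",
>>>>>>     "some_key", "some_value")
>>>>>>   write.setConsistencyLevel(ConsistencyLevel.QUORUM)
>>>>>>   session.execute(write)
>>>>>>
>>>>>>   val read = new SimpleStatement(
>>>>>>     "SELECT count(*) FROM table_name WHERE partition_column = ?", "some_key")
>>>>>>   read.setConsistencyLevel(ConsistencyLevel.QUORUM)
>>>>>>   session.execute(read).one().getLong(0)
>>>>>> }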
>>>>>> On Thu, Mar 16, 2017 at 1:04 PM, srinivasarao daruna <sree.srin...@gmail.com> wrote:
>>>>>>
>>>>>> Replication strategy is SimpleStrategy.
>>>>>>
>>>>>> Snitch is: EC2 snitch, as we deployed the cluster on EC2 instances.
>>>>>>
>>>>>> I was worried that CL=ALL would have more read latency and read
>>>>>> failures, but I won't rule out trying it.
>>>>>>
>>>>>> Should I switch select count(*) to selecting the partition_key
>>>>>> column? Would that be of any help?
>>>>>>
>>>>>> Thank you
>>>>>> Regards
>>>>>> Srini
>>>>>>
>>>>>> On Mar 16, 2017 12:46 PM, "Arvydas Jonusonis" <arvydas.jonuso...@gmail.com> wrote:
>>>>>>
>>>>>> What are your replication strategy and snitch settings?
>>>>>>
>>>>>> Have you tried doing a read at CL=ALL? If it's an actual
>>>>>> inconsistency issue (missing data), this should cause the correct
>>>>>> results to be returned. You'll need to run a repair to fix the
>>>>>> inconsistencies.
>>>>>>
>>>>>> If all the data is actually there, you might have one or several
>>>>>> nodes that aren't identifying the correct replicas.
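>>>>>> If it's easier to run that check through the API than through cqlsh,
>>>>>> a rough sketch of the CL=ALL read with the 3.x driver; the query is
>>>>>> simplified from yours and the bind value is a placeholder:
>>>>>>
>>>>>> import com.datastax.driver.core.{ConsistencyLevel, Session, SimpleStatement}
>>>>>>
>>>>>> // Diagnostic read at ALL: every replica must answer, so replica
>>>>>> // disagreement surfaces as either the correct, repaired result or
>>>>>> // an error if a replica cannot respond.
>>>>>> def countAtAll(session: Session): Long = {
>>>>>>   val stmt = new SimpleStatement(
>>>>>>     "SELECT count(*) FROM table_name WHERE partition_column = ?", "some_key")
>>>>>>   stmt.setConsistencyLevel(ConsistencyLevel.ALL)
>>>>>>   session.execute(stmt).one().getLong(0)
>>>>>> }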
>>>>>> Arvydas
>>>>>>
>>>>>> On Thu, Mar 16, 2017 at 5:31 PM, srinivasarao daruna <sree.srin...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Team,
>>>>>>
>>>>>> We are struggling with a problem related to Cassandra counts after a
>>>>>> backup and restore of the cluster. Aaron Morton suggested sending this
>>>>>> to the user list, so someone on the list may be able to help me.
>>>>>>
>>>>>> We have a REST API to talk to Cassandra, and one of our queries,
>>>>>> which fetches a count, is creating problems for us.
>>>>>>
>>>>>> We did a backup and restore and copied all the data to a new
>>>>>> cluster. We ran nodetool refresh on the tables and did a nodetool
>>>>>> repair as well.
>>>>>>
>>>>>> However, one of our key API calls is returning inconsistent results.
>>>>>> The result count is 0 in the first call, and later calls give the
>>>>>> actual values. The query frequency is a bit high, and the failure rate
>>>>>> has also risen considerably.
>>>>>>
>>>>>> 1) The count query has partition keys in it. We didn't see any read
>>>>>> timeouts or any errors in the API logs.
>>>>>>
>>>>>> 2) This is how our code for creating the session looks
>>>>>> (clusterBuilder is a Cluster.Builder configured with our contact
>>>>>> points elsewhere):
>>>>>>
>>>>>> val poolingOptions = new PoolingOptions
>>>>>> poolingOptions
>>>>>>   .setCoreConnectionsPerHost(HostDistance.LOCAL, 4)
>>>>>>   .setMaxConnectionsPerHost(HostDistance.LOCAL, 10)
>>>>>>   .setCoreConnectionsPerHost(HostDistance.REMOTE, 4)
>>>>>>   .setMaxConnectionsPerHost(HostDistance.REMOTE, 10)
>>>>>>
>>>>>> val builtCluster = clusterBuilder.withCredentials(username, password)
>>>>>>   .withPoolingOptions(poolingOptions)
>>>>>>   .build()
>>>>>> val cassandraSession = builtCluster.connect()
>>>>>>
>>>>>> val preparedStatement = cassandraSession.prepare(statement)
>>>>>>   .setConsistencyLevel(ConsistencyLevel.QUORUM)
>>>>>> cassandraSession.execute(preparedStatement.bind(args: _*))
>>>>>>
>>>>>> Query: SELECT count(*) FROM table_name WHERE partition_column=? AND
>>>>>> text_column_of_clustering_key=? AND date_column_of_clustering_key<=?
>>>>>> AND date_column_of_clustering_key>=?
>>>>>>
>>>>>> 3) Cluster configuration:
>>>>>>
>>>>>> 6 machines, 3 of them seeds; we are using Apache Cassandra 3.9. Each
>>>>>> machine is equipped with 16 cores and 64 GB of RAM.
>>>>>>
>>>>>> Replication factor is 3, write consistency is ONE, and read
>>>>>> consistency is QUORUM.
>>>>>>
>>>>>> 4) Cassandra has never been down on any machine.
>>>>>>
>>>>>> 5) We are using the cassandra-driver-core artifact, version 3.1.1, in
>>>>>> the API.
>>>>>>
>>>>>> 6) nodetool tpstats shows no read failures, and no other failures.
>>>>>>
>>>>>> 7) We do not see any other issues in Cassandra's system.log. We just
>>>>>> see a few warnings, as below:
>>>>>>
>>>>>> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB
>>>>>> WARN [ScheduledTasks:1] 2017-03-14 14:58:37,141 QueryProcessor.java:103 - 88 prepared statements discarded in the last minute because cache limit reached (32 MB)
>>>>>>
>>>>>> The first API call returns 0, and later API calls give the right values.
>>>>>>
>>>>>> Please let me know if any other details are needed.
>>>>>> Could you please have a look at this issue and kindly give me your
>>>>>> inputs? This issue has really shaken our business team's confidence in
>>>>>> Cassandra.
>>>>>>
>>>>>> Your inputs will be really helpful.
>>>>>>
>>>>>> Thank You,
>>>>>> Regards,
>>>>>> Srini
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>> Ryan Svihla
>>>>
>>>> --
>>>> Thanks,
>>>> Ryan Svihla
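On the "prepared statements discarded" warning quoted above: I believe newer 3.x releases expose a prepared_statements_cache_size_mb setting in cassandra.yaml (worth checking whether 3.9 has it; otherwise the limit is derived from the heap size, which would explain the 32 MB figure). On the client side, one common cause of cache churn is preparing the same query repeatedly, or baking literal values into the query string so every request prepares a new distinct statement. A minimal sketch of a prepare-once pattern, assuming the 3.x driver used in this thread; StatementCache and countQuery are illustrative names, not anything from the original code:

    import com.datastax.driver.core.{ConsistencyLevel, PreparedStatement, Session}
    import scala.collection.concurrent.TrieMap

    // Prepare each distinct CQL string once per Session and reuse the
    // PreparedStatement for every request, instead of re-preparing per call.
    object StatementCache {
      private val cache = TrieMap.empty[String, PreparedStatement]

      def get(session: Session, cql: String): PreparedStatement =
        cache.getOrElseUpdate(cql,
          session.prepare(cql).setConsistencyLevel(ConsistencyLevel.QUORUM))
    }

    // Usage per request: bind and execute only, no prepare.
    // val ps = StatementCache.get(cassandraSession, countQuery)
    // cassandraSession.execute(ps.bind(args: _*))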