On 2017-06-15 19:10 (-0700), srinivasarao daruna <sree.srin...@gmail.com> 
wrote: 
> Hi,
> 
> Recently one of our Spark jobs was missing the Cassandra consistency property
> and the number-of-concurrent-writes property.

Just for the record, you still have a consistency level set; it's just set to 
whatever your driver/Spark defaults to (probably LOCAL_ONE). This probably 
means the job is firing writes faster than you'd expect (no backpressure), 
which may have contributed to your problems.


> 
> Due to that, some mutations failed when we checked tpstats. We also
> observed read timeouts, not only on the table that the job inserts into
> but also on other tables, which have always had a proper consistency
> level. We started a repair, but due to the volume of data, the repair
> might take a day or two to complete. Meanwhile, we wanted to get some input.
> 
> The error raised a lot of questions.
> 1) Is there a relation between mutation failures, read timeouts, and overall
> cluster performance? If yes, how?
> 

When the cluster is heavily loaded, you'll see both dropped mutations and read 
timeouts, yes. 

It's also true that reads can impact writes, and writes can impact reads, 
especially since both run in one shared JVM process with a common heap and 
garbage collector.

> 2) When I checked the log, I found the warning below in debug.log.
> SELECT * FROM our_table WHERE partition_key = required_value LIMIT 5000:
> total time 20353 msec - timeout 20000 msec
> 
> Actual query:
> SELECT * FROM our_table WHERE partition_key = required_value
> 
> Even though we are querying by partition key, I do not understand the reason
> for such a long read time and the timeouts.

Likely related to JVM GC pauses. How big is that partition (nodetool cfstats 
may help here)? Are you seeing a lot of other GC pauses going on (you should 
have monitoring, or at least glance at the log for 'GCInspector' lines)? 
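
If you don't have monitoring yet, here is a rough sketch of the "glance at the log" approach: it pulls the pause length out of GCInspector lines and keeps the long ones. It assumes the message contains text of the form "<collector> GC in <N>ms", which is what I'd expect to see but can vary by Cassandra version; the class name, method name, and the 1000 ms threshold are made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcPauseScan {
    // Assumed message shape: "... GCInspector.java:NNN - ParNew GC in 431ms. ..."
    private static final Pattern GC_LINE =
            Pattern.compile("GCInspector.*? GC in (\\d+)ms");

    /** Returns the pause (in ms) of every GCInspector line at or above thresholdMs. */
    public static List<Long> longPauses(List<String> lines, long thresholdMs) {
        List<Long> pauses = new ArrayList<>();
        for (String line : lines) {
            Matcher m = GC_LINE.matcher(line);
            if (m.find()) {
                long ms = Long.parseLong(m.group(1));
                if (ms >= thresholdMs) {
                    pauses.add(ms);
                }
            }
        }
        return pauses;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java GcPauseScan <path-to-system.log>");
            return;
        }
        // Hypothetical threshold: report pauses of one second or longer.
        for (long ms : longPauses(Files.readAllLines(Paths.get(args[0])), 1000)) {
            System.out.println("GC pause: " + ms + " ms");
        }
    }
}
```

If this reports multi-second pauses around the time of the read timeouts, GC is a likely suspect.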

> 
> 3) We are using prepared statements to query the tables from the API. How can
> we set the fetch size so that it won't use LIMIT 5000?
> Any thoughts?
> 
> 

Driver dependent, but most drivers let you set the fetch size per statement, 
including on prepared (bound) statements. The DataStax Java driver also lets 
you set it globally when building the Cluster: 
Cluster.builder().withQueryOptions(new QueryOptions().setFetchSize(100))
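
A minimal sketch of both options, assuming the 3.x DataStax Java driver (com.datastax.driver.core) is on the classpath. The contact point, the keyspace name "our_keyspace", and the values 100 and 500 are placeholders, not recommendations. The LIMIT 5000 in your log most likely matches that driver's default fetch size of 5000.

```java
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class FetchSizeExample {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1") // placeholder contact point
                // Global default page size for every statement on this Cluster.
                .withQueryOptions(new QueryOptions().setFetchSize(100))
                .build();
             Session session = cluster.connect("our_keyspace")) { // hypothetical keyspace

            PreparedStatement prepared = session.prepare(
                    "SELECT * FROM our_table WHERE partition_key = ?");

            BoundStatement bound = prepared.bind("required_value");
            // Per-statement override of the global fetch size.
            bound.setFetchSize(500);

            ResultSet rows = session.execute(bound);
            for (Row row : rows) {
                // Iterating fetches further pages transparently, one page at a time.
                System.out.println(row);
            }
        }
    }
}
```

A smaller fetch size means more round trips but less data held per page on the coordinator, which can help when large partitions and GC pressure are involved.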