Generally speaking, you don't need to. I almost never do. I've only set it in situations where I've had a large number of tables and wanted to avoid a lot of flushing when commit log segments are removed.
Setting it to 128 milliseconds means it's flushing 8 times per second, which gives no benefit and only hurts things, as you've discovered.

On Thu, Mar 15, 2018 at 10:15 AM Affan Syed <as...@an10.io> wrote:

> No, it did solve the problem, as Faraz mentioned, but I am still not sure
> about what's the underlying cause. Is 0 ms really correct? How do we set a
> flush period?
>
> - Affan
>
> On Thu, Mar 15, 2018 at 10:00 PM, Jon Haddad <j...@jonhaddad.com> wrote:
>
>> TWCS does SizeTieredCompaction within the window, so it's not likely to
>> make a difference. I'm +1'ing what Jeff said: a 128 ms
>> memtable_flush_period_in_ms is almost certainly your problem, unless
>> you've changed other settings and haven't told us about them.
>>
>> On Mar 15, 2018, at 9:54 AM, Affan Syed <as...@an10.io> wrote:
>>
>> Jeff,
>>
>> I think additionally the reason might also be that the keyspace was
>> using TimeWindowCompactionStrategy with a 1-day bucket; however, the
>> writes were quite rapid and no automatic compaction was happening.
>>
>> I would think changing the strategy to SizeTiered would also solve this
>> problem?
>>
>> - Affan
>>
>> On Thu, Mar 15, 2018 at 12:11 AM, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>>> The problem was likely more with the fact that it can't flush in
>>> 128 ms, so you back up on flush.
>>>
>>> --
>>> Jeff Jirsa
>>>
>>> On Mar 14, 2018, at 12:07 PM, Faraz Mateen <fmat...@an10.io> wrote:
>>>
>>> I was able to overcome the timeout error by setting
>>> memtable_flush_period_in_ms to 0 for all my tables. Initially it was
>>> set to 128.
>>> Now I am able to write ~40000 records/min in Cassandra, and the script
>>> has been running for around 12 hours now.
>>>
>>> However, I am still curious why Cassandra was unable to hold data in
>>> memory for 128 ms, considering that I have 30 GB of RAM on each node.
>>>
>>> On Wed, Mar 14, 2018 at 2:24 PM, Faraz Mateen <fmat...@an10.io> wrote:
>>>
>>>> Thanks for the response.
>>>>
>>>> Here is the output of "DESCRIBE" on my table:
>>>>
>>>> https://gist.github.com/farazmateen/1c88f6ae4fb0b9f1619a2a1b28ae58c4
>>>>
>>>> I am getting two errors from the python script that I mentioned above.
>>>> The first one does not show any error or exception in the server logs.
>>>> The second error:
>>>>
>>>> "cassandra.OperationTimedOut: errors={'10.128.1.1': 'Client request
>>>> timeout. See Session.execute[_async](timeout)'}, last_host=10.128.1.1"
>>>>
>>>> shows a Java heap exception in the server logs. You can look at the
>>>> exception here:
>>>>
>>>> https://gist.githubusercontent.com/farazmateen/e7aa5749f963ad2293f8be0ca1ccdc22/raw/e3fd274af32c20eb9f534849a31734dcd33745b4/JVM-HEAP-EXCEPTION.txt
>>>>
>>>> My python code snippet can be viewed at the following link:
>>>> https://gist.github.com/farazmateen/02be8bb59cdb205d6a35e8e3f93e27d5
>>>>
>>>> Here are the timeout-related arguments from
>>>> /etc/cassandra/cassandra.yaml:
>>>>
>>>> read_request_timeout_in_ms: 5000
>>>> range_request_timeout_in_ms: 10000
>>>> write_request_timeout_in_ms: 10000
>>>> counter_write_request_timeout_in_ms: 5000
>>>> cas_contention_timeout_in_ms: 1000
>>>> truncate_request_timeout_in_ms: 60000
>>>> request_timeout_in_ms: 10000
>>>> cross_node_timeout: false
>>>>
>>>> On Wed, Mar 14, 2018 at 4:22 AM, Bruce Tietjen <
>>>> bruce.tiet...@imatsolutions.com> wrote:
>>>>
>>>>> The following won't address any server performance issues, but it
>>>>> will allow your application to continue to run even if there are
>>>>> client or server timeouts:
>>>>>
>>>>> Your python code should wrap all Cassandra statement execution calls
>>>>> in a try/except block to catch any errors and handle them
>>>>> appropriately. For timeouts, you might consider re-trying the
>>>>> statement.
>>>>>
>>>>> You may also want to consider proactively setting your client and/or
>>>>> server timeouts so your application sees fewer failures.
>>>>>
>>>>> Any production code should include proper error handling, and during
>>>>> initial development and testing it may be helpful to allow your
>>>>> application to continue running so you get a better idea of if or
>>>>> when different timeouts occur.
>>>>>
>>>>> See:
>>>>> cassandra.Timeout
>>>>> cassandra.WriteTimeout
>>>>> cassandra.ReadTimeout
>>>>>
>>>>> Also:
>>>>> https://datastax.github.io/python-driver/api/cassandra.html
>>>>>
>>>>> On Tue, Mar 13, 2018 at 5:17 PM, Goutham reddy <
>>>>> goutham.chiru...@gmail.com> wrote:
>>>>>
>>>>>> Faraz,
>>>>>> Can you share your code snippet showing how you are trying to save
>>>>>> the entity objects into Cassandra?
>>>>>>
>>>>>> Thanks and Regards,
>>>>>> Goutham Reddy Aenugu.
>>>>>>
>>>>>> On Tue, Mar 13, 2018 at 3:42 PM, Faraz Mateen <fmat...@an10.io>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> I seem to have hit a problem in which writing to Cassandra through
>>>>>>> a python script fails and occasionally causes a Cassandra node to
>>>>>>> crash. Here are the details of my problem.
>>>>>>>
>>>>>>> I have a python-based streaming application that reads data from
>>>>>>> Kafka at a high rate and pushes it to Cassandra through DataStax's
>>>>>>> Cassandra driver for python. My Cassandra setup consists of 3 nodes
>>>>>>> with a replication factor of 2. The problem is that my python
>>>>>>> application crashes after writing ~12000 records with the following
>>>>>>> error:
>>>>>>>
>>>>>>> Exception: Error from server: code=1100 [Coordinator node timed out
>>>>>>> waiting for replica nodes' responses] message="Operation timed out -
>>>>>>> received only 0 responses."
>>>>>>> info={'received_responses': 0, 'consistency': 'LOCAL_ONE',
>>>>>>> 'required_responses': 1}
>>>>>>>
>>>>>>> Sometimes the python application crashes with this traceback:
>>>>>>>
>>>>>>> cassandra.OperationTimedOut: errors={'10.128.1.1': 'Client request
>>>>>>> timeout. See Session.execute[_async](timeout)'}, last_host=10.128.1.1
>>>>>>>
>>>>>>> With the error above, one of the Cassandra nodes crashes as well.
>>>>>>> When I look at the Cassandra system logs
>>>>>>> (/var/log/cassandra/system.log), I see the following exception:
>>>>>>>
>>>>>>> https://gist.github.com/farazmateen/e7aa5749f963ad2293f8be0ca1ccdc22/e3fd274af32c20eb9f534849a31734dcd33745b4
>>>>>>>
>>>>>>> Following the suggestion in the post linked below, I have set my
>>>>>>> JVM heap size to 8 GB, but the problem still persists:
>>>>>>> https://dzone.com/articles/diagnosing-and-fixing-cassandra-timeouts
>>>>>>>
>>>>>>> Cluster:
>>>>>>>
>>>>>>> - Cassandra version 3.9
>>>>>>> - 3 nodes, with 8 cores and 30 GB of RAM each.
>>>>>>> - Keyspace has a replication factor of 2.
>>>>>>> - Write consistency is LOCAL_ONE.
>>>>>>> - MAX_HEAP_SIZE is set to 8 GB.
>>>>>>>
>>>>>>> Any help will be greatly appreciated.
>>>>>>>
>>>>>>> --
>>>>>>> Faraz
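[Editor's note] Bruce's try/except-and-retry advice above can be sketched as a small helper. This is a minimal sketch, not code from the thread: the helper name `execute_with_retry`, the retry count, and the backoff are illustrative. With the real DataStax driver you would pass `session.execute` as the callable and catch `cassandra.OperationTimedOut` (the exception Faraz's traceback shows, which also points at the per-call `Session.execute(..., timeout=...)` parameter); here the callable is generic so the sketch runs without a live cluster.

```python
import time


def execute_with_retry(execute, *args, retries=3, backoff_s=0.5,
                       retryable=(Exception,), **kwargs):
    """Call execute(*args, **kwargs), retrying on retryable exceptions.

    With the DataStax Python driver, `execute` would be session.execute
    and `retryable` something like (cassandra.OperationTimedOut,
    cassandra.WriteTimeout); any kwargs (e.g. timeout=...) pass through.
    """
    attempt = 0
    while True:
        try:
            return execute(*args, **kwargs)
        except retryable:
            attempt += 1
            if attempt > retries:
                raise  # exhausted retries; surface the error to the caller
            time.sleep(backoff_s * attempt)  # simple linear backoff


class FlakyExecutor:
    """Stand-in for session.execute: times out N times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self, statement, timeout=10.0):
        self.calls += 1
        if self.calls <= self.failures:
            raise TimeoutError(f"simulated timeout for {statement!r}")
        return "ok"


flaky = FlakyExecutor(failures=2)
result = execute_with_retry(flaky, "INSERT ...", timeout=2.0,
                            retryable=(TimeoutError,), backoff_s=0.01)
print(result, flaky.calls)  # → ok 3
```

Retrying this way only hides the symptom, of course; as the rest of the thread concludes, the underlying fix was removing the 128 ms memtable_flush_period_in_ms setting.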