Well, we are able to do the tracing under normal load, but we are not yet able
to turn on tracing on demand from the client side during heavy load (the
traffic pattern is hard to predict).

Under normal load we saw that most of the query time (for the one particular
row we focus on) is spent between
"Merging data from memtables and (2-3) sstables" and
"Read 10xx live cells and 2x tombstone cells".

Our CQL basically pulls out one row that has about 1000 columns (approx. 800 KB
of data). This table is already on leveled compaction.

But once we get a series of the exact same CQL (against the same row), the
response time starts to degrade dramatically, from the normal 300-500 ms to
something like 1 second or even 4 seconds.
The rest of the system seems to remain fine; there is no obvious latency spike
in reads or writes within the same keyspace or in other keyspaces.

So I wonder: what is causing the sudden increase in latency for the exact same
CQL? What are we saturating? If we had saturated disk I/O, other tables would
see a similar effect, but we didn't see that.
Is there any table-specific factor that may contribute to the slowness?
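
(To clarify what we mean by on-demand tracing: something like the sketch below
is what we have in mind, run from the client against just that one statement
when the slowdown shows up. It is only a sketch using the DataStax Java driver;
the class and method names are ours.)

import com.datastax.driver.core.QueryTrace;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class TraceOneQuery {
    // Enable tracing for a single statement and print the server-side trace events.
    public static void traceOnce(Session session, String cql) {
        Statement stmt = new SimpleStatement(cql).enableTracing();
        ResultSet rs = session.execute(stmt);
        QueryTrace trace = rs.getExecutionInfo().getQueryTrace();
        System.out.printf("trace %s: %d us total%n",
                trace.getTraceId(), trace.getDurationMicros());
        for (QueryTrace.Event e : trace.getEvents()) {
            System.out.printf("%8d us  %s (%s)%n",
                    e.getSourceElapsedMicros(), e.getDescription(), e.getSource());
        }
    }
}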

thanks

On Mon, Nov 10, 2014 at 7:21 AM, DuyHai Doan <doanduy...@gmail.com> wrote:

> As Jonathan said, it's better to activate query tracing client side. It'll
> give you better flexibility over when to turn tracing on & off and on which
> table. Server-side tracing is global (all tables) and probabilistic, thus it
> may not give a satisfactory level of debugging.
>
>  Programmatically it's pretty simple to achieve, and coupled with a good
> logging framework (Logback for Java) you'll even have dynamic logging in
> production without having to redeploy client code. I have implemented it in
> Achilles very easily by wrapping the Regular/Bound/Simple statements of the
> Java driver and displaying the bound values at runtime:
> https://github.com/doanduyhai/Achilles/wiki/Statements-Logging-and-Tracing#dynamic-statements-logging
>
> On Mon, Nov 10, 2014 at 3:52 PM, Johnny Miller <johnny.p.mil...@gmail.com>
> wrote:
>
>> Be cautious enabling query tracing. Great tool for dev/testing/diagnosing
>> etc. - but it does persist data to the system_traces keyspace with a TTL
>> of 24 hours and will, as a consequence, consume resources.
>>
>> http://www.datastax.com/dev/blog/advanced-request-tracing-in-cassandra-1-2
>>
>>
>> On 7 Nov 2014, at 20:20, Jonathan Haddad <j...@jonhaddad.com> wrote:
>>
>> Personally I've found that using query timing + log aggregation on the
>> client side is more effective than trying to mess with tracing probability
>> in order to find a single query which has recently become a problem.  I
>> recommend wrapping your session with something that can automatically log
>> the statement on a slow query, then use tracing to identify exactly what
>> happened.  This way finding your problem is not a matter of chance.
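>>
>> Roughly something along the lines of the sketch below (just a sketch, not a
>> finished implementation; the threshold, class name and logging are
>> placeholders):
>>
>> import com.datastax.driver.core.ResultSet;
>> import com.datastax.driver.core.Session;
>> import com.datastax.driver.core.Statement;
>>
>> public class TimedSession {
>>     private static final long SLOW_MS = 500; // whatever counts as "slow" for you
>>     private final Session session;
>>
>>     public TimedSession(Session session) { this.session = session; }
>>
>>     public ResultSet execute(Statement stmt) {
>>         long start = System.nanoTime();
>>         ResultSet rs = session.execute(stmt);
>>         long ms = (System.nanoTime() - start) / 1_000_000;
>>         if (ms > SLOW_MS) {
>>             // Log the offending statement; it can then be re-run with
>>             // stmt.enableTracing() to see exactly where the time goes.
>>             System.err.printf("slow query (%d ms): %s%n", ms, stmt);
>>         }
>>         return rs;
>>     }
>> }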
>>
>>
>>
>> On Fri Nov 07 2014 at 9:41:38 AM Chris Lohfink <clohfin...@gmail.com>
>> wrote:
>>
>>> It saves a lot of information for each request that's traced, so there is
>>> significant overhead.  If you start at a low probability and move it up
>>> based on the load impact, it will provide a lot of insight and you can
>>> control the cost.
>>>
>>> ---
>>> Chris Lohfink
>>>
>>> On Fri, Nov 7, 2014 at 11:35 AM, Jimmy Lin <y2klyf+w...@gmail.com>
>>> wrote:
>>>
>>>> Is there any significant performance penalty if one turns on Cassandra
>>>> query tracing through the DataStax Java driver (say, for every request of
>>>> some troublesome query)?
>>>>
>>>> More sampling seems better, but might doing so also slow down the system
>>>> in some other ways?
>>>>
>>>> thanks
>>>>
>>>>
>>>>
>>>
>>
>
