Hi Andy

Thanks for the response.

I have two DCs of 5 nodes each. I'm doing writes against DC2 with
LOCAL_QUORUM consistency. nodetool confirmed that all the disk space is
consumed on DC2, and I did not set any replication for DC1. That part is
the expected behavior.
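
For reference, the keyspace is defined along these lines (a minimal sketch
assuming NetworkTopologyStrategy; the keyspace name and the RF of 3 are
placeholders, the point being that only DC2 appears in the replication map):

const cassandra = require('cassandra-driver');

const client = new cassandra.Client({ contactPoints: ['dc2-node3'] });

// Only DC2 is listed, so DC1 should hold no replicas of this keyspace.
// 'my_keyspace' and the RF of 3 are placeholders.
const createKeyspace = `
  CREATE KEYSPACE IF NOT EXISTS my_keyspace
  WITH replication = { 'class': 'NetworkTopologyStrategy', 'DC2': 3 }`;

client.execute(createKeyspace, function (err) {
  if (err) console.error(err);
});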

Regarding node.js behavior, I validated that Node itself is not responsible
for the delay, since the application is deployed on a single server and the
latencies there are high (25%-40% higher).

I turned on tracing and found that there is some communication going on
across the DCs, which introduces this delay.
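
For reference, I captured the trace roughly like this, following the
driver's tracing example (a minimal sketch; the query and parameters are
placeholders):

const cassandra = require('cassandra-driver');

const client = new cassandra.Client({ contactPoints: ['dc2-node3'] });

// traceQuery asks Cassandra to record a session trace for this request.
const query = 'SELECT * FROM my_keyspace.txn_game WHERE id = ?'; // placeholder
client.execute(query, ['some-id'], { traceQuery: true }, function (err, result) {
  if (err) return console.error(err);
  // Retrieve the trace events recorded on the server side.
  client.metadata.getTrace(result.info.traceId, function (err, trace) {
    if (err) return console.error(err);
    trace.events.forEach(function (event) {
      console.log(event.activity, event.source, event.elapsed, event.thread);
    });
  });
});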

Here are the trace events for one such session:

activity | source | source_elapsed | thread
READ message received from /DC2-node3 | DC1-node2 | 5 | MessagingService-Incoming-/DC2-node3
Executing single-partition query on roles | DC1-node2 | 179 | ReadStage-2
Acquiring sstable references | DC1-node2 | 212 | ReadStage-2
Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | DC1-node2 | 247 | ReadStage-2
Key cache hit for sstable 4 | DC1-node2 | 273 | ReadStage-2
Key cache hit for sstable 6 | DC1-node2 | 301 | ReadStage-2
Merged data from memtables and 2 sstables | DC1-node2 | 352 | ReadStage-2
Read 1 live rows and 0 tombstone cells | DC1-node2 | 384 | ReadStage-2
Read 1 live rows and 0 tombstone cells | DC1-node2 | 411 | ReadStage-2
Enqueuing response to /DC2-node3 | DC1-node2 | 428 | ReadStage-2
Sending REQUEST_RESPONSE message to /DC2-node3 | DC1-node2 | 475 | MessagingService-Outgoing-/DC2-node3-Small
REQUEST_RESPONSE message received from /DC1-node2 | DC2-node3 | 119313 | MessagingService-Incoming-/DC1-node2
Processing response from /DC1-node2 | DC2-node3 | 119367 | RequestResponseStage-3
reading digest from /DC2-node4 | DC2-node3 | 119489 | Native-Transport-Requests-8
reading digest from /DC2-node1 | DC2-node3 | 119518 | Native-Transport-Requests-8
reading digest from /DC1-node2 | DC2-node3 | 119538 | Native-Transport-Requests-8
Sending READ message to /DC2-node4 | DC2-node3 | 119536 | MessagingService-Outgoing-/DC2-node4-Small
Executing single-partition query on roles | DC2-node3 | 119554 | ReadStage-5
Sending READ message to /DC2-node1 | DC2-node3 | 119562 | MessagingService-Outgoing-/DC2-node1-Small
Acquiring sstable references | DC2-node3 | 119570 | ReadStage-5
Sending READ message to /DC1-node2 | DC2-node3 | 119579 | MessagingService-Outgoing-/DC1-node2-Small
Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | DC2-node3 | 119589 | ReadStage-5
Key cache hit for sstable 1 | DC2-node3 | 119609 | ReadStage-5
Merged data from memtables and 1 sstables | DC2-node3 | 119651 | ReadStage-5
Read 1 live rows and 0 tombstone cells | DC2-node3 | 119671 | ReadStage-5
REQUEST_RESPONSE message received from /DC2-node4 | DC2-node3 | 120682 | MessagingService-Incoming-/DC2-node4
Processing response from /DC2-node4 | DC2-node3 | 120714 | RequestResponseStage-10
REQUEST_RESPONSE message received from /DC2-node1 | DC2-node3 | 120995 | MessagingService-Incoming-/DC2-node1
Processing response from /DC2-node1 | DC2-node3 | 121088 | RequestResponseStage-10
READ message received from /DC2-node3 | DC1-node2 | 5 | MessagingService-Incoming-/DC2-node3
Executing single-partition query on roles | DC1-node2 | 67 | ReadStage-4
Acquiring sstable references | DC1-node2 | 90 | ReadStage-4
Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | DC1-node2 | 127 | ReadStage-4
Key cache hit for sstable 4 | DC1-node2 | 165 | ReadStage-4
Key cache hit for sstable 6 | DC1-node2 | 201 | ReadStage-4
Merged data from memtables and 2 sstables | DC1-node2 | 267 | ReadStage-4
Read 1 live rows and 0 tombstone cells | DC1-node2 | 301 | ReadStage-4
Read 1 live rows and 0 tombstone cells | DC1-node2 | 318 | ReadStage-4
Enqueuing response to /DC2-node3 | DC1-node2 | 329 | ReadStage-4
Sending REQUEST_RESPONSE message to /DC2-node3 | DC1-node2 | 399 | MessagingService-Outgoing-/DC2-node3-Small
MUTATION message received from /DC2-node3 | DC2-node2 | 2 | MessagingService-Incoming-/DC2-node3
MUTATION message received from /DC2-node3 | DC2-node5 | 2 | MessagingService-Incoming-/DC2-node3
REQUEST_RESPONSE message received from /DC1-node2 | DC2-node3 | 226207 | MessagingService-Incoming-/DC1-node2
Appending to commitlog | DC2-node2 | 42 | MutationStage-9
Processing response from /DC1-node2 | DC2-node3 | 226315 | RequestResponseStage-5
Adding to txn_game memtable | DC2-node2 | 65 | MutationStage-9
Determining replicas for mutation | DC2-node3 | 226427 | Native-Transport-Requests-8
Enqueuing response to /DC2-node3 | DC2-node2 | 91 | MutationStage-9
Appending to commitlog | DC2-node3 | 226454 | Native-Transport-Requests-8
Sending REQUEST_RESPONSE message to /DC2-node3 | DC2-node2 | 137 | MessagingService-Outgoing-/DC2-node3-Small
Adding to txn_game memtable | DC2-node3 | 226475 | Native-Transport-Requests-8
Sending MUTATION message to /DC2-node5 | DC2-node3 | 226569 | MessagingService-Outgoing-/DC2-node5-Small
Sending MUTATION message to /DC2-node2 | DC2-node3 | 226570 | MessagingService-Outgoing-/DC2-node2-Small
REQUEST_RESPONSE message received from /DC2-node2 | DC2-node3 | 3 | MessagingService-Incoming-/DC2-node2
Processing response from /DC2-node2 | DC2-node3 | 40 | RequestResponseStage-7
REQUEST_RESPONSE message received from /DC2-node5 | DC2-node3 | 2 | MessagingService-Incoming-/DC2-node5
Processing response from /DC2-node5 | DC2-node3 | 97 | RequestResponseStage-8


1) Why is DC1 involved at all when performing a LOCAL_QUORUM write to DC2?
2) How should source_elapsed be read? It seems cumulative per host, but in
some places it resets.
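
For completeness, the writes are issued roughly like this (a sketch; the
table, columns and bind values are placeholders):

const cassandra = require('cassandra-driver');
const { consistencies } = cassandra.types;

const client = new cassandra.Client({ contactPoints: ['dc2-node3'] });

// LOCAL_QUORUM should only need acknowledgement from replicas in the
// coordinator's local data center (DC2 here).
const insert = 'INSERT INTO my_keyspace.txn_game (id, payload) VALUES (?, ?)';
client.execute(insert, ['some-id', 'some-payload'], {
  prepare: true,
  consistency: consistencies.localQuorum
}, function (err) {
  if (err) console.error(err);
});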


Thanks
Tarun Chabarwal


On Sun, Nov 4, 2018 at 11:33 PM Andy Tolbert <andrew.tolb...@datastax.com>
wrote:

> Hi Tarun,
>
> There are a ton of factors that can impact query performance.
>
> The cassandra native protocol supports multiple simultaneous requests per
> connection.  Most drivers by default only create one connection to each C*
> host in the local data center.  That being said, that shouldn't be a
> problem, particularly if you are only executing 20 concurrent requests;
> this is something both the driver clients and C* handle well.  The driver
> does do some write batching to reduce the number of system calls, but I'm
> reasonably confident this is not an issue.
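>
> If you do want to rule out connection contention, the pool size can be
> tuned. A minimal sketch (the contact point and DC name are placeholders;
> the defaults are usually fine):
>
> const cassandra = require('cassandra-driver');
> const { distance } = cassandra.types;
>
> const client = new cassandra.Client({
>   contactPoints: ['dc2-node3'],
>   policies: {
>     loadBalancing: new cassandra.policies.loadBalancing.DCAwareRoundRobinPolicy('DC2')
>   },
>   // Raise the core connections per local host above the default of 1.
>   pooling: {
>     coreConnectionsPerHost: { [distance.local]: 2, [distance.remote]: 1 }
>   }
> });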
>
> It may be worth enabling client logging
> <https://github.com/datastax/nodejs-driver/#logging> to see if that
> sheds any light.  You can also enable tracing on your requests by
> specifying traceQuery
> <https://docs.datastax.com/en/developer/nodejs-driver/3.5/api/type.QueryOptions/#type-query-options>
> as a query option (example
> <https://github.com/datastax/nodejs-driver/blob/master/examples/tracing/retrieve-query-trace.js>)
> to see if the delay is caused by C*-side processing.
>
> Also keep in mind that all user code in node.js runs on a single
> thread.  If you have callbacks tied to your responses that do non-trivial
> work, that can delay subsequent requests from being processed, which may
> give the impression that some queries are slow.
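>
> For illustration, a response callback that does non-trivial synchronous
> work holds up the single event-loop thread and delays every other pending
> response (a sketch; doExpensiveSyncWork is a hypothetical stand-in):
>
> client.execute(query, params, function (err, result) {
>   if (err) return console.error(err);
>   // Anything slow and synchronous here (tight loops, large
>   // JSON.stringify calls, sync I/O) blocks the event loop, so
>   // unrelated in-flight queries appear to have high latency.
>   doExpensiveSyncWork(result.rows); // hypothetical heavy callback work
> });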
>
> Thanks,
> Andy
>
> On Sun, Nov 4, 2018 at 8:59 AM Tarun Chabarwal <tarunchabar...@gmail.com>
> wrote:
>
>> Hi
>>
>> I'm using the Cassandra driver library provided by DataStax (3.5.0) in
>> node.js. I have a 5-node cluster, and I'm writing to a table with quorum
>> consistency.
>>
>> I observed some spikes in the writes: out of ~20 writes, 2-5 take
>> longer (~200ms). I debugged one of the node processes with strace and
>> found that the longer-latency writes are batched and use the same fd to
>> connect to Cassandra. This may be the multiplexing.
>>
>> Why does it take that long?
>> Where should I look to resolve it?
>>
>> Regards
>> Tarun Chabarwal
>>
>
