[
https://issues.apache.org/jira/browse/SOLR-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233700#comment-14233700
]
Timothy Potter commented on SOLR-6816:
--------------------------------------
Cool - been looking into this as well, nothing definitive yet but here's one
thing I've noticed:
CPU load is considerably higher on replicas than on leaders when doing
high-volume indexing with batched documents coming from the client. Basically
the batch gets broken up on the leader and then sent in mini-batches to
replicas. Thus, replicas are having to process many more update requests than
leaders to index the same documents. Check out these two graphs from one of my
tests:
replica - http://www.dropmocks.com/mHoPUx
leader - http://www.dropmocks.com/mHoWpX
It's pretty clear that there is considerably higher load on the replica than on
the leader. This was done with a 1x2 collection, with each replica on a separate
node.
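To make the mini-batch effect concrete, here is a toy model of the request
counts. It is only a sketch: the function name and the batch sizes are
hypothetical, picked for illustration, and are not measured values or actual
Solr settings.

```python
import math

def request_counts(total_docs, client_batch, forward_batch):
    """Toy model: update requests seen by the leader vs. by one replica.

    client_batch  - docs per request sent by the client (hypothetical)
    forward_batch - docs per mini-batch forwarded to the replica (hypothetical)
    """
    leader = 0
    replica = 0
    remaining = total_docs
    while remaining > 0:
        batch = min(client_batch, remaining)
        leader += 1                                   # one request per client batch
        replica += math.ceil(batch / forward_batch)   # batch split into mini-batches
        remaining -= batch
    return leader, replica

# Hypothetical sizes, chosen only to illustrate the effect.
leader_reqs, replica_reqs = request_counts(10_000, client_batch=250, forward_batch=50)
print(leader_reqs, replica_reqs)
```

In this model the replica handles the same documents but pays the per-request
overhead several times over.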
Without replication, I indexed a 10M doc collection (synthetic docs, ~1K each)
at 7,225 docs per second. With replication, I got 4,626 docs per second, which
is about 36% slower.
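For reference, the slowdown figure follows directly from the two throughput
numbers quoted above (the variable names are just for illustration):

```python
# Throughput figures quoted above (docs per second).
no_replication = 7225
with_replication = 4626

# Relative slowdown when a replica is added.
slowdown = (no_replication - with_replication) / no_replication
print(f"{slowdown:.1%} slower")
```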
Behind the scenes, CUSS (ConcurrentUpdateSolrServer) does only minimal
buffering of the docs, so many, many more requests are sent from the leader to
the replica. The updateHandler stats tell a good story (basically, the replica
received about 5x as many update requests as the leader for the same 10M docs).
Leader:
requests:40,022
avgRequestsPerSecond:9.830068574096831
5minRateReqsPerSecond:0.09800683344335624
15minRateReqsPerSecond:2.9134044254494302
avgTimePerRequest:628.6526956285293
medianRequestTime:379.48604750000004
75thPcRequestTime:568.784846
95thPcRequestTime:1365.1776681499978
99thPcRequestTime:6501.922041030025
Replica:
requests:206,367
avgRequestsPerSecond:51.13560879471209
5minRateReqsPerSecond:0.514584592882959
15minRateReqsPerSecond:14.541814273402418
avgTimePerRequest:104.61283714253733
medianRequestTime:35.7488105
75thPcRequestTime:96.46166525
95thPcRequestTime:272.08549294999995
99thPcRequestTime:718.7258438000003
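A rough back-of-the-envelope reading of these counters, as a sketch only: it
assumes all 10M docs went through exactly these requests and that the counters
cover only this indexing run, neither of which is verified here.

```python
# Request counts from the updateHandler stats above.
leader_requests = 40_022
replica_requests = 206_367
total_docs = 10_000_000  # assumption: every indexed doc passed through these requests

ratio = replica_requests / leader_requests
leader_docs_per_req = total_docs / leader_requests
replica_docs_per_req = total_docs / replica_requests

print(f"replica/leader request ratio: {ratio:.2f}")
print(f"docs per request: leader ~{leader_docs_per_req:.0f}, "
      f"replica ~{replica_docs_per_req:.0f}")
```

Under those assumptions the leader averages roughly 250 docs per request while
the replica averages roughly 48, which lines up with the ~5x request ratio.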
I've been experimenting with tweaking things like the pollQueueTime, queueSize,
and runner count set up by StreamingSolrServers, but haven't come up with a
definitive recipe for improving things ... still digging ;-)
> Review SolrCloud Indexing Performance.
> --------------------------------------
>
> Key: SOLR-6816
> URL: https://issues.apache.org/jira/browse/SOLR-6816
> Project: Solr
> Issue Type: Task
> Components: SolrCloud
> Reporter: Mark Miller
> Priority: Critical
> Attachments: SolrBench.pdf
>
>
> We have never really focused on indexing performance, just correctness and
> low hanging fruit. We need to vet the performance and try to address any
> holes.
> Note: A common report is that adding any replication is very slow.