Hi Nick,
Thank you for following up with your results!
If you lose the leader, there are 5 other TLOG replicas ready to take
its place and they should be in sync with the leader, so no docs would
be lost. Although they are not flushed to disk in the index (because
the autoCommit hasn't fired), th
Hello,
The results of the tests confirmed the hypothesis about autoCommit frequency and
TLOG performance degradation.
I’ve run the following tests:
1. TLOG + PULL
34 replicas in total. 28 of them PULL, 6 TLOG. Query params:
replica.type:PULL,replica.type:TLOG,replica.location:local.
Increased
So the issue seems to be with the autocommit time.
The PULL and TLOG followers fetch the index every x seconds. This 'x' is
1/2 of the autocommit time, so when you increased your autocommit, you were
actually just increasing the amount of time your TLOG followers and PULL
replicas were able to kee
Very good point, Mike. To avoid this, I’ve scaled the cluster to 34 nodes (which
would compensate for the 6 TLOG replicas that aren’t going to be used for search),
so we had only 2 fewer nodes serving search queries than the NRT cluster had. At a
lower request rate, the results weren’t better either.
TLOG replica
When you have 6 TLOG + 24 PULL and you're setting
shards.preference=replica.type:PULL,replica.type:TLOG,replica.location:local,
I would expect zero queries going to the TLOG replicas. Can you
confirm that is the case? If so, this might be an issue of 24 nodes
trying to keep up with the work that 30 wer
actually not using HDFSDirectory, it’s a leftover in the config from some
previous tests.
I don’t see anything in the logs related to maxWarmingSearchers, nor any other
errors or warnings. I tried reducing maxWarmingSearchers to 3
and increasing the hard commit maxTime to 2 minutes, the r
Are you using HDFSDirectory to serve your indices? I noticed that
tlogDfsReplication is set, so that's why I'm asking.
A maxWarmingSearchers value of 8 is very high; typically that value is 2 or maybe 4, but
you would know if this was an issue by looking at your logs.
I'm assuming that you had 30 NRT repl
Hi Tim,
Thanks for your reply. I forgot to mention that I’ve tried with
shards.preference=replica.type:PULL,replica.type:TLOG,replica.location:local,
and the results are basically the same as with only replica.location:local or
without any additional query parameters. Sometimes, under heavier load, s
Hi Nick,
What does your response time look like if you use
shards.preference=replica.type:PULL,replica.location:local as a query
parameter? Basically route all queries to PULL replicas only.
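
For illustration, here is a minimal Python sketch of how such a request URL could be assembled. The host, port, and collection name are hypothetical placeholders, not values from this thread, and the helper name build_select_url is made up for the example. Only the shards.preference value comes from the message above.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, preference):
    """Build a Solr /select URL that carries a shards.preference parameter.

    base_url and collection are hypothetical placeholders for this sketch.
    """
    params = {
        "q": query,
        # Preference list from the thread: PULL replicas first, then local ones.
        "shards.preference": preference,
    }
    # urlencode percent-escapes ':' and ',' so the value survives as one parameter.
    return f"{base_url}/{collection}/select?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983/solr",   # assumed host and port
    "mycollection",                 # assumed collection name
    "*:*",
    "replica.type:PULL,replica.location:local",
)
print(url)
```

Note that shards.preference expresses an ordering of preferred replicas rather than a hard filter, so checking per-replica request counts is still the way to verify where the queries actually land.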
LMK
Tim
On Fri, Jun 11, 2021 at 6:55 AM Nick Vladiceanu wrote:
>
> hello,
> I’m facing some performance