*Main Problems*
We are migrating from a Solr master/slave infrastructure to a SolrCloud infrastructure. The main problems that we have now are:

- Excessive resource consumption: currently we have 5 instances, each with 80 processors and 768 GB of RAM on SSD drives, and they still can't sustain the load that the other architecture handles. Our master/slave architecture runs on only 7 virtual machines with much lower specs (4 processors and 16 GB of RAM per instance, also on SSD drives). So, at the moment, our SolrCloud infrastructure is consuming several dozen times more resources than our Solr master/slave infrastructure.
- Despite spending far more resources, we get worse query times than with Solr in the master/slave architecture.

*Search infrastructure (SolrCloud infrastructure)*

Since we cannot use the DataImportHandler (which is what we use in Solr master/slave), we have developed an application that reads every transaction from Oracle, builds a collection of documents by querying the database, and sends the result to the */update* handler every 200 milliseconds using the SolrJ client (a simplified sketch is at the end of this mail). This application tries to remove possible duplicates within each update window, and on top of that we use Solr's de-duplication <https://cwiki.apache.org/confluence/display/solr/De-Duplication>. We are indexing ~100 documents per second, with peaks of ~1000 documents per second.

Every search query is centralized in another application, which exposes a DSL behind a REST API and also uses the SolrJ client to perform the queries. We have peaks of 2000 QPS.

*Cluster structure (SolrCloud infrastructure)*

At the moment, the cluster has 30 SolrCloud instances with identical specs (same physical hosts, same JVM settings, etc.).

*Main collection*

In our use case we are basically using this collection as a NoSQL database. Each document is composed of about 300 fields, represents an advert, and is a denormalization of its relational representation in Oracle. We are using all our nodes to store the collection in 3 shards, so each shard has 10 replicas. At the moment we are only indexing a subset of the adverts stored in Oracle, but our goal is to store all the ads that we have in the DB (a few tens of millions of documents). We have NRT requirements, so we need to index every document as soon as possible once it changes in Oracle. We have defined the properties of each field (whether it is stored/indexed, whether it should use docValues, etc.) according to how that field is used.

*Index size (SolrCloud infrastructure)*

The index size is currently above 6 GB per shard, with 1,300,000 documents stored in each shard. So, we are storing 3,900,000 documents in total and the total index size is 18 GB.

*Indexation (SolrCloud infrastructure)*

The commits *aren't* triggered by the application described above. The hard-commit and soft-commit intervals are configured in Solr (the corresponding solrconfig.xml snippet is at the end of this mail):

- *HardCommit:* every 15 minutes (with openSearcher=false)
- *SoftCommit:* every 5 seconds

*Apache Solr Version*

We are currently using the latest version of Solr (6.6.0) on the Oracle JVM (Java(TM) SE Runtime Environment, build 1.8.0_131-b11, 64-bit) in both deployments.

The question is... what is wrong here?!
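
For reference, here are simplified sketches of the pieces described above. First, the indexing application, reduced to a minimal SolrJ (6.6) sketch: the ZooKeeper hosts, the collection name ("adverts") and the field names are placeholders, not our real ones.

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AdvertIndexer {
    public static void main(String[] args) throws Exception {
        // Connect through ZooKeeper so updates are routed to the shard leaders.
        // ZK ensemble and collection name below are placeholders.
        CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("zk1:2181,zk2:2181,zk3:2181")
                .build();
        client.setDefaultCollection("adverts");

        // In the real application this batch holds the documents built from
        // the Oracle transactions read in the last 200 ms window.
        List<SolrInputDocument> batch = new ArrayList<>();
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "advert-123");
        doc.addField("operation", "sale");  // two of the ~300 denormalized fields
        doc.addField("price", 250000);
        batch.add(doc);

        // No explicit commit: we rely on the autoCommit/autoSoftCommit
        // settings in solrconfig.xml (shown further down).
        client.add(batch);
        client.close();
    }
}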
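
The de-duplication is the standard SignatureUpdateProcessorFactory chain from the page linked above, wired into the */update* handler. This is essentially the reference-guide example (the field list here is the guide's, not our real one):

<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">id</str>
    <bool name="overwriteDupes">false</bool>
    <str name="fields">name,features,cat</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">dedupe</str>
  </lst>
</requestHandler>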
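
The search application ultimately turns each DSL request into a plain SolrJ query, along these lines (the query and filter here are invented for the example):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class AdvertSearcher {
    public static void main(String[] args) throws Exception {
        CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("zk1:2181,zk2:2181,zk3:2181")
                .build();
        client.setDefaultCollection("adverts");

        // A typical request produced by the DSL layer: a query plus filters.
        SolrQuery query = new SolrQuery("operation:sale");
        query.addFilterQuery("price:[100000 TO 300000]");
        query.setRows(20);

        QueryResponse response = client.query(query);
        System.out.println("Found " + response.getResults().getNumFound() + " adverts");
        client.close();
    }
}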
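
Finally, the commit intervals described above correspond to something like this in our solrconfig.xml:

<autoCommit>
  <maxTime>900000</maxTime>          <!-- 15 minutes -->
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>5000</maxTime>            <!-- 5 seconds -->
</autoSoftCommit>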