Re: Excessive resources consumption migrating from Solr 6.6.0 Master/Slave to SolrCloud 6.6.0 (dozen times more resources)

Scott Stults Thu, 24 Aug 2017 05:49:58 -0700

Hi Daniel,

SolrJ has a few client implementations to choose from: CloudSolrClient,
ConcurrentUpdateSolrClient, HttpSolrClient, LBHttpSolrClient. You said your
query service uses CloudSolrClient, but it would be good to verify which
implementation your indexing service uses.


One of the problems you might be having is with your deduplication step.
Can you post your Update Request Processor Chain?
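
To make sure we are talking about the same thing, here is a rough sketch of
the client-side coalescing you described: keep only the latest version of
each document per id, then hand the whole batch over on a fixed interval.
The class and method names (CoalescingBuffer, put, drain) are hypothetical,
and a document is simplified to a Map<String, Object>. This is an
illustration under those assumptions, not your code and not a SolrJ API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: buffers documents by id, keeping only the latest version. */
final class CoalescingBuffer {
    private final Object lock = new Object();
    private Map<String, Map<String, Object>> pending = new LinkedHashMap<>();

    /** Adds the document, overwriting any pending version with the same id. */
    public void put(String id, Map<String, Object> doc) {
        synchronized (lock) {
            pending.put(id, doc);
        }
    }

    /** Takes everything buffered so far and starts a fresh, empty batch. */
    public List<Map<String, Object>> drain() {
        Map<String, Map<String, Object>> batch;
        synchronized (lock) {
            batch = pending;
            pending = new LinkedHashMap<>();
        }
        return new ArrayList<>(batch.values());
    }
}
```

A scheduled task would call drain() once per interval and pass the returned
batch to whichever SolrJ client the indexing service uses. If the batch
already contains one document per id, it is worth checking how much extra
work the server-side de-duplication adds on top of that.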


-Scott


On Wed, Aug 23, 2017 at 4:13 PM, Daniel Ortega <danielortegauf...@gmail.com>
wrote:

> Hi Scott,
>
> - *Can you describe the process that queries the DB and sends records to
> Solr?*
>
> We enqueue ids during every Oracle transaction (on inserts/updates).
>
> An application dequeues every id and performs queries against dozens of
> tables in the relational model to retrieve the fields needed to build the
> document. Since we know that the same Oracle row is modified in
> different (but consecutive) transactions, we store only the latest version
> of each modified document in a map data structure.
>
> The application has a configurable interval for sending the documents
> stored in the map to the update handler (we have tested different
> intervals, from a few milliseconds to several seconds) using the SolrJ
> client. Currently we are sending all the documents every 15 seconds.
>
> This application is developed using Java, Spring and Maven, and we run
> several instances of it.
>
> - *Is it a SolrJ-based application?*
>
> Yes, it is. We aren't using the latest version of the SolrJ client (we are
> currently using SolrJ v6.3.0).
>
> - *If it is, which client package are you using?*
>
> I don't know exactly what you mean by 'client package' :)
>
> - *How many documents do you send at once?*
>
> It depends on the interval described above and on the number of
> transactions executed in our relational database. From dozens to a few
> hundred (and even thousands).
>
> - *Are you sending your indexing or query traffic through a load balancer?*
>
> We aren't using a load balancer for indexing, but all our REST query
> services sit behind an HAProxy (using the 'leastconn' algorithm). The REST
> query services perform queries using the CloudSolrClient.
>
> Thanks for your reply,
> if you need any further information don't hesitate to ask
>
> Daniel
>
> 2017-08-23 14:57 GMT+02:00 Scott Stults <sstu...@opensourceconnections.com>:
>
> > Hi Daniel,
> >
> > Great background information about your setup! I've got just a few more
> > questions:
> >
> > - Can you describe the process that queries the DB and sends records to
> > Solr?
> > - Is it a SolrJ-based application?
> > - If it is, which client package are you using?
> > - How many documents do you send at once?
> > - Are you sending your indexing or query traffic through a load balancer?
> >
> > If you're sending documents to each replica as fast as they can take
> > them, you might be seeing a bottleneck at the shard leaders. The SolrJ
> > CloudSolrClient finds out from Zookeeper which nodes are the shard
> > leaders and sends docs directly to them.
> >
> >
> > -Scott
> >
> > On Tue, Aug 22, 2017 at 2:16 PM, Daniel Ortega
> > <danielortegauf...@gmail.com> wrote:
> >
> > > *Main Problems*
> > >
> > >
> > > We are migrating from a Solr Master/Slave infrastructure to a
> > > SolrCloud infrastructure.
> > >
> > > The main problems that we have now are:
> > >
> > >    - Excessive resource consumption: currently we have 5 instances
> > >    with 80 processors/768 GB RAM each, using SSDs, and they cannot
> > >    support the load that we handle in the other architecture. In our
> > >    Master/Slave architecture we have only 7 virtual machines with
> > >    lower specs (4 processors and 16 GB each, also using SSDs). So, at
> > >    the moment our SolrCloud infrastructure is consuming several dozen
> > >    times more resources than our Solr Master/Slave infrastructure.
> > >    - Despite spending more resources we have worse query times
> > >    (compared to Solr in the Master/Slave architecture).
> > >
> > >
> > > *Search infrastructure (SolrCloud infrastructure)*
> > >
> > >
> > >
> > > As we cannot use the Data Import Handler (DIH), which is what we use
> > > in Solr Master/Slave, we have developed an application which reads
> > > every transaction from Oracle, builds a document collection by
> > > querying the database and sends the result to the */update* handler
> > > every 200 milliseconds using the SolrJ client. This application tries
> > > to remove possible duplicates in each update window, but we are also
> > > using Solr’s de-duplication techniques
> > > <https://cwiki.apache.org/confluence/display/solr/De-Duplication>.
> > >
> > >
> > >
> > > We are indexing ~100 documents per second (with peaks of ~1000
> > > documents per second).
> > >
> > > Every search query is centralized in another application, which
> > > exposes a DSL behind a REST API and also uses the SolrJ client to
> > > perform queries. We have peaks of 2000 QPS.
> > >
> > > *Cluster structure **(SolrCloud infrastructure)*
> > >
> > >
> > >
> > > At the moment, the cluster has 30 SolrCloud instances with the same
> > > specs (same physical hosts, same JVM settings, etc.).
> > >
> > >
> > >
> > > *Main collection*
> > >
> > >
> > >
> > > In our use case we are basically using this collection as a NoSQL
> > > database. Our document is composed of about 300 fields that represent
> > > an advert, and is a denormalization of its relational representation
> > > in Oracle.
> > >
> > > We are using all our nodes to store the collection in 3 shards, so
> > > each shard has 10 replicas.
> > >
> > > At the moment, we are only indexing a subset of the adverts stored in
> > > Oracle, but our goal is to store all the ads that we have in the DB (a
> > > few tens of millions of documents). We have NRT requirements, so we
> > > need to index every document as soon as possible once it’s changed in
> > > Oracle.
> > >
> > >
> > >
> > > We have defined the properties of each field (whether it’s
> > > stored/indexed, whether it should be defined as DocValues, etc.)
> > > according to how that field is used.
> > >
> > >
> > >
> > > *Index size **(SolrCloud infrastructure)*
> > >
> > >
> > >
> > > The index size is currently above 6 GB per shard, with 1,300,000
> > > documents in each shard. So, in total we are storing 3,900,000
> > > documents and the total index size is about 18 GB.
> > >
> > >
> > >
> > > *Indexation **(SolrCloud infrastructure)*
> > >
> > >
> > >
> > > The commits *aren’t* triggered by the application described above.
> > > The hard commit and soft commit intervals are configured in Solr:
> > >
> > >    - *HardCommit:* every 15 minutes (with openSearcher = false)
> > >    - *SoftCommit:* every 5 seconds
> > >
> > >
> > >
> > > *Apache Solr Version*
> > >
> > >
> > >
> > > We are currently using the latest version of Solr (6.6.0) on an
> > > Oracle JVM (Java(TM) SE Runtime Environment (build 1.8.0_131-b11),
> > > 64-bit) in both deployments.
> > >
> > >
> > > The question is... What is wrong here?!?!?!
> > >
> >
> >
> >
> > --
> > Scott Stults | Founder & Solutions Architect | OpenSource Connections,
> > LLC | 434.409.2780
> > http://www.opensourceconnections.com
> >
>



-- 
Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC
| 434.409.2780
http://www.opensourceconnections.com
