Dani,

It might be time to attach some instrumentation to one of your nodes. Finding out which classes are occupying the memory will help narrow down the issue.
Are you using a lot of facets, grouping, or stats during your queries? Also, when you were doing Master/Slave, was that on the same version of Solr as you're using now in SolrCloud mode?

-Scott

On Mon, Aug 28, 2017 at 4:50 AM, Daniel Ortega <danielortegauf...@gmail.com> wrote:

> Hi Scott,
>
> Yes, we think that our usage scenario falls into Index-Heavy/Query-Heavy
> too. We have tested several softcommit/hardcommit values (from a few
> seconds to minutes) with no appreciable improvement :(
>
> Thanks for your reply!
>
> - Daniel
>
> 2017-08-25 6:45 GMT+02:00 Scott Stults <sstu...@opensourceconnections.com>:
>
> > Hi Dani,
> >
> > It seems like your use case falls into the Index-Heavy / Query-Heavy
> > category, so you might try increasing your hard commit frequency to
> > every 15 seconds rather than every 15 minutes:
> >
> > https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >
> > -Scott
> >
> > On Thu, Aug 24, 2017 at 10:03 AM, Daniel Ortega <danielortegauf...@gmail.com> wrote:
> >
> > > Hi Scott,
> > >
> > > In our indexing service we are using that client too
> > > (org.apache.solr.client.solrj.impl.CloudSolrClient) :)
> > >
> > > This is our Update Request Processor chain configuration:
> > >
> > > <updateProcessor class="solr.processor.SignatureUpdateProcessorFactory" name="signature">
> > >   <bool name="enabled">true</bool>
> > >   <str name="signatureField">hash</str>
> > >   <bool name="overwriteDupes">false</bool>
> > >   <str name="signatureClass">solr.processor.Lookup3Signature</str>
> > > </updateProcessor>
> > > <updateRequestProcessorChain processor="signature" name="dedupe">
> > >   <processor class="solr.LogUpdateProcessorFactory" />
> > >   <processor class="solr.RunUpdateProcessorFactory" />
> > > </updateRequestProcessorChain>
> > > <!-- de-duplication process explained in:
> > >      https://cwiki.apache.org/confluence/display/solr/De-Duplication -->
> > > <requestHandler name="/update" class="solr.UpdateRequestHandler">
> > >   <lst name="defaults">
> > >     <str name="update.chain">dedupe</str>
> > >   </lst>
> > > </requestHandler>
> > >
> > > Thanks for your reply :)
> > >
> > > - Dani
> > >
> > > 2017-08-24 14:49 GMT+02:00 Scott Stults <sstults@opensourceconnections.com>:
> > >
> > > > Hi Daniel,
> > > >
> > > > SolrJ has a few client implementations to choose from:
> > > > CloudSolrClient, ConcurrentUpdateSolrClient, HttpSolrClient,
> > > > LBHttpSolrClient. You said your query service uses CloudSolrClient,
> > > > but it would be good to verify which implementation your indexing
> > > > service uses.
> > > >
> > > > One of the problems you might be having is with your deduplication
> > > > step. Can you post your Update Request Processor Chain?
> > > >
> > > > -Scott
> > > >
> > > > On Wed, Aug 23, 2017 at 4:13 PM, Daniel Ortega <danielortegauf...@gmail.com> wrote:
> > > >
> > > > > Hi Scott,
> > > > >
> > > > > - *Can you describe the process that queries the DB and sends
> > > > > records to Solr?*
> > > > >
> > > > > We enqueue ids during every ORACLE transaction (on
> > > > > inserts/updates).
> > > > >
> > > > > An application dequeues every id and performs queries against
> > > > > dozens of tables in the relational model to retrieve the fields
> > > > > needed to build the document. As we know that we are modifying
> > > > > the same ORACLE row in different (but consecutive) transactions,
> > > > > we store only the last version of the modified documents in a map
> > > > > data structure.
> > > > >
> > > > > The application has a configurable interval for sending the
> > > > > documents stored in the map to the update handler using the SolrJ
> > > > > client (we have tested different intervals, from a few
> > > > > milliseconds to several seconds). Currently we are sending all
> > > > > the documents every 15 seconds.
> > > > >
> > > > > This application is developed using Java, Spring and Maven, and
> > > > > we have several instances.
> > > > >
> > > > > - *Is it a SolrJ-based application?*
> > > > >
> > > > > Yes, it is. We aren't using the latest version of the SolrJ
> > > > > client (we are currently using SolrJ v6.3.0).
> > > > >
> > > > > - *If it is, which client package are you using?*
> > > > >
> > > > > I don't know exactly what you mean by 'client package' :)
> > > > >
> > > > > - *How many documents do you send at once?*
> > > > >
> > > > > It depends on the interval described above and on the number of
> > > > > transactions executed in our relational database. From dozens to
> > > > > a few hundred (and even thousands).
> > > > >
> > > > > - *Are you sending your indexing or query traffic through a load
> > > > > balancer?*
> > > > >
> > > > > We aren't using a load balancer for indexing, but all our REST
> > > > > query services sit behind an HAProxy (using the 'leastconn'
> > > > > algorithm). The REST query services perform queries using the
> > > > > CloudSolrClient.
> > > > >
> > > > > Thanks for your reply,
> > > > > if you need any further information don't hesitate to ask
> > > > >
> > > > > Daniel
> > > > >
> > > > > 2017-08-23 14:57 GMT+02:00 Scott Stults <sstults@opensourceconnections.com>:
> > > > >
> > > > > > Hi Daniel,
> > > > > >
> > > > > > Great background information about your setup! I've got just a
> > > > > > few more questions:
> > > > > >
> > > > > > - Can you describe the process that queries the DB and sends
> > > > > > records to Solr?
> > > > > > - Is it a SolrJ-based application?
> > > > > > - If it is, which client package are you using?
> > > > > > - How many documents do you send at once?
> > > > > > - Are you sending your indexing or query traffic through a load
> > > > > > balancer?
> > > > > >
> > > > > > If you're sending documents to each replica as fast as they can
> > > > > > take them, you might be seeing a bottleneck at the shard
> > > > > > leaders. The SolrJ CloudSolrClient finds out from Zookeeper
> > > > > > which nodes are the shard leaders and sends docs directly to
> > > > > > them.
> > > > > >
> > > > > > -Scott
> > > > > >
> > > > > > On Tue, Aug 22, 2017 at 2:16 PM, Daniel Ortega <danielortegauf...@gmail.com> wrote:
> > > > > >
> > > > > > > *Main Problems*
> > > > > > >
> > > > > > > We are involved in a migration from a Solr Master/Slave
> > > > > > > infrastructure to a SolrCloud infrastructure.
> > > > > > >
> > > > > > > The main problems that we have now are:
> > > > > > >
> > > > > > > - Excessive resource consumption: Currently we have 5
> > > > > > > instances with 80 processors/768 GB RAM each, using SSD
> > > > > > > drives, that can't handle the load that we have in the other
> > > > > > > architecture. In our Master/Slave architecture we have only 7
> > > > > > > virtual machines with lower specs (4 processors and 16 GB
> > > > > > > each, using SSD drives too). So, at the moment our SolrCloud
> > > > > > > infrastructure is consuming several dozen times more
> > > > > > > resources than our Solr Master/Slave infrastructure.
> > > > > > > - Despite spending more resources, we have worse query times
> > > > > > > (compared to Solr in the Master/Slave architecture).
> > > > > > >
> > > > > > > *Search infrastructure (SolrCloud infrastructure)*
> > > > > > >
> > > > > > > As we cannot use the DIH handler (which is what we use in
> > > > > > > Solr Master/Slave), we have developed an application which
> > > > > > > reads every transaction from Oracle, builds a document
> > > > > > > collection by querying the database, and sends the result to
> > > > > > > the */update* handler every 200 milliseconds using the SolrJ
> > > > > > > client. This application tries to remove possible duplicates
> > > > > > > in each update window, but we are using Solr's de-duplication
> > > > > > > techniques
> > > > > > > <https://cwiki.apache.org/confluence/display/solr/De-Duplication>
> > > > > > > too.
> > > > > > >
> > > > > > > We are indexing ~100 documents per second (with peaks of
> > > > > > > ~1000 documents per second).
> > > > > > >
> > > > > > > Every search query is centralized in another application,
> > > > > > > which exposes a DSL behind a REST API and also uses the SolrJ
> > > > > > > client to perform queries. We have peaks of 2000 QPS.
> > > > > > >
> > > > > > > *Cluster structure (SolrCloud infrastructure)*
> > > > > > >
> > > > > > > At the moment, the cluster has 30 SolrCloud instances with
> > > > > > > the same specs (same physical hosts, same JVM settings,
> > > > > > > etc.).
> > > > > > >
> > > > > > > *Main collection*
> > > > > > >
> > > > > > > In our use case we are basically using this collection as a
> > > > > > > NoSQL database. Our document is composed of about 300 fields
> > > > > > > that represent an advert, and is a denormalization of its
> > > > > > > relational representation in Oracle.
> > > > > > >
> > > > > > > We are using all our nodes to store the collection in 3
> > > > > > > shards. So, each shard has 10 replicas.
> > > > > > >
> > > > > > > At the moment, we are only indexing a subset of the adverts
> > > > > > > stored in Oracle, but our goal is to store all the ads that
> > > > > > > we have in the DB (a few tens of millions of documents). We
> > > > > > > have NRT requirements, so we need to index every document as
> > > > > > > soon as possible once it's changed in Oracle.
> > > > > > >
> > > > > > > We have defined the properties of each field (whether it's
> > > > > > > stored/indexed or not, whether it should be defined as
> > > > > > > DocValues, etc.) considering the use of that field.
> > > > > > >
> > > > > > > *Index size (SolrCloud infrastructure)*
> > > > > > >
> > > > > > > The index size is currently above 6 GB per shard, with
> > > > > > > 1.300.000 documents in each shard. So, we are storing
> > > > > > > 3.900.000 documents and the total index size is 18 GB.
> > > > > > >
> > > > > > > *Indexation (SolrCloud infrastructure)*
> > > > > > >
> > > > > > > The commits *aren't* triggered by the application described
> > > > > > > above. The hardcommit/softcommit intervals are configured in
> > > > > > > Solr:
> > > > > > >
> > > > > > > - *HardCommit:* every 15 minutes (with openSearcher = false)
> > > > > > > - *SoftCommit:* every 5 seconds
> > > > > > >
> > > > > > > *Apache Solr Version*
> > > > > > >
> > > > > > > We are currently using the latest version of Solr (6.6.0) on
> > > > > > > an Oracle JVM (Java(TM) SE Runtime Environment (build
> > > > > > > 1.8.0_131-b11) Oracle (64 bits)) in both deployments.
> > > > > > >
> > > > > > > The question is... What is wrong here?!?!?!

--
Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC
| 434.409.2780
http://www.opensourceconnections.com
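
For readers following the thread, the client-side step Daniel describes (keep only the last version of each modified document in a map, then send the whole map at a fixed interval) could look roughly like the sketch below. This is only an illustration under stated assumptions: the class name `LatestVersionBuffer`, its methods, and the use of plain strings as documents are hypothetical and do not come from Daniel's application, and the actual send through SolrJ is deliberately left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical last-write-wins buffer (names are illustrative, not from the thread).
 * D is whatever document type the indexer builds, e.g. a SolrJ input document.
 */
class LatestVersionBuffer<D> {
    // One entry per document id; a newer version replaces the unsent older one.
    private final Map<String, D> pending = new LinkedHashMap<>();

    /** Records a new version of a document, overwriting any buffered version. */
    public synchronized void put(String id, D doc) {
        pending.put(id, doc);
    }

    /** Returns everything buffered so far as one batch and empties the buffer. */
    public synchronized List<D> drain() {
        List<D> batch = new ArrayList<>(pending.values());
        pending.clear();
        return batch;
    }

    public static void main(String[] args) {
        LatestVersionBuffer<String> buffer = new LatestVersionBuffer<>();
        buffer.put("ad-1", "v1");
        buffer.put("ad-2", "x");
        buffer.put("ad-1", "v2"); // same row changed again before the flush

        // A scheduler would call drain() once per interval (15 seconds in the
        // thread) and hand the batch to the indexing client.
        System.out.println(buffer.drain()); // prints [v2, x]
    }
}
```

With a buffer like this, the batch size per flush is bounded by the number of distinct ids touched during the interval rather than by the number of Oracle transactions, which matches the "dozens to a few hundred (and even thousands)" range reported above.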