[
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381513#comment-15381513
]
Arcadius Ahouansou commented on SOLR-8146:
------------------------------------------
Hello Susheel.
This ticket is not fully implemented yet.
The attached patch is the very first version. It does work, but it relies on
passing a regex as a startup parameter to the SolrJ client in the format
{code}
-Dsolr.preferredQueryNodePattern=SOME_REGEX_MATCHING_A_SET_OF_SOLR_NODES
{code}
This approach worked well for us, but:
- it does not look very elegant, and
- it does not integrate well into the current code base.
So a better way to do this is to use the snitch.
Unfortunately, due to changes in priority, I was not able to come back to
finish this work.
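
For illustration only, here is a minimal sketch of the reordering described in
the issue below: shuffle the live URLs, then move the ones matching the regex
to the front. This is not the code from the attached patch. The class and
method names are hypothetical, and using a partial match (find()) rather than
a full match is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of SolrJ or of the attached patch. */
class PreferredNodeOrdering {

    /** Reads the system property named in this ticket; returns null if it is unset. */
    static Pattern fromSystemProperty() {
        String regex = System.getProperty("solr.preferredQueryNodePattern");
        return regex == null ? null : Pattern.compile(regex);
    }

    /**
     * Returns a shuffled copy of liveUrls in which URLs matching the preferred
     * pattern come first. Non-matching URLs stay in the list as a fallback.
     */
    static List<String> order(List<String> liveUrls, Pattern preferred, Random random) {
        List<String> shuffled = new ArrayList<>(liveUrls);
        Collections.shuffle(shuffled, random);
        if (preferred == null) {
            return shuffled; // no preference configured: plain random order
        }
        List<String> front = new ArrayList<>();
        List<String> back = new ArrayList<>();
        for (String url : shuffled) {
            // Assumption: a partial match anywhere in the URL counts as preferred.
            if (preferred.matcher(url).find()) {
                front.add(url);
            } else {
                back.add(url);
            }
        }
        front.addAll(back);
        return front;
    }
}
```

Because the non-matching URLs are kept at the end of the list rather than
dropped, a client would still fall back to them when no preferred node is
available.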
> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> -----------------------------------------------------------------------
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
> Issue Type: New Feature
> Components: clients - java
> Affects Versions: 5.3
> Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Background
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the
> first item from the list.
> This ticket is to allow more flexibility and some control over which
> URLs will be picked for queries.
> Note that this is for queries only and would not affect update/delete/admin
> operations.
> h2. Implementation
> The current patch uses a regex pattern and moves to the top of the list of URLs
> only those matching the regex specified by the system property
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCloud cluster where every node hosts the same set of
> collections with:
> - multiple large SolrCloud nodes (L) used for production apps and
> - 1 small node (S) in the same cluster with less RAM/CPU, used only for
> manual user queries, data export and other production issue investigation.
> This ticket would allow configuring the applications using SolrJ to query
> only the (L) nodes.
> This use case is similar to the one described in SOLR-5501 raised by [~manuel
> lenormand]
> h3. Minimizing network traffic
>
> For simplicity, let's say that we have a SolrCloud cluster deployed on 2 (or
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1,
> and
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross-DC deployment, so replace
> rack1/rack2 with DC1/DC2.
> Any comment would be much appreciated.
> Thanks.