[ 
https://issues.apache.org/jira/browse/SOLR-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493300#comment-14493300
 ] 

Hrishikesh Gadre commented on SOLR-7344:
----------------------------------------

Here is a high-level design. I have a reasonably working patch against Solr 
4.10.3. If there are no major objections to this proposal, I will prepare and 
submit a patch against trunk.

- Define two separate endpoints for Solr: one to handle internal requests 
(i.e. communication between Solr servers) and the other for external requests 
(i.e. communication between clients and servers). Each endpoint would be 
backed by a dedicated thread pool.
- Define a property ‘externalPort’ in solr.xml (under the solrcloud 
configuration element) along with a similarly named Java system property. This 
property would define the port used by the external endpoint.
- Make the following changes in Solr:
  --> Publish this property as part of the clusterstate.json ZNODE (along 
with the current base_url property, which is used for internal requests).
  --> Change the solrj implementation (CloudSolrServer) to use this newly 
introduced property instead of the base_url property. If the new property is 
missing (e.g. a new client connecting to an old server), the client falls back 
to the old property for backward compatibility.
  --> No other code on the server side needs to change, since it already uses 
the base_url property.
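
The client-side fallback described in the list above could look roughly like 
the sketch below. It assumes the new property is published next to base_url 
under the key "externalPort" and holds only a port number; the class and 
method names are hypothetical and are not part of solrj.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Map;

// Hypothetical helper, not part of solrj.
final class ClientUrlResolver {
    static final String BASE_URL_PROP = "base_url";          // existing property
    static final String EXTERNAL_PORT_PROP = "externalPort"; // assumed key for the new property

    private ClientUrlResolver() {}

    /** Returns the URL a client should use for a replica with the given properties. */
    static String resolve(Map<String, String> replicaProps) {
        String baseUrl = replicaProps.get(BASE_URL_PROP);
        String externalPort = replicaProps.get(EXTERNAL_PORT_PROP);
        if (externalPort == null || externalPort.isEmpty()) {
            // Old server: the property is not published, so keep using base_url.
            return baseUrl;
        }
        try {
            URI base = new URI(baseUrl);
            // Same scheme, host and path as base_url; only the port is swapped.
            URI external = new URI(base.getScheme(), base.getUserInfo(), base.getHost(),
                    Integer.parseInt(externalPort), base.getPath(),
                    base.getQuery(), base.getFragment());
            return external.toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid base_url: " + baseUrl, e);
        }
    }
}
```

With base_url set to http://host1:8983/solr and no externalPort, the sketch 
returns base_url unchanged; with externalPort set to 9983 it returns the same 
URL on port 9983.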

If all external requests are sent to the external endpoint, a distributed 
deadlock cannot occur, since only threads associated with the external 
endpoint will be doing the scatter/gather, and no two scatter/gather requests 
will directly depend upon each other. In the worst case, we can get a socket 
timeout error during the gather phase if too many internal requests are sent 
to a specific Solr server, but we cannot run into deadlock scenarios.
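
As a rough single-JVM analogy of this argument (not the actual Solr request 
path), the sketch below runs a "scatter/gather" task that submits a 
sub-request and waits for it. The class name, pool sizes and timeouts are 
illustrative assumptions. With one shared pool of size 1 the sub-request can 
never be scheduled while the parent waits; with two dedicated pools it 
completes.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: pools stand in for the external and internal endpoints.
final class TwoPoolDemo {
    private TwoPoolDemo() {}

    /**
     * Runs one "external" request on the external pool. The request scatters a
     * single sub-request to the internal pool and waits for the result.
     * Returns true if the gather finished within timeoutMs.
     */
    static boolean scatterGather(ExecutorService external, ExecutorService internal,
                                 long timeoutMs) {
        Future<Boolean> outer = external.submit(() -> {
            Future<String> sub = internal.submit(() -> "shard-response");
            try {
                sub.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                sub.cancel(true); // give up on the starved sub-request
                return false;
            }
        });
        try {
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService shared = Executors.newFixedThreadPool(1);
        ExecutorService ext = Executors.newFixedThreadPool(1);
        ExecutorService in = Executors.newFixedThreadPool(1);
        try {
            // The parent holds the only thread, so the sub-request starves.
            System.out.println("shared pool completed: " + scatterGather(shared, shared, 200));
            // The sub-request runs on its own pool, so the gather completes.
            System.out.println("separate pools completed: " + scatterGather(ext, in, 2000));
        } finally {
            shared.shutdownNow();
            ext.shutdownNow();
            in.shutdownNow();
        }
    }
}
```

The timeout in the shared-pool case plays the role of the socket timeout 
mentioned above: without it, the parent task would wait forever.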

But the same cannot be said if external requests also land on the internal 
endpoint. In this case one or more internal threads may be doing scatter/gather 
and hence would depend upon each other (just like today), so a distributed 
deadlock is possible. To prevent this, we should also add validation to ensure 
that external requests sent to the internal endpoint are rejected.

This can be implemented by tagging internal requests in Solr (via an additional 
request parameter or a header) and adding validation via a servlet filter to 
reject external requests sent to the internal endpoint. To check whether a 
request was sent to the internal endpoint, we can use the 
ServletRequest#getLocalPort() method.
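
The decision such a filter would make can be sketched as below. To keep the 
example self-contained it does not depend on the servlet API; the class name 
and the header name "X-Solr-Internal" are hypothetical, and the tag could 
just as well be a request parameter.

```java
// Hypothetical helper holding only the accept/reject decision of the filter.
final class InternalEndpointGuard {
    // Assumed header name used to tag server-to-server requests.
    static final String INTERNAL_TAG_HEADER = "X-Solr-Internal";

    private final int internalPort;

    InternalEndpointGuard(int internalPort) {
        this.internalPort = internalPort;
    }

    /**
     * @param localPort   the port the request arrived on,
     *                    e.g. the value of ServletRequest#getLocalPort()
     * @param internalTag the value of the internal tag, or null if absent
     * @return true if the request is an untagged (external) request that
     *         arrived on the internal endpoint and should be rejected
     */
    boolean shouldReject(int localPort, String internalTag) {
        boolean onInternalEndpoint = (localPort == internalPort);
        boolean taggedInternal = (internalTag != null);
        return onInternalEndpoint && !taggedInternal;
    }
}
```

In a real servlet filter the two inputs would come from 
ServletRequest#getLocalPort() and HttpServletRequest#getHeader(), and a 
rejected request would get an error response instead of being passed down the 
filter chain.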

Open questions
(1) For the /admin/collections and /admin/cores APIs, we currently use 
information stored under the live_nodes ZNODE. Each ZNODE under live_nodes is 
named <host_name>:<port_number>_solr. The port number here corresponds to the 
internal endpoint (used for communication between Solr servers). What is the 
best way to add more information to it (e.g. the external port value)? Maybe as 
the content of the ZNODE?
https://github.com/apache/lucene-solr/blob/817303840fce547a1557e330e93e5a8ac0618f34/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrServer.java#L550

(2) What is your opinion on rejecting external requests sent to the internal 
endpoint? Any alternatives?

> Use two thread pools, one for internal requests and one for external, to 
> avoid distributed deadlock and decrease the number of threads that need to be 
> created.
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7344
>                 URL: https://issues.apache.org/jira/browse/SOLR-7344
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Mark Miller
>



