[ 
https://issues.apache.org/jira/browse/SOLR-7121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324993#comment-14324993
 ] 

Shawn Heisey commented on SOLR-7121:
------------------------------------

I see hard-coded comparison values in the new code.  If the threshold values do 
not come from configuration with documented defaults, I think that could be a 
serious problem.
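
To illustrate the point about configuration, here is a minimal sketch of 
thresholds that are read from a key/value config and fall back to documented 
defaults.  The class name, the config keys, and the default values are all 
hypothetical; none of them come from the attached patch or from Solr itself.

```java
import java.util.Map;

// Hypothetical holder for health-check thresholds (names are not from the patch).
final class HealthThresholds {
    // Documented defaults, used when the configuration omits a value.
    // The numbers are placeholders chosen for illustration only.
    static final long DEFAULT_MAX_GC_PAUSE_MS = 5_000L;
    static final int DEFAULT_MAX_THREADS = 10_000;

    final long maxGcPauseMs;
    final int maxThreads;

    private HealthThresholds(long maxGcPauseMs, int maxThreads) {
        this.maxGcPauseMs = maxGcPauseMs;
        this.maxThreads = maxThreads;
    }

    // Builds thresholds from a flat key/value config, falling back to the defaults.
    static HealthThresholds fromConfig(Map<String, String> cfg) {
        long gc = Long.parseLong(
                cfg.getOrDefault("maxGcPauseMs", Long.toString(DEFAULT_MAX_GC_PAUSE_MS)));
        int threads = Integer.parseInt(
                cfg.getOrDefault("maxThreads", Integer.toString(DEFAULT_MAX_THREADS)));
        return new HealthThresholds(gc, threads);
    }

    // The comparison uses the configured value instead of a literal in the code.
    boolean gcPauseExceeded(long observedPauseMs) {
        return observedPauseMs > maxGcPauseMs;
    }

    boolean threadCountExceeded(int observedThreads) {
        return observedThreads > maxThreads;
    }
}
```

With an empty config, `HealthThresholds.fromConfig(...)` yields the defaults, 
so the behavior is predictable and can be documented in one place.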

I will also say that all time measurements should be using nanoTime, not 
currentTimeMillis.  Trust me when I say that the discussion has been done to 
death on this topic, and nanoTime is what you'll find everybody switching to, 
because currentTimeMillis is not monotonic.

http://stackoverflow.com/a/2979239
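
As a rough sketch of the nanoTime approach, elapsed time can be measured as 
the difference between two System.nanoTime() readings.  The ElapsedTimer class 
and its method names are hypothetical and only serve to illustrate the idea; 
System.nanoTime() and TimeUnit are standard JDK APIs.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for measuring elapsed time with the monotonic clock.
final class ElapsedTimer {
    // nanoTime values are only meaningful as differences, never as wall-clock times.
    private final long startNanos = System.nanoTime();

    // Milliseconds elapsed since this timer was created.
    long elapsedMillis() {
        return elapsedMillis(startNanos, System.nanoTime());
    }

    // Pure helper so the arithmetic can be checked without sleeping.
    // Subtracting (rather than comparing the raw values) stays correct
    // even if the counter overflows between the two readings.
    static long elapsedMillis(long startNanos, long nowNanos) {
        return TimeUnit.NANOSECONDS.toMillis(nowNanos - startNanos);
    }
}
```

A threshold check would then compare `timer.elapsedMillis()` against the 
configured limit, and a wall-clock adjustment (NTP, manual change) would not 
distort the result.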

In general I like the ideas expressed here, though I haven't looked into how 
things are calculated, other than noticing currentTimeMillis, so I don't know 
if the approach is good.  I also haven't determined whether it is on or off by 
default - it should be off, so upgrading users are not surprised by completely 
new behavior.


> Solr nodes should go down based on configurable thresholds and not rely on 
> resource exhaustion
> ----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7121
>                 URL: https://issues.apache.org/jira/browse/SOLR-7121
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Sachin Goyal
>         Attachments: SOLR-7121.patch
>
>
> Currently, there is no way to control when a Solr node goes down.
> If the server is having high GC pauses or too many threads or is just getting 
> too many queries due to some bad load-balancer, the cores in the machine keep 
> on serving unless they exhaust the machine's resources and everything comes 
> to a stall.
> Such a slow-dying core can affect other cores as well by taking huge time to 
> serve their distributed queries.
> There should be a way to specify some threshold values beyond which the 
> targeted core can detect its ill-health and proactively go down to recover.
> When the load improves, the core should come up automatically.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
