[
https://issues.apache.org/jira/browse/SOLR-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Erick Erickson updated SOLR-6517:
---------------------------------
Attachment: SOLR-6517.patch
Reviewboard here: https://reviews.apache.org/r/26632/
Here's a patch for people to poke holes in. It is NOT ready to commit. I
started out with a really simple throttling mechanism and then went to a more
sophisticated one. I wanted folks to have an opportunity to critique both
approaches, so they're both in this patch. Of course I'll pull one out before
committing.
The meat of the differences is in collectionsHandler.handleReassignLeadersA
and collectionsHandler.handleReassignLeadersB. Of the two, the B variant is my
favorite by far. I hope to commit this late next week...
In one approach (the original crude one, see
collectionsHandler.handleReassignLeadersA), the parameter "maxToReassign" just
queues up the indicated number of leader reassignments and returns when they
are done. maxToReassign defaults to Integer.MAX_VALUE. The process here would
be to keep reassigning, say, 5 leaders until the collection was balanced. But
the onus is on the consumer to figure out when enough were done.
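To illustrate what "the onus is on the consumer" means for the A variant, here is a rough client-side sketch. It is not from the patch: the class name, the reassignBatch callback (standing in for one call to the API with maxToReassign, returning how many leaders it moved) and the maxRounds safety cap are all hypothetical.

```java
import java.util.function.IntUnaryOperator;

public class BatchUntilBalancedSketch {
  /**
   * Keeps asking for small batches of reassignments until a batch moves nothing.
   * reassignBatch is a hypothetical stand-in for one API call: it takes
   * maxToReassign and returns the number of leaders actually reassigned.
   */
  public static int rebalance(IntUnaryOperator reassignBatch, int maxToReassign, int maxRounds) {
    int total = 0;
    for (int round = 0; round < maxRounds; round++) {
      int moved = reassignBatch.applyAsInt(maxToReassign);
      if (moved == 0) {
        break; // nothing left on the 'wrong' node, so treat the collection as balanced
      }
      total += moved;
    }
    return total;
  }
}
```

The caller has to own both the loop and the stopping rule here, which is the main drawback of this approach.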
The other mode, collectionsHandler.handleReassignLeadersB, also takes
"maxToReassign", but in this flavor it's the number of outstanding
reassignments to allow at once; defaults to Integer.MAX_VALUE. When the limit
is reached, the process waits until at least one of them completes, then queues
up enough to get back to that max. QUESTION: Is there a better way to find out
when an async process is done besides the poll/sleep loop in
collectionsHandler.waitForLeaderChange?
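One possible shape for the "at most N outstanding" behavior, sketched in plain Java with a java.util.concurrent.Semaphore instead of a poll/sleep loop. This is not the code in the patch. The class and method names are made up for illustration, and it assumes each reassignment can be modeled as an in-process task that signals its own completion, which may not hold for a leader change that is only visible through cluster state.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class ThrottledReassignSketch {
  /**
   * Runs every task while keeping at most maxOutstanding of them in flight.
   * Returns the highest number of tasks observed running at the same time.
   */
  public static int runThrottled(List<Runnable> tasks, int maxOutstanding) {
    Semaphore permits = new Semaphore(maxOutstanding);
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    ExecutorService pool = Executors.newCachedThreadPool();
    try {
      for (Runnable task : tasks) {
        // Blocks once the limit is reached, until some outstanding task finishes.
        permits.acquireUninterruptibly();
        pool.execute(() -> {
          int now = inFlight.incrementAndGet();
          peak.accumulateAndGet(now, Math::max);
          try {
            task.run();
          } finally {
            inFlight.decrementAndGet();
            permits.release(); // frees a slot so the next task can be queued
          }
        });
      }
      // Taking every permit back means all queued tasks have completed.
      permits.acquireUninterruptibly(maxOutstanding);
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }
}
```

With maxOutstanding left at Integer.MAX_VALUE the first loop never blocks, which matches the "no throttling by default" behavior described above.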
Additionally in this mode, maxToReassignWait is the number of seconds to wait
for reassignment to complete before giving up. It's a bail-out so the call
isn't stuck forever. Default value is 30 seconds. It's a little loose in that
even if the call bails out, the reassignment may still be in progress and
_eventually_ complete.
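For reference, the general pattern of a poll/sleep wait with a bail-out looks roughly like the sketch below. It is a generic illustration, not the body of waitForLeaderChange: the class name, the done condition and the poll interval are all assumptions.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class WaitWithBailoutSketch {
  /**
   * Polls the condition until it is true or the timeout elapses.
   * Returns true if the condition was seen to hold, false on bail-out.
   * A false result says nothing about whether the work completes later.
   */
  public static boolean waitFor(BooleanSupplier done, long timeoutMs, long pollMs) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!done.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // bail out so the caller isn't stuck forever
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

A false return only means the caller stopped watching; the underlying operation is not cancelled, which is the looseness mentioned above.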
I should emphasize that only _one_ of the methods will make it to the final
patch, almost certainly the second one unless there are howls.
There's quite a bit of information returned in the result set, which is another
advantage of the second method. There's an example below, although it lacks the
"failures" node because there weren't any...:
I should also emphasize that I'm sure stuff will pop out when I look at it
fresh tomorrow, but the current form is enough to have people look at and poke
holes in.
Erick
Sample response (note, I'll get rid of the "reassignleaders_" prefix).
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">523</int>
</lst>
<lst name="successes">
<lst name="reassignleaders_eoe_shard1_replica3">
<str name="status">success</str>
<str name="msg">
Assigned 'Collection: 'eoe', Shard: 'shard1', Core:
'eoe_shard1_replica3', BaseUrl:
'http://192.168.1.201:7600/solr'' to be leader
</str>
</lst>
<lst name="reassignleaders_eoe_shard2_replica4">
<str name="status">success</str>
<str name="msg">
Assigned 'Collection: 'eoe', Shard: 'shard2', Core:
'eoe_shard2_replica4', BaseUrl:
'http://192.168.1.201:7300/solr'' to be leader
</str>
</lst>
<lst name="reassignleaders_eoe_shard3_replica4">
<str name="status">success</str>
<str name="msg">
Assigned 'Collection: 'eoe', Shard: 'shard3', Core:
'eoe_shard3_replica4', BaseUrl:
'http://192.168.1.201:7400/solr'' to be leader
</str>
</lst>
<lst name="reassignleaders_eoe_shard4_replica4">
<str name="status">success</str>
<str name="msg">
Assigned 'Collection: 'eoe', Shard: 'shard4', Core:
'eoe_shard4_replica4', BaseUrl:
'http://192.168.1.201:8983/solr'' to be leader
</str>
</lst>
<lst name="reassignleaders_eoe_shard6_replica3">
<str name="status">success</str>
<str name="msg">
Assigned 'Collection: 'eoe', Shard: 'shard6', Core:
'eoe_shard6_replica3', BaseUrl:
'http://192.168.1.201:7500/solr'' to be leader
</str>
</lst>
</lst>
<lst name="alreadyLeaders">
<lst name="core_node21">
<str name="status">success</str>
<str name="msg">Already leader</str>
<str name="nodeName">192.168.1.201:7200_solr</str>
</lst>
</lst>
</response>
> CollectionsAPI call ELECTPREFERREDLEADERS
> -----------------------------------------
>
> Key: SOLR-6517
> URL: https://issues.apache.org/jira/browse/SOLR-6517
> Project: Solr
> Issue Type: New Feature
> Affects Versions: 5.0, Trunk
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Attachments: SOLR-6517.patch
>
>
> Perhaps the final piece of SOLR-6491. Once the preferred leadership roles are
> assigned, there has to be a command "make it so Mr. Solr". This is something
> of a placeholder to collect ideas. One wouldn't want to flood the system with
> hundreds of re-assignments at once. Should this be synchronous or async?
> Should it make the best attempt but not worry about perfection? Should it???
> A collection=name parameter would be required and it would re-elect all the
> leaders that were on the 'wrong' node.
> I'm thinking of optionally allowing one to specify a shard in the case where
> you wanted to make a very specific change. Note that there's no need to
> specify a particular replica, since there should be only a single
> preferredLeader per slice.
> This command would do nothing to any slice that did not have a replica with a
> preferredLeader role. Likewise it would do nothing if the slice in question
> already had the leader role assigned to the node with the preferredLeader
> role.