[ https://issues.apache.org/jira/browse/SOLR-8416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063934#comment-15063934 ]
Mark Miller commented on SOLR-8416:
-----------------------------------
Thanks Michael,
* Looks like a bunch of imports were moved above the license header?
* We probably want to use real solr.xml config for this, or make these params
on the collection create call with reasonable defaults. We generally only use
system properties for internal fail-safe options that we don't expect to
really be used. I'd be fine with reasonable defaults that could be overridden
per collection create call, but we could also allow the defaults to be
configurable via solr.xml.
{code}
+ Integer numRetries = Integer.getInteger("createCollectionWaitTimeTillActive", 10);
+ Boolean checkLeaderOnly = Boolean.getBoolean("createCollectionCheckLeaderActive");
{code}
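A rough sketch of the "reasonable defaults, overridable per create call" option. The parameter names are simply borrowed from the system property names in the patch, and the plain Map stands in for whatever params object the create call really receives; both are assumptions for illustration, not existing Solr API.

```java
import java.util.Map;

/** Hypothetical helper: reads wait settings from per-request params, falling back to defaults. */
final class CreateWaitOptions {
    // Assumed parameter names, reused from the system property names in the patch.
    static final String NUM_RETRIES_PARAM = "createCollectionWaitTimeTillActive";
    static final String LEADER_ONLY_PARAM = "createCollectionCheckLeaderActive";

    static final int DEFAULT_NUM_RETRIES = 10;         // same default the patch passes to Integer.getInteger
    static final boolean DEFAULT_LEADER_ONLY = false;  // Boolean.getBoolean yields false when the property is unset

    final int numRetries;
    final boolean checkLeaderOnly;

    private CreateWaitOptions(int numRetries, boolean checkLeaderOnly) {
        this.numRetries = numRetries;
        this.checkLeaderOnly = checkLeaderOnly;
    }

    /** A value present in the request wins; a missing value falls back to the default. */
    static CreateWaitOptions fromParams(Map<String, String> params) {
        String retries = params.get(NUM_RETRIES_PARAM);
        String leaderOnly = params.get(LEADER_ONLY_PARAM);
        int n = (retries == null) ? DEFAULT_NUM_RETRIES : Integer.parseInt(retries);
        boolean l = (leaderOnly == null) ? DEFAULT_LEADER_ONLY : Boolean.parseBoolean(leaderOnly);
        return new CreateWaitOptions(n, l);
    }
}
```

If the defaults were later made configurable via solr.xml, only the two DEFAULT_ constants would need to come from that config instead.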
* We should handle the checked exceptions this might throw like we do in other
spots rather than use a catch-all Exception. There should be plenty of code to
reference where we handle KeeperException and InterruptedException and do the
right thing for each.
{code}
+ try {
+ zkStateReader.updateClusterState();
+ clusterState = zkStateReader.getClusterState();
+ } catch (Exception e) {
+ throw new SolrException(ErrorCode.SERVER_ERROR, "Can't connect to zk server", e);
+ }
{code}
* I'd probably combine the following into one IF statement:
{code}
+ if (!clusterState.liveNodesContain(replica.getNodeName())) {
+ replicaNotAlive = replica.getCoreUrl();
+ nodeNotLive = replica.getNodeName();
+ break;
+ }
+ if (!state.equals(Replica.State.ACTIVE.toString())) {
+ replicaNotAlive = replica.getCoreUrl();
+ replicaState = state;
+ break;
+ }
{code}
* Should probably restore interrupt status and throw a SolrException.
{code}
+ try {
+ Thread.sleep(1000);
+ } catch (InterruptedException e) {
+ Thread.currentThread().interrupt();
+ }
{code}
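Something along these lines is what I have in mind: restore the flag, then stop waiting. WaitInterruptedException is only a stand-in so the sketch compiles without Solr on the classpath; in the patch it would be a SolrException, and the method name and message are made up for illustration.

```java
/** Stand-in for SolrException, used only to keep this sketch self-contained. */
final class WaitInterruptedException extends RuntimeException {
    WaitInterruptedException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class InterruptibleWait {
    /** Sleeps for the given time, or fails fast if the thread is interrupted. */
    static void sleepOrFail(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers further up can still observe it...
            Thread.currentThread().interrupt();
            // ...and abort the wait instead of quietly going around the retry loop again.
            throw new WaitInterruptedException(
                "Interrupted while waiting for replicas to become active", e);
        }
    }
}
```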
* I'm not sure the return message is quite right. If a node's state is not
ACTIVE, that does not mean it's not live. It can be DOWN and live, or
RECOVERING and live, etc. A replica is either live or not, and its state only
applies if it is live.
* Needs some tests.
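To illustrate the two points above about combining the checks and getting the message right, here is a minimal sketch. ReplicaInfo and State are simplified stand-ins for the fields the patch reads, not Solr's Replica class, and the message wording is only a suggestion.

```java
import java.util.Set;

final class ReplicaCheck {
    /** Simplified stand-in for the replica states mentioned in this comment. */
    enum State { ACTIVE, RECOVERING, DOWN }

    /** Minimal stand-in for the replica fields the patch uses. */
    static final class ReplicaInfo {
        final String nodeName;
        final String coreUrl;
        final State state;

        ReplicaInfo(String nodeName, String coreUrl, State state) {
            this.nodeName = nodeName;
            this.coreUrl = coreUrl;
            this.state = state;
        }
    }

    /** Returns null when the replica is live and ACTIVE, otherwise a description of the problem. */
    static String describeProblem(ReplicaInfo replica, Set<String> liveNodes) {
        boolean live = liveNodes.contains(replica.nodeName);
        if (live && replica.state == State.ACTIVE) {
            return null;
        }
        // The state is only reported for a live replica; a replica that is not live is reported as such.
        return live
            ? "Replica " + replica.coreUrl + " is live but in state " + replica.state
            : "Replica " + replica.coreUrl + " is on node " + replica.nodeName + ", which is not live";
    }
}
```

The wait loop would then need a single check per replica: break as soon as describeProblem returns non-null and carry that text into the error.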
> Solr collection creation API should return after all cores are alive
> ---------------------------------------------------------------------
>
> Key: SOLR-8416
> URL: https://issues.apache.org/jira/browse/SOLR-8416
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Reporter: Michael Sun
> Attachments: SOLR-8416.patch, SOLR-8416.patch, SOLR-8416.patch
>
>
> Currently the collection creation API returns once all cores are created. In
> a large cluster the cores may not be alive for some period of time after
> they are created. Anything requested during that period can fail, which makes
> Solr appear unstable. It would therefore be better for the collection
> creation API to wait for all cores to become alive and return only after that.