[ 
https://issues.apache.org/jira/browse/SOLR-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17716031#comment-17716031
 ] 

Alex Deparvu commented on SOLR-7609:
------------------------------------

Updating this issue with the details from GitHub, for future reference, since 
the PR is close to being merged.

Changes done:
 * added a version check on additions that fails the request when we are not 
the leader and version = 0 (to match the delete flows)
 * changed the error status from BAD_REQUEST to INVALID_STATE to allow for 
retries. I was able to verify that retries are happening [0]
 * removed a 'cmd' variable - this is just a minor readability refactoring; I 
tried to change the code as little as possible
 * updated ShardSplitTest to keep track of exceptions thrown during the 
concurrent adds and deletes, and to fail when needed
 * fixed an incorrect NPE check in 
[DistributedZkUpdateProcessor#getCollectionUrls|https://github.com/apache/solr/blob/db4cb66271f615da6a0a3ae6fed5fb2e184fd053/solr/core/src/java/org/apache/solr/update/processor/DistributedZkUpdateProcessor.java#L889]

Things to follow up on later:
 * there is still one failure happening: `Request says it is coming from 
parent shard leader but we are in active state`
 * I noticed that the setupRequest() method is usually called twice. I think 
this is easy to fix with a basic flag; I can add it if it doesn't grow the PR 
too much, or it can be done in a follow-up PR.
 * all over the class there is a pattern of checking the read-only status to 
prevent some operations, which I believe could be broken:
{code:java}
clusterState = zkController.getClusterState();
if (isReadOnly()) {
  throw new SolrException(ErrorCode.FORBIDDEN, "Collection " + collection + " is read-only.");
}
{code}
Refreshing the clusterState is insufficient, because isReadOnly() is based on 
the readOnlyCollection flag, which is only initialized at the beginning. If the 
intent was to have a fresh check, the readOnlyCollection flag needs to be 
updated too, based on the new clusterState.
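
To illustrate the concern, here is a minimal, self-contained sketch. It does not use the real Solr classes: StaleProcessor, FreshProcessor, and the AtomicReference standing in for zkController are hypothetical names made up for this example, and a plain Map of collection name to read-only flag stands in for the cluster state. It only shows why refreshing the state without recomputing the cached flag leaves the check stale.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

public class ReadOnlyCheckSketch {

  /** Mirrors the pattern in question: the flag is computed once and never refreshed. */
  static class StaleProcessor {
    // Hypothetical stand-in for zkController: holds the latest cluster state.
    private final AtomicReference<Map<String, Boolean>> zk;
    private final String collection;
    private Map<String, Boolean> clusterState;
    private final boolean readOnlyCollection; // initialized only in the constructor

    StaleProcessor(AtomicReference<Map<String, Boolean>> zk, String collection) {
      this.zk = zk;
      this.collection = collection;
      this.clusterState = zk.get();
      this.readOnlyCollection = clusterState.getOrDefault(collection, false);
    }

    boolean isReadOnly() {
      return readOnlyCollection;
    }

    boolean checkAfterRefresh() {
      clusterState = zk.get(); // the state is refreshed...
      return isReadOnly();     // ...but the cached flag is not
    }
  }

  /** One possible fix (an assumption, not the PR's code): recompute the flag on refresh. */
  static class FreshProcessor {
    private final AtomicReference<Map<String, Boolean>> zk;
    private final String collection;
    private Map<String, Boolean> clusterState;
    private boolean readOnlyCollection;

    FreshProcessor(AtomicReference<Map<String, Boolean>> zk, String collection) {
      this.zk = zk;
      this.collection = collection;
      this.clusterState = zk.get();
      this.readOnlyCollection = clusterState.getOrDefault(collection, false);
    }

    boolean isReadOnly() {
      return readOnlyCollection;
    }

    boolean checkAfterRefresh() {
      clusterState = zk.get();
      // Update the flag from the new state so the check reflects it.
      readOnlyCollection = clusterState.getOrDefault(collection, false);
      return isReadOnly();
    }
  }

  public static void main(String[] args) {
    AtomicReference<Map<String, Boolean>> zk =
        new AtomicReference<>(Map.of("c1", false));
    StaleProcessor stale = new StaleProcessor(zk, "c1");
    FreshProcessor fresh = new FreshProcessor(zk, "c1");

    // The collection becomes read-only after both processors were created.
    zk.set(Map.of("c1", true));

    System.out.println("stale check sees read-only: " + stale.checkAfterRefresh());
    System.out.println("fresh check sees read-only: " + fresh.checkAfterRefresh());
  }
}
```

In this sketch the stale variant keeps reporting the collection as writable after it has been switched to read-only, while the variant that recomputes the flag picks up the change.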

> ShardSplitTest NPE
> ------------------
>
>                 Key: SOLR-7609
>                 URL: https://issues.apache.org/jira/browse/SOLR-7609
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Steven Rowe
>            Priority: Minor
>         Attachments: ShardSplitTest.NPE.log
>
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> I'm guessing this is a test bug, but the seed doesn't reproduce for me (tried 
> on the same Linux machine it occurred on and on OS X):
> {noformat}
>    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=ShardSplitTest -Dtests.method=test -Dtests.seed=9318DDA46578ECF9 -Dtests.slow=true -Dtests.locale=is -Dtests.timezone=America/St_Vincent -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
>    [junit4] ERROR   55.8s J6  | ShardSplitTest.test <<<
>    [junit4]    > Throwable #1: java.lang.NullPointerException
>    [junit4]    >      at __randomizedtesting.SeedInfo.seed([9318DDA46578ECF9:1B4CE27ECB848101]:0)
>    [junit4]    >      at org.apache.solr.cloud.ShardSplitTest.logDebugHelp(ShardSplitTest.java:547)
>    [junit4]    >      at org.apache.solr.cloud.ShardSplitTest.checkDocCountsAndShardStates(ShardSplitTest.java:438)
>    [junit4]    >      at org.apache.solr.cloud.ShardSplitTest.splitByUniqueKeyTest(ShardSplitTest.java:222)
>    [junit4]    >      at org.apache.solr.cloud.ShardSplitTest.test(ShardSplitTest.java:84)
>    [junit4]    >      at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960)
>    [junit4]    >      at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935)
>    [junit4]    >      at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Line 547 of {{ShardSplitTest.java}} is:
> {code:java}
>       idVsVersion.put(document.getFieldValue("id").toString(), document.getFieldValue("_version_").toString());
> {code}
> Skimming the code, it's not obvious what could be null.



