[ https://issues.apache.org/jira/browse/SOLR-17363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jerry Chung updated SOLR-17363:
-------------------------------
    Description: 
A config request was submitted to update a user property, but one replica's version was not updated until the Solr service was restarted.

This seems to happen in two cases:
 * When a replica has been deleted, but the request handling node has already created a runner for that replica and waits for its response.
 * When the reloads take too long: all replicas seem to need a reload when a user property is updated, and only one replica on a node can be reloaded at a time, so not all replicas of a collection may finish reloading within the given time (30 seconds).

Client side:

{{Caused by: org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: Error from server at [https://ip-100-65-244-110.ec2.internal:8983/solr/mycollection/config?wt=javabin&version=2:] 1 out of 30 the property overlay to be of version 61 within 30 seconds! Failed cores: [https://ip-100-65-216-96.ec2.internal:8983/solr/mycollection_shard2_replica_n139/]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:920) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:576) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:533) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:386) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:352) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at fortiva.service.storage.index.CloudSolrClient.sendRequest(CloudSolrClient.java:) ~[xxx.jar:?]}}
{{ at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:898) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:826) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:234) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}

On the request handling node (taken from a different instance):

{{2024-07-08 17:06:39.446 ERROR (qtp1043358826-23) [c:mycollection e3 s:shard3 r:core_node30 x:mycollection_shard3_replica_n29] o.a.s.s.HttpSolrCall 500 Exception => org.apache.solr.common.SolrException: 1 out of 29 the property overlay to be of version 3 within 30 seconds! Failed cores: [https://ip-100-65-239-139.ec2.internal:8983/solr/mycollection_shard8_replica_n75/]
 at org.apache.solr.handler.SolrConfigHandler.waitForAllReplicasState(SolrConfigHandler.java:895)
org.apache.solr.common.SolrException: 1 out of 29 the property overlay to be of version 3 within 30 seconds! Failed cores: [https://ip-100-65-239-139.ec2.internal:8983/solr/mycollection_shard8_replica_n75/]
 at org.apache.solr.handler.SolrConfigHandler.waitForAllReplicasState(SolrConfigHandler.java:895) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler$Command.handleCommands(SolrConfigHandler.java:594) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler$Command.handlePOST(SolrConfigHandler.java:407) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler.handleRequestBody(SolrConfigHandler.java:146) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:226) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2901) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}

On the node where the replica was hosted (taken from the same time as above):

2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.c.SolrCore CLOSING SolrCore org.apache.solr.core.SolrCore@75e20647 mycollection_shard8_replica_n75
2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.core.mycollection.shard8.replica_n75 tag=SolrCore@75e20647
2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.collection.mycollection.shard8.leader tag=SolrCore@75e20647
2024-07-08 17:06:09.627 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.u.DirectUpdateHandler2 Committing on IndexWriter.close() ... SKIPPED (unnecessary).

  was:
A config request was submitted to update a user property, but one replica's version was not updated until the Solr service was restarted.

This seems to happen in two cases:
 * When a replica has been deleted, but the request handling node has already created a runner for that replica and waits for its response.
 * When the reloads take too long: all replicas seem to need a reload when a user property is updated, and only one replica on a node can be reloaded at a time, so not all replicas of a collection may finish reloading within the given time (30 seconds).

Client side:

{{Caused by: org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: Error from server at [https://ip-100-65-244-110.ec2.internal:8983/solr/mycollection/config?wt=javabin&version=2:] 1 out of 30 the property overlay to be of version 61 within 30 seconds! Failed cores: [https://ip-100-65-216-96.ec2.internal:8983/solr/mycollection_shard2_replica_n139/]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:920) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:576) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:533) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:386) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:352)
~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at fortiva.service.storage.index.CloudSolrClient.sendRequest(CloudSolrClient.java:) ~[xxx.jar:?]}}
{{ at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:898) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:826) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:234) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}

On the request handling node (taken from a different instance):

{{2024-07-08 17:06:39.446 ERROR (qtp1043358826-23) [c:mycollection e3 s:shard3 r:core_node30 x:mycollection_shard3_replica_n29] o.a.s.s.HttpSolrCall 500 Exception => org.apache.solr.common.SolrException: 1 out of 29 the property overlay to be of version 3 within 30 seconds! Failed cores: [https://ip-100-65-239-139.ec2.internal:8983/solr/mycollection_shard8_replica_n75/]
 at org.apache.solr.handler.SolrConfigHandler.waitForAllReplicasState(SolrConfigHandler.java:895)
org.apache.solr.common.SolrException: 1 out of 29 the property overlay to be of version 3 within 30 seconds! Failed cores: [https://ip-100-65-239-139.ec2.internal:8983/solr/mycollection_shard8_replica_n75/]
 at org.apache.solr.handler.SolrConfigHandler.waitForAllReplicasState(SolrConfigHandler.java:895) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler$Command.handleCommands(SolrConfigHandler.java:594) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler$Command.handlePOST(SolrConfigHandler.java:407) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler.handleRequestBody(SolrConfigHandler.java:146) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:226) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2901) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}

On the node where the replica was hosted (taken from the same time as above):

{{2024-07-08 17:06:09.620 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] c.p.s.s.FSCryptExecutor Policy 9deb2399f8f4d4964c6e867a08f90b2f for /data/solr/data/mycollection_shard8_replica_n75/data was removed}}
2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.c.SolrCore CLOSING SolrCore org.apache.solr.core.SolrCore@75e20647 mycollection_shard8_replica_n75
2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.core.mycollection.shard8.replica_n75 tag=SolrCore@75e20647
2024-07-08 17:06:09.622 INFO
(qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.collection.mycollection.shard8.leader tag=SolrCore@75e20647
2024-07-08 17:06:09.627 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.u.DirectUpdateHandler2 Committing on IndexWriter.close() ... SKIPPED (unnecessary).

> ConfigRequest could fail when user property is updated
> ------------------------------------------------------
>
>                 Key: SOLR-17363
>                 URL: https://issues.apache.org/jira/browse/SOLR-17363
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: config-api
>    Affects Versions: 9.4
>            Reporter: Jerry Chung
>            Priority: Major
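
The second case in the description (reloads on one node running one at a time against a 30-second wait) can be illustrated with a small arithmetic sketch. This is not Solr code: the `Replica` class, the `cores_missing_deadline` function, the node name, and the 10-second reload duration are all hypothetical, chosen only to show how serialized reloads could push a replica past the deadline. It assumes reloads on the same node run strictly one after another, as the report suggests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    core: str           # core name as it would appear in "Failed cores"
    node: str           # node hosting the core
    reload_secs: float  # hypothetical duration of one core reload


def cores_missing_deadline(replicas, max_wait_secs=30.0):
    """Return the cores whose reload would finish after max_wait_secs,
    assuming reloads on the same node run strictly one after another."""
    busy_until = {}  # node -> time at which its last queued reload finishes
    late = []
    for replica in replicas:
        finished_at = busy_until.get(replica.node, 0.0) + replica.reload_secs
        busy_until[replica.node] = finished_at
        if finished_at > max_wait_secs:
            late.append(replica.core)
    return late


# Four replicas on one node, 10 s per reload (made-up numbers):
# they finish at 10 s, 20 s, 30 s and 40 s, so only the last one is late.
replicas = [
    Replica(f"mycollection_shard{i}_replica", "node-a", 10.0)
    for i in range(1, 5)
]
print(cores_missing_deadline(replicas))
```

Under these assumptions, the number of replicas per node times the reload duration is what matters, so a collection with many replicas on a single node would be the most likely to hit the timeout even when every individual reload is healthy.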