[ https://issues.apache.org/jira/browse/SOLR-17363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jerry Chung updated SOLR-17363:
-------------------------------
    Description:
A config request was submitted to update a user property, but the version on one of the replicas was not updated until the Solr service was restarted.

This seems to happen in two situations:
* A replica was deleted, but the request-handling node created a runner for that replica and waited for a response.
* All replicas seem to be required to reload when a user property is updated, and only one replica can be reloaded at any time, so it is possible that not all the replicas of a collection can be reloaded within the given time (30 seconds).

Client side:

{{Caused by: org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: Error from server at [https://ip-100-65-244-110.ec2.internal:8983/solr/mycollection/config?wt=javabin&version=2:] 1 out of 30 the property overlay to be of version 61 within 30 seconds! Failed cores: [https://ip-100-65-216-96.ec2.internal:8983/solr/mycollection_shard2_replica_n139/]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:920) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:576) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:533) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:386) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:352) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at fortiva.service.storage.index.CloudSolrClient.sendRequest(CloudSolrClient.java:) ~[xxx.jar:?]}}
{{ at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:898) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:826) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:234) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}

On the request-handling node (taken from a different instance):

{{2024-07-08 17:06:39.446 ERROR (qtp1043358826-23) [c:mycollection e3 s:shard3 r:core_node30 x:mycollection_shard3_replica_n29] o.a.s.s.HttpSolrCall 500 Exception => org.apache.solr.common.SolrException: 1 out of 29 the property overlay to be of version 3 within 30 seconds! Failed cores: [https://ip-100-65-239-139.ec2.internal:8983/solr/mycollection_shard8_replica_n75/]
 at org.apache.solr.handler.SolrConfigHandler.waitForAllReplicasState(SolrConfigHandler.java:895)
org.apache.solr.common.SolrException: 1 out of 29 the property overlay to be of version 3 within 30 seconds! Failed cores: [https://ip-100-65-239-139.ec2.internal:8983/solr/mycollection_shard8_replica_n75/]
 at org.apache.solr.handler.SolrConfigHandler.waitForAllReplicasState(SolrConfigHandler.java:895) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler$Command.handleCommands(SolrConfigHandler.java:594) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler$Command.handlePOST(SolrConfigHandler.java:407) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler.handleRequestBody(SolrConfigHandler.java:146) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:226) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2901) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}

On the node where the replica was hosted (taken from the same time as above):

{{2024-07-08 17:06:09.620 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] c.p.s.s.FSCryptExecutor Policy 9deb2399f8f4d4964c6e867a08f90b2f for /data/solr/data/mycollection_shard8_replica_n75/data was removed}}
2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.c.SolrCore CLOSING SolrCore org.apache.solr.core.SolrCore@75e20647 mycollection_shard8_replica_n75
2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.core.mycollection.shard8.replica_n75 tag=SolrCore@75e20647
2024-07-08 17:06:09.622 INFO
(qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.collection.mycollection.shard8.leader tag=SolrCore@75e20647
2024-07-08 17:06:09.627 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.u.DirectUpdateHandler2 Committing on IndexWriter.close() ... SKIPPED (unnecessary).

  was:
A config request was submitted to update a user property, but the version on one of the replicas was not updated until the Solr service was restarted.

This seems to happen when a replica was deleted, but the request-handling node created a runner for that replica and waited for a response.

Client side:

{{Caused by: org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: Error from server at [https://ip-100-65-244-110.ec2.internal:8983/solr/mycollection/config?wt=javabin&version=2:] 1 out of 30 the property overlay to be of version 61 within 30 seconds! Failed cores: [https://ip-100-65-216-96.ec2.internal:8983/solr/mycollection_shard2_replica_n139/]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:920) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:576) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:533) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:386) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:352) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at fortiva.service.storage.index.CloudSolrClient.sendRequest(CloudSolrClient.java:) ~[xxx.jar:?]}}
{{ at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:898) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:826) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}
{{ at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:234) ~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}

On the request-handling node (taken from a different instance):

{{2024-07-08 17:06:39.446 ERROR (qtp1043358826-23) [c:mycollection e3 s:shard3 r:core_node30 x:mycollection_shard3_replica_n29] o.a.s.s.HttpSolrCall 500 Exception => org.apache.solr.common.SolrException: 1 out of 29 the property overlay to be of version 3 within 30 seconds! Failed cores: [https://ip-100-65-239-139.ec2.internal:8983/solr/mycollection_shard8_replica_n75/]
 at org.apache.solr.handler.SolrConfigHandler.waitForAllReplicasState(SolrConfigHandler.java:895)
org.apache.solr.common.SolrException: 1 out of 29 the property overlay to be of version 3 within 30 seconds! Failed cores: [https://ip-100-65-239-139.ec2.internal:8983/solr/mycollection_shard8_replica_n75/]
 at org.apache.solr.handler.SolrConfigHandler.waitForAllReplicasState(SolrConfigHandler.java:895) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler$Command.handleCommands(SolrConfigHandler.java:594) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler$Command.handlePOST(SolrConfigHandler.java:407) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.SolrConfigHandler.handleRequestBody(SolrConfigHandler.java:146) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:226) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2901) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]}}

On the node where the replica was hosted (taken from the same time as above):

{{2024-07-08 17:06:09.620 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] c.p.s.s.FSCryptExecutor Policy 9deb2399f8f4d4964c6e867a08f90b2f for /data/solr/data/mycollection_shard8_replica_n75/data was removed}}
2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.c.SolrCore CLOSING SolrCore org.apache.solr.core.SolrCore@75e20647 mycollection_shard8_replica_n75
2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.core.mycollection.shard8.replica_n75 tag=SolrCore@75e20647
2024-07-08 17:06:09.622 INFO
(qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.collection.mycollection.shard8.leader tag=SolrCore@75e20647
2024-07-08 17:06:09.627 INFO (qtp1043358826-125) [ x:mycollection_shard8_replica_n75] o.a.s.u.DirectUpdateHandler2 Committing on IndexWriter.close() ... SKIPPED (unnecessary).

> ConfigRequest fails until Solr gets restarted
> ---------------------------------------------
>
> Key: SOLR-17363
> URL: https://issues.apache.org/jira/browse/SOLR-17363
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: config-api
> Affects Versions: 9.4
> Reporter: Jerry Chung
> Priority: Major
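
The two suspected causes in the description (a deleted replica that never answers, and replicas that reload one at a time against a fixed 30-second wait) can be illustrated with a toy model. This is only a sketch under stated assumptions, not Solr's implementation: the class `OverlayWaitSketch`, its methods, the core names, and the 8-second reload time are all hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model (hypothetical, not Solr code): which cores miss a fixed deadline?
public class OverlayWaitSketch {

    // Marker for a core that never reports the expected version,
    // for example because the replica was deleted.
    static final long NEVER = Long.MAX_VALUE;

    // Returns the cores whose "ready" time falls after the timeout.
    // Keys are core names; values are the second at which each core
    // would report the expected overlay version (assumed input).
    static List<String> failedCores(Map<String, Long> readyAtSecond, long timeoutSeconds) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Long> e : readyAtSecond.entrySet()) {
            if (e.getValue() > timeoutSeconds) {
                failed.add(e.getKey());
            }
        }
        return failed;
    }

    // Models serialized reloads: each core starts reloading only after the
    // previous one has finished, so the i-th core is ready at i * secondsPerReload.
    static Map<String, Long> serializedReloads(List<String> cores, long secondsPerReload) {
        Map<String, Long> readyAt = new LinkedHashMap<>();
        long elapsed = 0;
        for (String core : cores) {
            elapsed += secondsPerReload;
            readyAt.put(core, elapsed);
        }
        return readyAt;
    }

    public static void main(String[] args) {
        long timeoutSeconds = 30; // the wait mentioned in the error message

        // Four cores, 8 s per reload (illustrative): ready at 8, 16, 24 and 32 s.
        Map<String, Long> serialized =
                serializedReloads(List.of("core_a", "core_b", "core_c", "core_d"), 8);
        System.out.println(failedCores(serialized, timeoutSeconds)); // [core_d]

        // A deleted replica never reports the version, however long the wait.
        Map<String, Long> withDeleted = new LinkedHashMap<>();
        withDeleted.put("core_a", 2L);
        withDeleted.put("deleted_core", NEVER);
        System.out.println(failedCores(withDeleted, timeoutSeconds)); // [deleted_core]
    }
}
```

In this model a longer timeout would rescue `core_d` but not `deleted_core`, which is one way to tell the two suspected causes apart when reading the "Failed cores" list.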