Hello,

I was going to open a Solr bug, but I saw the message saying I should discuss 
this via another channel first. I have been trying to use the incremental 
backup API on Solr 8.9.0, but while testing in our product we would 
occasionally get into a state where all subsequent backup attempts failed. 
After some triage we found that this happens to any collection that has 
undergone a shard split operation. If we take a backup, complete a shard split 
operation, then attempt another backup, the second backup fails with a 
NoSuchFileException whose path contains the backup ID of the second backup.


Steps to reproduce:

  *   Create a new collection with no associated backups
  *   Run a backup for this collection

     *   /admin/collections?action=BACKUP&name=myBackupName&collection=myCollectionName&location=/path/to/my/shared/drive

  *   Run a shard split operation

     *   /admin/collections?action=SPLITSHARD&collection=name&shard=shardID

  *   Attempt another backup
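
The steps above can be scripted roughly as follows. This is only a minimal sketch: the base URL (http://localhost:8983/solr) and the helper names collections_url, call and reproduce are placeholders I made up for illustration, so adjust them for your own node. The second BACKUP call is the one that fails for us.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL (placeholder); replace with your own Solr node.
SOLR_BASE = "http://localhost:8983/solr"


def collections_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urllib.parse.urlencode({"action": action, **params})
    return f"{SOLR_BASE}/admin/collections?{query}"


def call(url):
    """Send a GET request and return the parsed JSON response."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def reproduce(collection, backup_name, location, shard):
    """Back up, split a shard, then back up again."""
    backup = collections_url("BACKUP", name=backup_name,
                             collection=collection, location=location)
    split = collections_url("SPLITSHARD", collection=collection, shard=shard)
    call(backup)         # first backup
    call(split)          # shard split
    return call(backup)  # second backup; urlopen raises HTTPError on a 500
```

Calling reproduce("myCollectionName", "myBackupName", "/path/to/my/shared/drive", "shardID") issues the same three requests as the list above, in order.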


Expected Outcome:

* If this operation is being blocked intentionally, then I would expect an 
informative error message explaining why it failed. Otherwise I would expect 
the backup to complete successfully.


Actual Outcome:

* The backup operation fails with a NoSuchFileException.

NOTE: In the exception message below, the number in the name of the missing 
file (in this case zk_backup_1) corresponds to the ID of the backup that is 
currently being attempted.

{
  "responseHeader":{
    "status":500,
    "QTime":54},
  "failure":{
    "MYIPADDRESS:31018_solr":"org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException:Error from server at null: Error handling 'BACKUPCORE' action"},
  "Operation backup caused exception:":"java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException: /opt/hci/solrBackups/reproCollectionBackup/reproCollection/zk_backup_1",
  "exception":{
    "msg":"/opt/hci/solrBackups/reproCollectionBackup/reproCollection/zk_backup_1",
    "rspCode":-1},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","org.apache.solr.common.SolrException"],
    "msg":"/opt/hci/solrBackups/reproCollectionBackup/reproCollection/zk_backup_1",
    "trace":"org.apache.solr.common.SolrException: /opt/hci/solrBackups/reproCollectionBackup/reproCollection/zk_backup_1\n\tat org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:65)\n\tat org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:301)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:257)\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216)\n\tat org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:836)\n\tat org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:800)\n\tat org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:545)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)\n\tat org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)\n\tat org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)\n\tat org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:516)\n\tat org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)\n\tat org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)\n\tat org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)\n\tat org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)\n\tat java.lang.Thread.run(Thread.java:748)\n",
    "code":500}}
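
To make the NOTE above concrete, here is a small sketch that pulls the backup number out of a failure response like this one. It assumes the missing path always ends in zk_backup_N, which is only what I observed in the message above, not documented behaviour; the function name missing_backup_id is made up for illustration.

```python
import re


def missing_backup_id(response):
    """Return N from a trailing 'zk_backup_N' in the exception message, or None.

    Assumes the response is the parsed JSON body of a failed BACKUP call and
    that the missing file is named zk_backup_N (as seen in my failure).
    """
    msg = response.get("exception", {}).get("msg", "")
    match = re.search(r"zk_backup_(\d+)$", msg)
    return int(match.group(1)) if match else None
```

Feeding it the parsed response above gives the ID of the backup that was being attempted when the failure occurred.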




I tried a couple of workarounds, but after going through the steps of each one 
I still wasn’t able to run another backup for the collection.


Workaround attempt 1:

  *   Used the API to delete the backup

  *   Used the API to purge unused backup files

  *   Restarted Solr

  *   Attempted another backup

  *   Encountered the same failure
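
For reference, the delete and purge steps in this attempt can be expressed as two DELETEBACKUP requests. The sketch below only builds the URLs and makes some assumptions: that the delete targets a single backup by backupId and that the purge uses purgeUnused=true, as I understand the 8.9 Collections API. The helper name delete_backup_urls and the base URL argument are placeholders.

```python
import urllib.parse


def delete_backup_urls(base, backup_name, location, backup_id):
    """Build the two DELETEBACKUP URLs: delete one backup, then purge unused files.

    Assumption: 'backupId' and 'purgeUnused' are passed in separate requests.
    """
    common = {"action": "DELETEBACKUP", "name": backup_name, "location": location}
    delete = urllib.parse.urlencode({**common, "backupId": backup_id})
    purge = urllib.parse.urlencode({**common, "purgeUnused": "true"})
    return [f"{base}/admin/collections?{delete}",
            f"{base}/admin/collections?{purge}"]
```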


Workaround attempt 2:

  *   Deleted all files in my Solr backup mount location

  *   Restarted Solr

  *   Attempted another backup

  *   Encountered the same failure
