Hi Team,

We have built an index queue mechanism that stores the IDs of the documents
that need reindexing because a user has recently changed some of their data.
A cron job runs in the background, polls the queue every 5 seconds, picks up
any newly added IDs, and tries to reindex them in Solr.
To reindex, it first deletes the existing documents from Solr, then fetches the
latest details from the database, and then indexes them back into Solr.
For deleting, we use the deleteByQuery method. We could not use deleteById
because fetching the IDs of the docs is hard, as they are uniquely generated by
Solr itself. We commit the changes manually by calling
solrClient.commit(collectionName, true, true). Things were working fine until a
few days ago; recently they have started failing on the prod server.
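
To make the flow above concrete, here is a simplified, self-contained sketch of
one pass of the job. It is not our production code: IndexClient and
DocumentSource are hypothetical stand-ins for SolrClient and our database
layer, the "entityId" field name is made up, and committing after every ID is
an assumption made only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class ReindexSketch {

    /** Hypothetical stand-in for the three SolrClient calls the job makes. */
    interface IndexClient {
        void deleteByQuery(String collection, String query);
        void add(String collection, Map<String, Object> doc);
        void commit(String collection, boolean waitFlush, boolean waitSearcher);
    }

    /** Hypothetical stand-in for the database lookup. */
    interface DocumentSource {
        Map<String, Object> fetchLatest(String id);
    }

    /** One pass of the job: drain the queue and reindex each id. Returns the count. */
    static int drainOnce(Queue<String> queue, IndexClient client,
                         DocumentSource db, String collection) {
        int processed = 0;
        String id;
        while ((id = queue.poll()) != null) {
            // "entityId" is a made-up field name holding the application-side id.
            client.deleteByQuery(collection, "entityId:" + id);
            client.add(collection, db.fetchLatest(id));
            // Assumption: hard commit after every id (waitFlush=true, waitSearcher=true).
            client.commit(collection, true, true);
            processed++;
        }
        return processed;
    }

    /** Records calls in memory so the sketch runs without a Solr server. */
    static final class RecordingClient implements IndexClient {
        final List<String> calls = new ArrayList<>();

        @Override
        public void deleteByQuery(String collection, String query) {
            calls.add("deleteByQuery " + collection + " " + query);
        }

        @Override
        public void add(String collection, Map<String, Object> doc) {
            calls.add("add " + collection + " " + doc.get("entityId"));
        }

        @Override
        public void commit(String collection, boolean waitFlush, boolean waitSearcher) {
            calls.add("commit " + collection);
        }
    }

    public static void main(String[] args) {
        Queue<String> queue = new ConcurrentLinkedQueue<>();
        queue.add("42");
        queue.add("43");
        RecordingClient client = new RecordingClient();
        DocumentSource db = id -> Collections.<String, Object>singletonMap("entityId", id);
        int processed = drainOnce(queue, client, db, "docs");
        System.out.println(processed + " ids reindexed, calls: " + client.calls);
    }
}
```

In the real job the cron trigger would call something like drainOnce every 5
seconds, and the three IndexClient methods map onto the SolrJ calls of the same
names.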

Recently I have been facing an issue on one of my prod instances, where I
constantly get an error like
"Task queue processing has stalled for 20121 ms with 0 remaining elements to
process".
After this, my application is not able to perform any kind of indexing, and
even the search results are now inconsistent: the same query returns different
results every time we run it.
I do not see this issue in my other test environment, which has a similar
setup with the same amount of data.

Below are the configuration details of our Solr setup:
Solr version: 8.11.2
SolrJ version: 8.11.2
Solr is running in cloud mode with 3 shards and 2 replicas.

Some things that we noticed from the logs and other forums:

  1.  We are using the deleteByQuery method to delete the existing Solr
documents.
  2.  We have not configured the autoCommit, autoSoftCommit, idleTimeout,
socketTimeout, or stallTimeout settings.
  3.  We do everything with a manual hard commit through SolrJ from our
application. This is done so that we can track how many documents have been
indexed and how many are remaining.

I saw that a similar issue was reported around Solr 8.4 and was said to be
resolved in Solr 8.4, but I can still see it happening. I understand that I
should not do manual commits, but as this was our first release and we are only
improving the setup from here, I wanted to know if there is any configuration
that can be added to fix this error. Making a code change is not possible right
now, as a code change can take around a month to reach the customer as a
release. For now, is there anything we can do to fix this issue so that
indexing can start again in Solr?

Thanks in advance for the help.


Rishabh Yadav
Software Engineer 1
Esko Graphics Pvt Ind Ltd

