Query the database, not Solr, to check whether the document already exists.

The problem happens when you don’t follow the “single source of truth” design
pattern. Solr holds a delayed copy of the true state, so it will sometimes give
wrong answers.

https://en.wikipedia.org/wiki/Single_source_of_truth
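
A minimal sketch of that existence check done against the database rather
than Solr. Everything named here is an assumption for illustration, not the
poster's actual schema or API: the "documents" table, its "version" counter
column, the put_document() function, and the use of SQLite as the database.

```python
import sqlite3


class VersionConflict(Exception):
    """Raised when a put does not carry the document's current version counter."""


def create_schema(conn):
    # Hypothetical table: "version" is the application's increasing counter,
    # not Solr's _version_ field.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id TEXT PRIMARY KEY, body TEXT, version INTEGER NOT NULL)"
    )


def put_document(conn, doc_id, body, expected_version=None):
    """Create or update a document; the database decides whether it exists.

    Returns the version counter stored by this write. Solr is never consulted
    here; indexing would happen after the commit, as it does today.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT version FROM documents WHERE id = ?", (doc_id,)
        ).fetchone()
        if row is None:
            # Not in the database, so this is a genuinely new document.
            conn.execute(
                "INSERT INTO documents (id, body, version) VALUES (?, ?, 1)",
                (doc_id, body),
            )
            return 1
        current = row[0]
        if expected_version != current:
            # Exists, and the caller did not supply the current version.
            raise VersionConflict(
                f"document {doc_id!r} exists at version {current}"
            )
        conn.execute(
            "UPDATE documents SET body = ?, version = ? WHERE id = ?",
            (body, current + 1, doc_id),
        )
        return current + 1
```

With this shape, a second put for the same id gets the same answer no matter
how far behind the Solr index is, because the check never depends on index
visibility.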

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 17, 2023, at 11:52 AM, Mark Hieber <hieb...@gmail.com> wrote:
> 
> We have a cluster of hosts running Solr 8.4. Each host has an application
> which listens to an external source for updated documents. When it gets a
> document we care about, it indexes that document into the correct Solr core
> (we are not running cloud).
> 
> In our API service, when we get a request to put this type of document, we
> first query Solr to see if the document exists. If it does not, we then
> create a new document in our database and the document is sent to the
> application to be indexed into Solr. If the document we are trying to *put*
> (in the API Service) exists in Solr, then we throw an exception back to the
> user if they have not specified the existing version (not the _version_
> field from Solr, rather an increasing counter).
> 
> As part of our write logic, after we put the document into the database, we
> query the Solr stack until we get the response containing the newly written
> document. So we know the document was written at this point.
> 
> Some time later (maybe 5-10 minutes), we get another put request for the
> same document id. We query Solr, and in some cases, we get no documents
> returned, even though just before we actually found the document. The
> document has not been deleted in the interim.
> 
> We use the same query for both checking for existence at the beginning of
> the logic, and for checking for eventual consistency after writing to the
> database.
> 
> I could add a retry to the first part of the logic (retry if we don't find
> a document), but the question is why we don't find it the first time (but
> for the second put).
> 
> If I query for the document (using the same query), I find the document on
> each host.
> 
> Why are we not seeing documents which are actually there?
