[
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15806431#comment-15806431
]
Hoss Man commented on SOLR-9941:
--------------------------------
I'm guessing this situation exists because the {{RecentUpdates}} code was added
to {{UpdateLog}} to handle (true) out of order updates while a system was
"live", but it's not clear to me why the {{RecentUpdates}} are used the way
they are during {{UpdateLog.init}} ... it seems like at a minimum DUH2 could
inspect the UpdateCommand flags to see if this is a {{REPLAY}} command and if
it is skip the {{UpdateLog.getDBQNewer}} call?
Here's some steps that make it easy to see how weird this situation can get...
{noformat}
bin/solr -e techproducts
curl -X POST http://localhost:8983/solr/techproducts/config -H 'Content-Type:
application/json' --data-binary
'{"set-property":{"updateHandler.autoCommit.maxTime":"-1"}}'
curl -X POST http://localhost:8983/solr/techproducts/update -H 'Content-Type:
application/json' --data-binary '{
"add":{
"doc": {
"id": "DOCX",
"foo_i": 41
}
},
"delete": { "query":"foo_i:42" },
"delete": { "query":"foo_i:43" },
"add":{
"doc": {
"id": "DOCX",
"foo_i": { "inc" : "1" }
}
},
"delete": { "query":"foo_i:41" },
"delete": { "query":"foo_i:43" },
"add":{
"doc": {
"id": "DOCX",
"foo_i": { "inc" : "1" }
}
},
"delete": { "query":"foo_i:41" },
"delete": { "query":"foo_i:42" }
}'
# verify the updates were applied correctly and doc didn't get deleted by
mistake
curl 'http://localhost:8983/solr/techproducts/get?wt=json&id=DOCX'
{
"doc":
{
"id":"DOCX",
"foo_i":43,
"_version_":1555827152896655360}}
kill -9 PID # use whatever PID you get from "ps -ef | grep start.jar | grep
techproducts"
bin/solr start -s example/techproducts/solr/
# re-verify the updates were applied correctly during replay
curl 'http://localhost:8983/solr/techproducts/get?wt=json&id=DOCX'
{
"doc":
{
"id":"DOCX",
"foo_i":43,
"_version_":1555827152896655360}}
{noformat}
And here's a snippet of solr's logs during the replay on (second) startup...
{noformat}
WARN - 2017-01-07 01:27:45.408; [ x:techproducts]
org.apache.solr.update.UpdateLog$LogReplayer; Starting log replay
tlog{file=/home/hossman/lucene/dev/solr/example/techproducts/solr/techproducts/data/tlog/tlog.0000000000000000001
refcount=2} active=false starting pos=0
INFO - 2017-01-07 01:27:45.422; [ x:techproducts]
org.apache.solr.update.DirectUpdateHandler2; Reordered DBQs detected.
Update=add{flags=a,_version_=1555827152848420864,id=DOCX}
DBQs=[DBQ{version=1555827152905043968,q=foo_i:42},
DBQ{version=1555827152897703936,q=foo_i:41},
DBQ{version=1555827152894558208,q=foo_i:43},
DBQ{version=1555827152883023872,q=foo_i:41},
DBQ{version=1555827152872538112,q=foo_i:43},
DBQ{version=1555827152849469440,q=foo_i:42}]
INFO - 2017-01-07 01:27:45.434; [ x:techproducts]
org.apache.solr.core.SolrCore; [techproducts] webapp=null path=null
params={q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false&event=firstSearcher}
hits=3 status=0 QTime=62
INFO - 2017-01-07 01:27:45.446; [ x:techproducts]
org.apache.solr.core.QuerySenderListener; QuerySenderListener done.
INFO - 2017-01-07 01:27:45.446; [ x:techproducts]
org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener;
Loading spell index for spellchecker: default
INFO - 2017-01-07 01:27:45.446; [ x:techproducts]
org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener;
Loading spell index for spellchecker: wordbreak
INFO - 2017-01-07 01:27:45.448; [ x:techproducts]
org.apache.solr.core.SolrCore; [techproducts] Registered new searcher
Searcher@1826a6c8[techproducts]
main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(7.0.0):C32)))}
INFO - 2017-01-07 01:27:45.532; [ x:techproducts]
org.apache.solr.search.SolrIndexSearcher; Opening
[Searcher@5bd6b40f[techproducts] realtime]
INFO - 2017-01-07 01:27:45.536; [ x:techproducts]
org.apache.solr.update.DirectUpdateHandler2; Reordered DBQs detected.
Update=add{flags=a,_version_=1555827152881975296,id=DOCX}
DBQs=[DBQ{version=1555827152905043968,q=foo_i:42},
DBQ{version=1555827152897703936,q=foo_i:41},
DBQ{version=1555827152894558208,q=foo_i:43},
DBQ{version=1555827152883023872,q=foo_i:41}]
INFO - 2017-01-07 01:27:45.546; [ x:techproducts]
org.apache.solr.search.SolrIndexSearcher; Opening
[Searcher@667a386f[techproducts] realtime]
INFO - 2017-01-07 01:27:45.555; [ x:techproducts]
org.apache.solr.update.DirectUpdateHandler2; Reordered DBQs detected.
Update=add{flags=a,_version_=1555827152896655360,id=DOCX}
DBQs=[DBQ{version=1555827152905043968,q=foo_i:42},
DBQ{version=1555827152897703936,q=foo_i:41}]
INFO - 2017-01-07 01:27:45.573; [ x:techproducts]
org.apache.solr.search.SolrIndexSearcher; Opening
[Searcher@3ed8aa7e[techproducts] realtime]
INFO - 2017-01-07 01:27:45.579; [ x:techproducts]
org.apache.solr.update.DirectUpdateHandler2; start
commit{flags=2,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO - 2017-01-07 01:27:45.580; [ x:techproducts]
org.apache.solr.update.SolrIndexWriter; Calling setCommitData with
IW:org.apache.solr.update.SolrIndexWriter@363e3fe6
INFO - 2017-01-07 01:27:45.615; [ x:techproducts]
org.apache.solr.search.SolrIndexSearcher; Opening
[Searcher@7af13db2[techproducts] main]
INFO - 2017-01-07 01:27:45.617; [ x:techproducts]
org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO - 2017-01-07 01:27:45.617; [ x:techproducts]
org.apache.solr.core.QuerySenderListener; QuerySenderListener sending requests
to Searcher@7af13db2[techproducts]
main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(7.0.0):C32)
Uninverting(_3(7.0.0):C1)))}
INFO - 2017-01-07 01:27:45.617; [ x:techproducts]
org.apache.solr.core.QuerySenderListener; QuerySenderListener done.
INFO - 2017-01-07 01:27:45.619; [ x:techproducts]
org.apache.solr.core.SolrCore; [techproducts] Registered new searcher
Searcher@7af13db2[techproducts]
main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(7.0.0):C32)
Uninverting(_3(7.0.0):C1)))}
INFO - 2017-01-07 01:27:45.620; [ x:techproducts]
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor;
[techproducts] {add=[DOCX (1555827152848420864), DOCX (1555827152881975296),
DOCX (1555827152896655360)],deleteByQuery=foo_i:42
(-1555827152849469440),deleteByQuery=foo_i:43
(-1555827152872538112),deleteByQuery=foo_i:41
(-1555827152883023872),deleteByQuery=foo_i:43
(-1555827152894558208),deleteByQuery=foo_i:41
(-1555827152897703936),deleteByQuery=foo_i:42 (-1555827152905043968)} 0 212
WARN - 2017-01-07 01:27:45.620; [ x:techproducts]
org.apache.solr.update.UpdateLog$LogReplayer; Log replay finished.
recoveryInfo=RecoveryInfo{adds=3 deletes=0 deleteByQuery=6 errors=0
positionOfStart=0}
{noformat}
...notice how many times each of those DBQs was listed as being "Reordered"
(and thus preemptively applied) even though they are *also* applied as part of
the regular replay (note the INFO msg from LogUpdateProcessor)
> log replay redundently (pre-)applies DBQs as if they were out of order
> ----------------------------------------------------------------------
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Hoss Man
>
> There's kind of an odd situation that arises when a Solr node starts up
> (after a crash) and tries to recover from it's tlog that causes deletes to be
> redundantly & excessively applied -- at a minimum it causes confusing really
> log messages....
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most
> recent log files found (based on numRecordsToKeep) and then builds a
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay,
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs
> detected. ..."}} even tough the only reason they are out of order is because
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog
> replay.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]