Hello Mikhail,

(Resent, hopefully with a better representation of the table.)

The field type is plong. Hopefully this helps us find the problem.
As I cannot send you screenshots, I will attempt to send you an ASCII representation of what I see under the Schema endpoint in the web admin view.

Field task_coopProcessId
Type Plong

Field: task_coopProcessId
Field-Type: org.apache.solr.schema.LongPointField

+------------+---------+--------------+------------+------------------------------------+-------------------+
| Flags:     | Indexed | UnInvertible | Omit Norms | Omit Terms Frequencies & Positions | Sort Missing Last |
+------------+---------+--------------+------------+------------------------------------+-------------------+
| Properties | X       | X            | X          | X                                  | X                 |
| Schema     | X       | X            | X          | X                                  | X                 |
+------------+---------+--------------+------------+------------------------------------+-------------------+

(View the text in a monospaced font so the table stays readable.)

Index Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer

With kind regards,
Dario

-----Original Message-----
From: Mikhail Khludnev <m...@apache.org>
Sent: Monday, 22 April 2024 11:34
To: users@solr.apache.org
Subject: Re: Wrong documents in Response

>
> "querystring": "task_coopProcessId:20021454",
> "parsedquery": "(task_coopProcessId:[20021454 TO 20021454])",

What's the field type here? How was the string parsed into a range? I suppose it may be just a StrField.

> "task_46916": {
> "match": false,
> "value": 0,
> "description":
> "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30330"

Also, I don't know how non-matching docs may appear in the result.

On Mon, Apr 22, 2024 at 11:16 AM <dario.v...@coop.ch> wrote:

> Sure, here it is directly in mail. Hopefully it does not get chopped.
> > { > "responseHeader": { > "zkConnected": true, > "status": 0, > "QTime": 29, > "params": { > "q": "task_coopProcessId:20021454", > "indent": "true", > "fl": "task_coopProcessId", > "q.op": "OR", > "debug.explain.structured": "true", > "debugQuery": "true", > "useParams": "" > } > }, > "response": { > <same as in response without debug> > }, > "debug": { > "track": { > "rid": > "<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886", > "EXECUTE_QUERY": { > > "https://<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard1_replica_n4/": > { > "QTime": "0", > "ElapsedTime": "11", > "RequestPurpose": > "GET_TOP_IDS,SET_TERM_STATS", > "NumFound": "0", > "Response": > "{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_, > distrib=false, debug=[false, timing, track], fl=[id, score], > shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10, > rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886, > debug.explain.structured=true, version=2, q=task_coopProcessId:20021454, > omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS, > NOW=1713520858995, isShard=true, wt=javabin, debugQuery=false, > useParams=}}, > response={numFound=0,numFoundExact=true,start=0,maxScore=0.0,docs=[]}, > sort_values={}, debug={timing={time=0.0, prepare={time=0.0, > query={time=0.0}, facet={time=0.0}, facet_module={time=0.0}, > mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, > terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0}, > facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0}, > highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, > terms={time=0.0}, debug={time=0.0}}}}}" > }, > > "https://<insert-project-name>-solrcloud-2.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard2_replica_n2/": > { > 
"QTime": "0", > "ElapsedTime": "13", > "RequestPurpose": > "GET_TOP_IDS,SET_TERM_STATS", > "NumFound": "0", > "Response": > "{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_, > distrib=false, debug=[false, timing, track], fl=[id, score], > shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10, > rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886, > debug.explain.structured=true, version=2, q=task_coopProcessId:20021454, > omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS, > NOW=1713520858995, isShard=true, wt=javabin, debugQuery=false, > useParams=}}, > response={numFound=0,numFoundExact=true,start=0,maxScore=0.0,docs=[]}, > sort_values={}, debug={timing={time=0.0, prepare={time=0.0, > query={time=0.0}, facet={time=0.0}, facet_module={time=0.0}, > mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, > terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0}, > facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0}, > highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, > terms={time=0.0}, debug={time=0.0}}}}}" > }, > > "https://<insert-project-name>-solrcloud-0.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard3_replica_n1/": > { > "QTime": "0", > "ElapsedTime": "17", > "RequestPurpose": > "GET_TOP_IDS,SET_TERM_STATS", > "NumFound": "4", > "Response": > "{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_, > distrib=false, debug=[false, timing, track], fl=[id, score], > shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10, > rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886, > debug.explain.structured=true, version=2, q=task_coopProcessId:20021454, > omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS, > NOW=1713520858995, isShard=true, wt=javabin, debugQuery=false, > useParams=}}, > 
response={numFound=4,numFoundExact=true,start=0,maxScore=1.0,docs=[SolrDocument{id=task_46914, > score=1.0}, SolrDocument{id=task_46915, score=1.0}, > SolrDocument{id=task_46916, score=1.0}, SolrDocument{id=task_46917, > score=1.0}]}, sort_values={}, debug={timing={time=0.0, prepare={time=0.0, > query={time=0.0}, facet={time=0.0}, facet_module={time=0.0}, > mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, > terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0}, > facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0}, > highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, > terms={time=0.0}, debug={time=0.0}}}}}" > } > }, > "GET_FIELDS": { > > "https://<insert-project-name>-solrcloud-0.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard3_replica_n1/": > { > "QTime": "1", > "ElapsedTime": "4", > "RequestPurpose": > "GET_FIELDS,GET_DEBUG,SET_TERM_STATS", > "NumFound": "4", > "Response": > "{responseHeader={zkConnected=true, status=0, QTime=1, params={df=_text_, > distrib=false, debug=[timing, track], fl=[task_coopProcessId, id], > shards.purpose=16704, q.op=OR, rows=10, > rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886, > debug.explain.structured=true, version=2, q=task_coopProcessId:20021454, > omitHeader=false, requestPurpose=GET_FIELDS,GET_DEBUG,SET_TERM_STATS, > NOW=1713520858995, ids=task_46915,task_46914,task_46917,task_46916, > isShard=true, wt=javabin, debugQuery=true, useParams=}}, > response={numFound=4,numFoundExact=true,start=0,docs=[SolrDocument{task_coopProcessId=20021454}, > SolrDocument{task_coopProcessId=2008387}, > SolrDocument{task_coopProcessId=20021454}, > SolrDocument{task_coopProcessId=2008403}]}, > debug={rawquerystring=task_coopProcessId:20021454, > querystring=task_coopProcessId:20021454, > parsedquery=(task_coopProcessId:[20021454 TO 20021454]), > 
parsedquery_toString=task_coopProcessId:[20021454 TO 20021454], > explain={task_46915={match=true, value=1.0, > description=task_coopProcessId:[20021454 TO 20021454]}, > task_46914={match=false, value=0.0, > description=task_coopProcessId:[20021454 TO 20021454] doesn't match id > 30378}, task_46917={match=true, value=1.0, > description=task_coopProcessId:[20021454 TO 20021454]}, > task_46916={match=false, value=0.0, > description=task_coopProcessId:[20021454 TO 20021454] doesn't match id > 30330}}, QParser=LuceneQParser, timing={time=1.0, prepare={time=0.0, > query={time=0.0}, facet={time=0.0}, facet_module={time=0.0}, > mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, > terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0}, > facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0}, > highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, > terms={time=0.0}, debug={time=0.0}}}}}" > } > } > }, > "timing": { > "time": 1, > "prepare": { > "time": 0, > "query": { > "time": 0 > }, > "facet": { > "time": 0 > }, > "facet_module": { > "time": 0 > }, > "mlt": { > "time": 0 > }, > "highlight": { > "time": 0 > }, > "stats": { > "time": 0 > }, > "expand": { > "time": 0 > }, > "terms": { > "time": 0 > }, > "debug": { > "time": 0 > } > }, > "process": { > "time": 0, > "query": { > "time": 0 > }, > "facet": { > "time": 0 > }, > "facet_module": { > "time": 0 > }, > "mlt": { > "time": 0 > }, > "highlight": { > "time": 0 > }, > "stats": { > "time": 0 > }, > "expand": { > "time": 0 > }, > "terms": { > "time": 0 > }, > "debug": { > "time": 0 > } > } > }, > "rawquerystring": "task_coopProcessId:20021454", > "querystring": "task_coopProcessId:20021454", > "parsedquery": "(task_coopProcessId:[20021454 TO > 20021454])", > "parsedquery_toString": "task_coopProcessId:[20021454 TO > 20021454]", > "QParser": "LuceneQParser", > "explain": { > "task_46914": { > "match": false, > "value": 0, > "description": > "task_coopProcessId:[20021454 TO 
20021454] doesn't match id 30378"
> },
> "task_46915": {
> "match": true,
> "value": 1,
> "description":
> "task_coopProcessId:[20021454 TO 20021454]"
> },
> "task_46916": {
> "match": false,
> "value": 0,
> "description":
> "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30330"
> },
> "task_46917": {
> "match": true,
> "value": 1,
> "description":
> "task_coopProcessId:[20021454 TO 20021454]"
> }
> }
> }
> }
>
>
> -----Original Message-----
> From: Mikhail Khludnev <m...@apache.org>
> Sent: Saturday, 20 April 2024 11:36
> To: users@solr.apache.org
> Cc: solr-u...@lucene.apache.org
> Subject: Re: Wrong documents in Response
>
> CAUTION: This is an external email from sender 'Mikhail Khludnev <m...@apache.org>' ('users-return-164293-Dario.Viva=coop...@solr.apache.org').
> Do not click any links or open any attachments unless you trust the sender and know the content is safe.
>
> Hello Dario.
> The mailing list chopped the attachment, but looking into debugQuery is what we need here.
>
> On Fri, Apr 19, 2024 at 1:41 PM <dario.v...@coop.ch> wrote:
>
> > Hello All,
> >
> > We have a relatively new Solr instance:
> >
> > solr-spec: 9.5.0
> > solr-impl: 9.5.0 cdd27dd15c3a6574032e9b1b92b148ab4e383599 - gerlowskija - 2024-02-07 15:10:39
> >
> > lucene-spec: 9.9.2
> > lucene-impl: 9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 2024-01-25 09:51:09
> >
> > JVM Runtime: Eclipse Adoptium OpenJDK 64-Bit Server VM 17.0.10 17.0.10+7
> >
> > We run the Solr instance in a Kubernetes cluster in GCP.
> >
> > We have two collections but only documents in one of them right now. We have indexed ~70,000 tasks (one of the types of documents we index) in one of the collections. In total there are ~100,000 documents in this collection.
> >
> > Note that on production we still use an older Solr version (8.11.2) with ~5,000,000 tasks and the following problem does not appear there.
> >
> > The collections are all set up with the _default config and use only 1 shard each. autoAddReplicas is also configured to be false. The replicationFactor is also 1. Even maxShardsPerNode is 1.
> >
> > Or at least that's how we configured the collections. In the debugged response you will see that somehow multiple shards are at play.
> >
> > Now the problem:
> >
> > Every task has a parent id, which we call processId. We use this processId to find all the tasks that belong to one process.
> >
> > By searching for this processId we expect to find all the tasks that belong to the corresponding process.
> >
> > For example, we have a process with the processId 20021454 (this is the real processId; I have chosen to show you the real number because maybe this number is forbidden in Solr?!).
> >
> > One would expect to find all the tasks that belong to this process when using this query: "task_coopProcessId:20021454".
> >
> > We know for a fact that this process contains exactly four tasks. That's also what Solr returns: four tasks.
> >
> > But two of the tasks don't belong to the correct process.
> >
> > Below is the response we get from Solr (to keep the response short, I have included the fl parameter to show only the important info for this problem description).
> >
> > I have also included the result when showing debug info as an attachment (example.json). You will need to mentally replace <insert-project-name> with a real project name, which I am not going to name here.
> >
> > {
> >   "responseHeader": {
> >     "zkConnected": true,
> >     "status": 0,
> >     "QTime": 9,
> >     "params": {
> >       "q": "task_coopProcessId:20021454",
> >       "indent": "true",
> >       "fl": "task_coopProcessId",
> >       "q.op": "OR",
> >       "useParams": ""
> >     }
> >   },
> >   "response": {
> >     "numFound": 4,
> >     "start": 0,
> >     "maxScore": 1,
> >     "numFoundExact": true,
> >     "docs": [
> >       {
> >         "task_coopProcessId": 2008387
> >       },
> >       {
> >         "task_coopProcessId": 20021454
> >       },
> >       {
> >         "task_coopProcessId": 2008403
> >       },
> >       {
> >         "task_coopProcessId": 20021454
> >       }
> >     ]
> >   }
> > }
> >
> > With kind regards,
> >
> > Dario Viva
> >
>
> --
> Sincerely yours
> Mikhail Khludnev
>

--
Sincerely yours
Mikhail Khludnev
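
The inconsistency discussed in this thread can also be checked on the client side. The following is only a minimal sketch in Python, assuming the response body and the structured explain entries are available as JSON exactly as quoted in the mails above. The helper names unexpected_values and non_matching_ids are hypothetical and not part of Solr or any client library; the data is copied from the thread.

```python
import json

# Response body copied from the first message in this thread
# (query task_coopProcessId:20021454 with fl=task_coopProcessId).
RESPONSE_JSON = """
{
  "response": {
    "numFound": 4,
    "docs": [
      {"task_coopProcessId": 2008387},
      {"task_coopProcessId": 20021454},
      {"task_coopProcessId": 2008403},
      {"task_coopProcessId": 20021454}
    ]
  }
}
"""

# Structured explain entries copied from the debug output in this thread
# (descriptions omitted for brevity).
EXPLAIN = {
    "task_46914": {"match": False, "value": 0},
    "task_46915": {"match": True, "value": 1},
    "task_46916": {"match": False, "value": 0},
    "task_46917": {"match": True, "value": 1},
}


def unexpected_values(response, field, expected):
    """Hypothetical helper: values of returned docs that differ from the queried value."""
    docs = response["response"]["docs"]
    return [doc.get(field) for doc in docs if doc.get(field) != expected]


def non_matching_ids(explain):
    """Hypothetical helper: ids of returned docs for which explain reports match=false."""
    return sorted(doc_id for doc_id, entry in explain.items() if not entry["match"])


response = json.loads(RESPONSE_JSON)
wrong_values = unexpected_values(response, "task_coopProcessId", 20021454)
wrong_ids = non_matching_ids(EXPLAIN)

print(wrong_values)  # [2008387, 2008403]
print(wrong_ids)     # ['task_46914', 'task_46916']
```

Both checks point at the same two documents: the ones whose explain entry says "doesn't match" are the ones carrying the other process ids. This only confirms the symptom on the client; it says nothing about why the shard returned them.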